This post is an update to the status of various MySQL bugs (some with patches) that I’ve filed over the past couple of years (or that people around me have). I’m not looking at POWER specific ones, as there are these too, but each of these bugs here deal with general correctness of the code base.
Firstly, let’s look at some points I’ve raised:
Incorrect locking for global_query_id (bug #72544)
Raised on May 5th, 2014 on the internals list. As of today, no action (apart from Dimitri verifying the bug back in May 2014). There continues to be locking that perhaps only works by accident around query IDs.Soon, this bug will be two years old.
Endian code based on CPU type rather than endian define (bug #72715)
About six-hundred and fifty days ago I filed this bug – back in May 2014, which probably has a relatively trivial fix of using the correct #ifdef of BIG_ENDIAN/LITTLE_ENDIAN rather than doing specific behavior based on #ifdef __i386__
What’s worse is that this looks like somebody being clever for a compiler in the 1990s, which unlikely ends up with the most optimal code today.
mysql-test-run.pl –valgrind-all does not run all binaries under valgrind (bug #74830)
Yeah, this should be a trivial fix, but nothing has happened since November 2014.
I’m unlikely to go provide a patch simply because it seems to take sooooo damn long to get anything merged.
MySQL 5.1 doesn’t build with Bison 3.0 (bug #77529)
Probably of little consequence, unless you’re trying to build MySQL 5.1 on a linux distro released in the last couple of years. Fixed in Maria for a long time now.
Incorrect function name in DBUG_ENTER (bug #78133)
Pull request number 25 on github – a trivial patch that is obviously correct, simply correcting some debug only string.
So far, over 191 days with no action. If you can’t get trivial and obvious patches merged in about 2/3rds of a year, you’re not going to grow contributions. Nearly everybody coming to a project starts out with trivial patches, and if a long time contributor who will complain loudly on the internet (like I am here) if his trivial patches aren’t merged can’t get it in, what chance on earth does a newcomer have?
In case you’re wondering, this is the patch:
InnoDB table flags in bitfield is non-optimal (bug #74831)
With a patch since I filed this back in November 2014, it’s managed to sit idle long enough for GCC 4.8 to practically disappear from anywhere I care about, and 4.9 makes better optimization decisions. There are other reasons why C bitfields are an awful idea too.
Actually complex issues:
InnoDB mutex spin loop is missing GCC barrier (bug #72755)
Again, another bug filed back in May 2014, where InnoDB is doing a rather weird trick to attempt to get the compiler to not optimize away a spinloop. There’s a known good way of doing this, it’s called a compiler barrier. I’ve had a patch for nearly two years, not merged :(
buf_block_align relies on random timeouts, volatile rather than memory barriers (bug #74775)
This bug was first filed in November 2014 and deals with a couple of incorrect assumptions about memory ordering and what volatile means.
While this may only exhibit a problem on ARM and POWER processors (as well as any other relaxed memory ordering architectures, x86 is the notable exception), it’s clearly incorrect and very non-portable.
Don’t expect MySQL 5.7 to work properly on ARM (or POWER). Try this:
You’ll likely find MySQL > 5.7.5 still explodes.
In fact, there’s also Bug #79378 which Alexey Kopytov filed with patch that’s been sitting idle since November 2015 which is likely related to this bug.
Not covered here: universal CRC32 hardware acceleration (rather than just for innodb data pages) and other locking issues (some only recently discovered). I also didn’t go into anything filed in December 2015… although in any other project I’d expect something filed in December 2015 to have been looked at by now.
Like it or not, MySQL is still upstream for all the MySQL derivatives active today. Maybe this will change as RocksDB and TokuDB gain users and if WebScaleSQL, MariaDB and Percona can foster a better development community.
I wonder how much longer the ARCHIVE storage engine is going to ship with MySQL…. I think I’m the last person to actually fix a bug in it, and that was, well, a good number of years ago now. It was created to solve a simple problem: write once read hardly ever. Useful for logs and the like. A zlib stream of rows in a file.
You can actually easily beat ARCHIVE for INSERT speed with a non-indexed MyISAM table, and with things like TokuDB around you can probably get pretty close to compression while at the same time having these things known as “indexes”.
ARCHIVE for a long time held this niche though and was widely and quietly used (and likely still is). It has the great benefit of being fairly lightweight – it’s only about 2500 lines of code (1130 if you exclude azio.c, the slightly modified gzio.c from zlib).
It also use the table discovery mechanism that NDB uses. If you remove the FRM file for an ARCHIVE table, the ARCHIVE storage engine will extract the copy it keeps to replace it. You can also do consistent backups with ARCHIVE as it’s an append-only engine. The ARCHIVE engine was certainly the simplest example code of this and a few other storage engine API things.
I’d love to see someone compare storage space and performance of ARCHIVE against TokuDB and InnoDB (hint hint, the Internet should solve this for me).
In this case, I really don’t mind. It’s rather exciting that they’ve gone ahead and done this – and it’s not just a code drop: https://github.com/Tokutek/ft-engine is where things are at, and recent commits were 2hrs and 18hrs ago which means that this is being maintained. This is certainly a good way to grow a developer community.
While being a MySQL engine is really interesting, the MongoDB integration is certainly something to watch – and may be something quite huge for Tokutek (after all, it’s the database engine parts of Mongo that are the most troublesome – along with the client library license).
Over the years there have been many storage engines crop up and then disappear. So… where are they now?
This became MyISAM…. you know you’ve been around MySQL a long time if you’ve ever had to deal with an ISAM table.
Gemini This was the first big test of the GPL in court. Basically, you have to obey the GPL (see wikipedia for more info). The code was released as GPL and development stopped. This has been dead since ca 2002.
otherwise known as the BerkeleyDB engine. It was seldom used and never gained much of a userbase. It was unceremoniously dropped back in 2006 and both users didn’t really exist.
PBXT – http://pbxt.blogspot.com/ I think we can credit PBXT with at least half of the features and performance improvements to InnoDB since it first emerged back in 2006. It got attention very quickly. Why? Because it was different. It had the very rare ability to outperform InnoDB in some places. You can still find PBXT in MariaDB, but sadly it can be hard to fund development of a MySQL storage engine, especially one as tied to MySQL as PBXT is, and it’s no longer under active development. Closely related was the Blob Streaming project which was way ahead of its time as an AlsoSQL access method. The good news is that the code was released under a BSD license in 2012 (was previously GPL). We even had PBXT in Drizzle for a while.
Blob Streaming (PBMS) – http://bpbdev.blogspot.com/ This project was closely related to (but not depending exclusively on) PBXT. It embedded a HTTP server inside the database and could use it to read and write BLOBs. This was not only fairly cool but way ahead of its time. We owe the existence of both HandlerSocket and the memcached interface to InnoDB to PBMS (it was also an inspiration for the JSON server plugin for Drizzle, to address some of the use cases of the PBMS plugin).
It’s still there… but is effectively unmaintained and dead. There’s even FederatedX in MariaDB which is an improvement, but still, the MySQL server really doesn’t lend itself kindly to this type of engine… it’s always been an oddity only suitable for very specific tasks.
Archive Although useful, effectively unmaintained. I kinda don’t want to say dead… but if it went away, I wouldn’t exactly be surprised.
Currently used to access the log tables in MySQL… and hardly used otherwise. It’s odd that the same code doesn’t deal with SELECT INTO OUTFILE and LOAD DATA INFILE, and I doubt this will ever change. I’d say effectively niche/dead.
TokuDB I cannot emphasize how much more interesting TokuDB would be if it were open source. It actually holds some promise… and with their recent work with mongo, perhaps this is a good way forward for them…
Another “OMG Oracle just bought Innobase Oy” engine. This was a project to take MyISAM and turn it into a lean, mean, transactional storage engine machine. It’s still not there and I don’t think it ever will be.
This was the hot new thing. It came out of Netfrastructure, which MySQL AB acquired in order to help get a transactional storage engine after Innobase Oy was acquired by Oracle. If you’re keeping count, that’s three projects for a transactional storage engine. Falcon was the star though, receiving all the press and publicity (well before it was ready). There are many reasons why Falcon isn’t around today – the chief one probably being that Oracle bought Sun who had bought MySQL and thus a need for an “InnoDB replacement” instantly vanished. There was also immense management pressure for performance to be greater than InnoDB, without any allowance for or focus on correctness…. and this showed. This was quite disappointing as Falcon had a lot of good architectural things going for it.
I think this is all the notable engines that were aimed at widespread adoption… what ones have I forgotten?
It’s interesting to note that only Archive, CSV, Xeround, TokuDB and Infobright can be gotten anywhere, and the latter two only in their own distribution (one proprietary) and Xeround only as a service.