MySQL Contributions status

This post is an update to the status of various MySQL bugs (some with patches) that I’ve filed over the past couple of years (or that people around me have). I’m not looking at POWER specific ones, as there are these too, but each of these bugs here deal with general correctness of the code base.

Firstly, let’s look at some points I’ve raised:

  • Incorrect locking for global_query_id (bug #72544)
    Raised on May 5th, 2014 on the internals list. As of today, no action (apart from Dimitri verifying the bug back in May 2014). There continues to be locking that perhaps only works by accident around query IDs.Soon, this bug will be two years old.
  • Endian code based on CPU type rather than endian define (bug #72715)
    About six-hundred and fifty days ago I filed this bug – back in May 2014, which probably has a relatively trivial fix of using the correct #ifdef of BIG_ENDIAN/LITTLE_ENDIAN rather than doing specific behavior based on #ifdef __i386__
    What’s worse is that this looks like somebody being clever for a compiler in the 1990s, which unlikely ends up with the most optimal code today.
  • mysql-test-run.pl –valgrind-all does not run all binaries under valgrind (bug #74830)
    Yeah, this should be a trivial fix, but nothing has happened since November 2014.
    I’m unlikely to go provide a patch simply because it seems to take sooooo damn long to get anything merged.
  • MySQL 5.1 doesn’t build with Bison 3.0 (bug #77529)
    Probably of little consequence, unless you’re trying to build MySQL 5.1 on a linux distro released in the last couple of years. Fixed in Maria for a long time now.

Trivial patches:

  • Incorrect function name in DBUG_ENTER (bug #78133)
    Pull request number 25 on github – a trivial patch that is obviously correct, simply correcting some debug only string.
    So far, over 191 days with no action. If you can’t get trivial and obvious patches merged in about 2/3rds of a year, you’re not going to grow contributions. Nearly everybody coming to a project starts out with trivial patches, and if a long time contributor who will complain loudly on the internet (like I am here) if his trivial patches aren’t merged can’t get it in, what chance on earth does a newcomer have?
    In case you’re wondering, this is the patch:

    --- a/sql/rpl_rli_pdb.cc
    +++ b/sql/rpl_rli_pdb.cc
    @@ -470,7 +470,7 @@ bool Slave_worker::read_info(Rpl_info_handler *from)
     
     bool Slave_worker::write_info(Rpl_info_handler *to)
     {
    -  DBUG_ENTER("Master_info::write_info");
    +  DBUG_ENTER("Slave_worker::write_info");
  • InnoDB table flags in bitfield is non-optimal (bug #74831)
    With a patch since I filed this back in November 2014, it’s managed to sit idle long enough for GCC 4.8 to practically disappear from anywhere I care about, and 4.9 makes better optimization decisions. There are other reasons why C bitfields are an awful idea too.

Actually complex issues:

  • InnoDB mutex spin loop is missing GCC barrier (bug #72755)
    Again, another bug filed back in May 2014, where InnoDB is doing a rather weird trick to attempt to get the compiler to not optimize away a spinloop. There’s a known good way of doing this, it’s called a compiler barrier. I’ve had a patch for nearly two years, not merged :(
  • buf_block_align relies on random timeouts, volatile rather than memory barriers (bug #74775)
    This bug was first filed in November 2014 and deals with a couple of incorrect assumptions about memory ordering and what volatile means.
    While this may only exhibit a problem on ARM and POWER processors (as well as any other relaxed memory ordering architectures, x86 is the notable exception), it’s clearly incorrect and very non-portable.
    Don’t expect MySQL 5.7 to work properly on ARM (or POWER). Try this:

    ./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10

    You’ll likely find MySQL > 5.7.5 still explodes.
    In fact, there’s also Bug #79378 which Alexey Kopytov filed with patch that’s been sitting idle since November 2015 which is likely related to this bug.

Not covered here: universal CRC32 hardware acceleration (rather than just for innodb data pages) and other locking issues (some only recently discovered). I also didn’t go into anything filed in December 2015… although in any other project I’d expect something filed in December 2015 to have been looked at by now.

Like it or not, MySQL is still upstream for all the MySQL derivatives active today. Maybe this will change as RocksDB and TokuDB gain users and if WebScaleSQL, MariaDB and Percona can foster a better development community.

Towards (and beyond) ONE MILLION queries per second

At Percona Live MySQL Conference 2015 next week I’ll be presenting on “Towards One MILLION queries per second” on 14th April at 4:50pm in Ballroom A.

This is the story of work I’ve been doing to get MySQL executing ONE MILLION SQL queries per second. It involves tales of MySQL, tales of the POWER8 Processor and a general amount of fun in extracting huge amounts of performance.

As I speak, I’m working on some even more impressive benchmark results! New hardware, new MySQL versions and really breaking news on MySQL scalability.

and now for something completely different…

As many of you know, I’ve been working in the MySQL world for quite a while now. IN fact, it was nearly 10 years ago when I first started hacking on MySQL Cluster at MySQL AB.

Most recently, I was at Percona which was a wonderful journey where over my nearly three years there the company at least doubled in size, launched several new software products and greatly improved the quality and frequency of releases.

However the time has come for something completely different. The MySQL world is rather mature, the future of Percona software is bright and, well, I could do with poking into something rather different.

So a couple of weeks ago I started at IBM in the Linux Technology Centre working on KVM on POWER and related things. No doubt there’ll be interesting things to blog about as time goes on, but it’s about time I posted my change of employment :)

Hong Kong (OpenStack Summit)

I’ll be in Hong Kong for the upcoming OpenStack Summit Nov 5-8. I’d be thrilled to talk database things with others present, especially around Trove DBaaS (DataBase as a Service) and high availability MySQL for OpenStack deployments.

I was last in Hong Kong in 2010 when I worked for Rackspace. The closest office to me was in Hong Kong so that’s where I did my HR onboarding training. I remember telling friends on the Sunday night before leaving for Hong Kong that I may be able to make dinner later in the week purely depending on if somebody got back to me on if I was going to Hong Kong that week. I was, and I went. I took some photos while there.

Walking from the hotel where we were staying to the Rackspace office could be done pretty much entirely through buildings without going outside. There were bits of art around too, which is just kind of awesome – I’m always in favour of random art.
Statues in walkways

The photo below was the view from my hotel room. The OpenStack summit is just by the airport rather than in the middle of town, so the views will be decidedly different to this, but still probably quite spectacular if you’re around the right place (I plan to take camera gear, so shout if you want to journey too)
Hotel Window (Hong Kong)

There are some pretty awesome markets around Hong Kong offering just about everything you’d want, including a lot just out on the street.
Java Road
Hong Kong Street Market

Nightime was pretty awesome, having people from around the world journey out into the night was great.
Rackers walking Hong Kong at Night

I was there during the World Cup, and the streets were wonderfully decorated. I’m particularly proud of this photo as it was handheld, at night, after beer.
Hong Kong streetlife

The MEMORY storage engine

I recently wrote about Where are they now: MySQL Storage Engines and The MERGE storage engine: not dead, just resting…. or forgotten. Today, it’s the turn of the MEMORY storage engine – otherwise known as HEAP.

This is yet another piece of the MySQL server that sits largely unmaintained and unloved. The MySQL Manual even claims that it supports encryption… with the caveat of having to use the SQL functions for encryption/decryption rather than in the engine itself (so, basically, it supports encryption about as much as every other engine does).

The only “recent” innovation in the MEMORY engine was the dynamic row patch that ended up making its way into Percona Server (and isn’t enabled by default). This forced me to go and look at the code of the MEMORY engine again and I cannot possibly drink enough in my lifetime to erase the memory.

The MEMORY engine is used by just about everybody as you probably have a SQL query somewhere that uses an in memory temporary table. I can, however, feel the comments being added to this post right now by people who use gdb to set server variables that not a single query in their systems use MEMORY….. (IIRC there have been some patches around that would throw an error rather than create a temporary table)

We had a early version of the dynamic row format patch in Drizzle for a while… and if you turned it on, all sorts of things horrifically broke. It was a remarkably non-trivial amount of work to get that code to work properly and this is largely a testament to the “design” of the MEMORY engine.

While it may be efficient or fast or something (likely on 1990s hardware and workloads), it misses the boat completely on the things that matter today: simultaneous access, MVCC, BLOB/TEXT columns and transactions. Basically, it’s a engine that’s really only useful for a single connection in limited use cases…. and even then, it’s likely a good way to ruin things. MyISAM is better as at least on memory pressure things may be written out to disk sensibly…. and if InnoDB had a “don’t log this table” mode it would beat that absolute pants off it.

It is, again, another part of the MySQL server that’s remarkably hard to pull out and replace with something different/better. Why? Well, I wrote about it before: Refactoring Internal temporary tables (another stab at it). If it was easy, we’d likely have Tokyo Cabinet (via BlitzDB) or similar (some bit of code maintained by other people) doing the same job in Drizzle rather than this large chunk of code that nobody really cares about.

Detecting if a MySQL server supports partitioning

This morning, this Percona XtraBackup bug came to my attention: https://bugs.launchpad.net/bugs/1170340 – basically, it’s now really quite tricky to determine if a MySQL server you’re connected to supports partitioning or not.

If you’re connected to anything less than MySQL 5.6, you can use have_partitioning variable. But since that’s gone in 5.6, you’re going to get a false negative if you’re connected to 5.6. You could use INFORMATION_SCHEMA.PLUGINS table, but that’s not there in 5.0, so you have some added workarounds to add there too.

A simple version check could be the solution… but what if you compiled the server without partitioning support?

MySQL Community Portraits needs your $$

There’s an Indiegogo fundraising effort for Julian Cash to come to the Percona Live MySQL Conference and Expo this year to take your photo! I’ve thrown in a bunch of money to help make this happen. Why? It’s much much much cheaper than getting any professional photo shoot done, and it’s by the awesome Julian, who makes pretty awesome photos.

You’ll get full resolution images too! Basically, this is the cheapest way you’re going to get this quality of photos done of yourself outside of dating a professional portrait photographer. Since I’m not dating a professional portrait photographer, this is an excellent and affordable way to have some truly awesome and up to date photos that I can use.

I went and bought the Gold level sponsorship as I feel that $250 is a small price to pay (especially considering Julian’s previous excellent work). You can support it with more $ or less $ (even $0), I chose $250 as it meant I could claim, at least for a short time, that I’ve over half funded it :)

Sessions at the Percona Live MySQL Conference that interest me

For the past many years, there’s been a conference in April, at the Santa Clara Convention Centre where the topic has been MySQL and the surrounding ecosystem. The first year I went, I gave a talk on the new features in MySQL Cluster 5.1 to a overflowing room of attendees. For me, it’s an event that’s mixed with speaking about something I’ve been working on and talking to other attendees about everything from how a particular part of the server works to where we can escape to for nearby good vegan food.

So, I thought I’d share some of the sessions that I’m really looking forward to. My selection is probably atypical, but may be interesting to others. I’m not going to list the keynotes, although they are often of a lot of value. I’m also going to attempt to avoid listing a few really awesome well known speakers simply because there are other really interesting sessions that also need exposure!

  • Starring Sakila: Building Data Warehouses and BI solutions using MySQL and Pentaho
    I need to base decisions off data, not simply a gut feeling (I’m not Stephen Colbert after all). I ran into a bunch of stumbling blocks when trying to work with Pentaho a couple of weeks ago, and I’m really hoping that this session shines some light on how to use it to better and more easily make arguments based on evidence to others in the company.
  • Testing MySQL Databases: The State Of The Art
    I’ve worked with Patrick for several years now, and he’s currently a valuable member of my team at Percona. For those who are interested in the state of the art of open source database testing, this is the session to be in.
  • Getting InnoDB Compression Ready for Facebook Scale
    This session is on at the same time as I’m speaking, so I probably won’t be able to attend (people keep coming to my sessions so I usually can’t sneak out). I’m really interested in how they’ve modified the compression code to help with their (large) workload.
  • Backing Up Facebook
    I hear that Facebook has a couple of database servers, a few dozen users and a few floppy disks full of data. This should be a fun story :)
  • Introducing XtraBackup Manager
    Being responsible for XtraBackup development at Percona, the XtraBackup topics really interest me. Lachlan has been working on a simple backup manager for XtraBackup to help create something that is a more complete backup solution than a tool which simply creates a backup.
  • Extending Xtrabackup – A Point-In-Time System
    Another good case of using XtraBackup as part of a comprehensive backup strategy. I have to be honest, I’m looking for ways in which we can improve XtraBackup to better fit the needs of people. It may be that there are a few small things we can do to make it easier for people do deploy and use.
  • Getting Started with Drizzle 7.1
    We’re about to do the 7.1 release of Drizzle! If you’re interested in having a SQL database that is designed to be used in large scale web applications and cloud environments, come along to this talk.
  • MySQL Idiosyncrasies That Bite
    I have to admit, I’m interested in Ronalds talk here to basically ensure we didn’t miss fixing anything in Drizzle. I do promise not to at any point yell out “Fixed in Drizzle” though.

Go here to register: http://www.percona.com/live/mysql-conference-2012/ (early bird pricing and discounted hotel rooms end March 12th, so you want to register sooner rather than later).

Information on Bug#12704861 (which doesn’t exist in any public bug tracker)

Some of you may be aware that MySQL is increasingly using an Oracle-internal bug tracker. You can see these large bug numbers mentioned alongside smaller public bug numbers in recent MySQL release notes. If you’re particularly unlucky, you  just get a big Oracle-internal bug number. For a recently fixed bug, I dug further, posted up on the Percona blog: http://www.mysqlperformanceblog.com/2011/11/20/bug12704861/

Possibly interesting reading for those of you who interested in InnoDB, MySQL, BLOBs and crash recovery.

Using Jenkins to parse sphinx warnings

At Percona, we’re now using sphinx for our documentation. We’re also using Jenkins for our  continuous integration. We have compiler warnings from GCC being parsed by Jenkins using the built in filters, but there isn’t one for the sphinx warnings.

Luckily, in the configuration page for Jenkins, the Warnings plugin allows you to specify your own filters. I’ve added the following filter to process warnings from sphinx:

For those who want to copy and paste:

Regex: ^(.*):(\d+): \((.*)\) (.*)

Mapping Script

import hudson.plugins.warnings.parser.Warning
String fileName = matcher.group(1)
String lineNumber = matcher.group(2)
String category = matcher.group(3)
String message = matcher.group(4)

return new Warning(fileName, Integer.parseInt(lineNumber), "sphinx", category, message);

Example log message: /home/stewart/percona-server/docs-5.1/doc/source/release-notes/Percona-Server-1.0.2-3.rst:67: (WARNING/2) Inline literal start-string without end-string.

Then I can select this filter from the job that builds (and publishes) our documentation and it shows up like any other compiler warnings. Neat!

TODO: get the intersphinx warnings also in there

TODO: fix the linkcheck target in Sphinx so that it’s easily parseable and can also be integrated.

xtrabackup bazaar repositories upgraded to 2a format

I have just upgraded the xtrabackup bazaar code repositories to the 2a format. This means that bzr 1.16 is required to access the source code repositories now.

If you get an error like the one below when working with a local branch, you’ll need to run “bzr upgrade” in it (see below for example). For branches on launchpad, you can use the web UI and hit the “upgrade branch” button.

stewart@willster:~/src/percona-xtrabackup$ bzr pull
Using saved parent location: bzr+ssh://bazaar.launchpad.net/%2Bbranch/percona-xtrabackup/
Doing on-the-fly conversion from RemoteRepositoryFormat(_network_name='Bazaar repository format 2a (needs bzr 1.16 or later)\n') to RepositoryFormatKnitPack1().
This may take some time. Upgrade the repositories to the same format for better performance.
bzr: ERROR: KnitPackRepository('file:///home/stewart/src/percona-xtrabackup/.bzr/repository/')
is not compatible with
RemoteRepository(bzr+ssh://bazaar.launchpad.net/%2Bbranch/percona-xtrabackup/.bzr/)
different rich-root support
stewart@willster:~/src/percona-xtrabackup$ bzr upgrade
Upgrading branch file:///home/stewart/src/percona-xtrabackup/ ...              
starting upgrade of file:///home/stewart/src/percona-xtrabackup/
making backup of file:///home/stewart/src/percona-xtrabackup/.bzr
  to file:///home/stewart/src/percona-xtrabackup/backup.bzr.~1~
starting repository conversion                                                 
repository converted                                                           
finished

Drizzle online backup with xtrabackup

For backups, historically in the MySQL world you’ve had mysqldump (a SQL dump, means on restore you have to rebuild indexes), InnoDB Hot Backup (proprietary, but takes a copy of the InnoDB data files, so restore is much quicker), LVM snapshots (various scripts exist, does have larger IO impact, requires LVM) and more recently xtrabackup. Xtrabackup essentially does the same thing as InnoDB hot backup except that it’s free and open source software.

Many people have been using xtrabackup successfully for quite a while now.

In Drizzle7, our default storage engine is InnoDB. There have been a few changes, but it is totally InnoDB. This leaves us with the question of backup solutions. We have drizzledump (the Drizzle equivalent to MySQL dump – although with fewer gotchas), you could always use LVM snapshots and the probability of Oracle releasing InnoDB Hot Backup for Drizzle is rather minimal.

So enter xtrabackup as a possible solution… I had though of porting xtrabackup across for a while. Last weekend, while waiting for one of my iterations of catalog support to compile, I decided to give it a go. I wanted to see how far I could get with it also in that weekend.

I was successful – there’s a tree up at lp:~stewart/drizzle/xtrabackup thatproduces an xtrabackup binary that’s built for Drizzle (it’s not quite ready for merging yet, there are some obivous bugs around command line option parsing… but a backup and restore did work).

I wanted the following:

  • build to be integrated with Drizzle, using the same innobase build that we use to build the server
  • build with strict compiler warnings and -Werror (which we do forDrizzle)
  • build with a C++ compiler (as we do with innobase in Drizzle)
  • not re-add parts of mysys into the Drizzle build just for xtrabackup

I’ve already submitted merge requests to upstream xtrabackup containing the compiler fixes and added compiler warnings (they’ve also by now been merged into xtrabackup). Already my work has improved the quality of xtrabackup for everyone. Some of the warnings were fixed slightly differently in xtrabackup than in my Drizzle tree, but I plan to merge.

One issue was that the command line parsing library that xtrabackup uses – my_getopt which is part of mysys (the portability library inside MySQL) is long since gone from Drizzle. We currently use Boost::program_options. Thanks to the heroic efforts of Andrew Hutchings, xtrabackp in Drizzle is also using boost::program_options. This was a brilliant “hey, can you have a look at this conversion” followed by handing him a tree that did not even remotely compile, followed by a “I have to take the kids somewhere, here’s a tree – it may compile”. Amazingly enough, it pretty much did compile once I fixed the other issues.

An unresolved issue is how to deal with this going forward – my guess is that upstream xtrabackup doesn’t want to require Boost.

One solution could be just to factor out command line options into a sepfile that we can ignore for Drizzle and replace with our own. The other option could be to use a differnt command line option parsing library (perhaps from CCAN, as it’s then maintained by somebody else and doesn’t require heaps and heaps of other stuff).

Another issue I had to tackle is the patch to innobase that’s required to build xtrabackup.

I took a very minimal approach for the Drizzle patch. We are currently based on innobase 1.1.4 from MySQL 5.5 – so I mostly looked at the xtradb55 patch. I think it would be great if these were instead of one giant patch a series of patches to apply (a-la quilt) to a) make iteasier maintain and b) easier for myself to work out the exact reasoning of each bit (also, generating the patches with -p would help a fair bit too).

So how did I do it?

Step 0
was removing support for old innobase – we totally don’t need it for Drizzle.

Step 1
was creating a srv_read_only option for Drizzle’s innobase. This was fairly easy. The one thing I did have to change was adding a checkin os_file_lock() so that we don’t attempt to write lock the ibdatafiles when in read only (otherwise backups can’t be taken while drizzledis running). I’m a little surprised that this wasn’t hit in 5.5 at all.

Step 2
was implementing srv_fake_write. I’m pretty sure I’ve gotten this right in the Drizzle implementation, but the patch wasn’t as easy toread as I’d really like. I probably need to do a bit more of a code audit that this is actually correct (I may try and come up with anLD_PRELOAD library that will scream loudly if writes are made to files matching a pattern).

Step 3
was implemnting srv_apply_log_only. Pretty sure I have this right, again, more testing will be required. Why? Because I’m that paranoid about getting things very, very right.

Step 4
was to go through all the functions that xtrabackup needed to not be static. Instead of having prototypes for them inxtrabackup.cc, I instead added a xtrabackup_api.h header to Innobase and included it where needed (including in xtrabackup). I’d recommend this way going forward for xtrabackup too as it could be a lot less problematic to maintain (and makes xtrabackup source a bit easier to read)

Step 5
was fixing up a few skeleton functions that were needed to make our innobase happy. It may not be a bad idea to split out the skeleton functions into a sep source file so it’s a bit easier to track (and some #ifdefs around those not needed for certain releases).

I’m hoping to work with the upstream xtrabackup devs on the various points I’ve made above.
Another thought of mine is to port xtrabackup into HailDB where we can use much more neat API functions to create good tests for xtrabackup.

Thanks go out to all who’ve worked on xtrabackup. It honestly wasn’t too hardgetting it ported across to Drizzle – and with a bit of collaboration I think we can make it easy to keep up to date.

What’s the future for Xtrabackup in Drizzle? It’ll likely end up being a binary named drizzlebackup-innobase or similar (this means that there is a clear difference between xtrabackup for MySQL and what we have in Drizzle – which is more accurately defined as based on xtrabackup). We’ll also probably want a nice wrapper or integration with a backup tool to deal with everything Drizzle related. We shall also introduce a lot of testing; backups are important.

Xtrabackup is topical, check out the latest OurSQL podcast and the the Percona Xtrabackup website for more info!