xtrabackup for Drizzle merge request

Follow it over on launchpad.

After having fixed an incredibly odd compiler warning (and with -Werror that we build with, error) on OSX (die die die) – xtrabackup for Drizzle is ready to be merged. This will bring it into our next milestone: freemont. Over the next few weeks you should see some good tests merged in for backup and restore too.

While not final final, I’m thinking that the installed binary name will be drizzlebackup.innobase. A simple naming scheme for various backup tools that are Drizzle specific. This casually pre-empts a drizzlebackup tool that can co-ordinate all of these (like the innobackupex script).

Drizzle online backup with xtrabackup

For backups, historically in the MySQL world you’ve had mysqldump (a SQL dump, means on restore you have to rebuild indexes), InnoDB Hot Backup (proprietary, but takes a copy of the InnoDB data files, so restore is much quicker), LVM snapshots (various scripts exist, does have larger IO impact, requires LVM) and more recently xtrabackup. Xtrabackup essentially does the same thing as InnoDB hot backup except that it’s free and open source software.

Many people have been using xtrabackup successfully for quite a while now.

In Drizzle7, our default storage engine is InnoDB. There have been a few changes, but it is totally InnoDB. This leaves us with the question of backup solutions. We have drizzledump (the Drizzle equivalent to MySQL dump – although with fewer gotchas), you could always use LVM snapshots and the probability of Oracle releasing InnoDB Hot Backup for Drizzle is rather minimal.

So enter xtrabackup as a possible solution… I had though of porting xtrabackup across for a while. Last weekend, while waiting for one of my iterations of catalog support to compile, I decided to give it a go. I wanted to see how far I could get with it also in that weekend.

I was successful – there’s a tree up at lp:~stewart/drizzle/xtrabackup thatproduces an xtrabackup binary that’s built for Drizzle (it’s not quite ready for merging yet, there are some obivous bugs around command line option parsing… but a backup and restore did work).

I wanted the following:

  • build to be integrated with Drizzle, using the same innobase build that we use to build the server
  • build with strict compiler warnings and -Werror (which we do forDrizzle)
  • build with a C++ compiler (as we do with innobase in Drizzle)
  • not re-add parts of mysys into the Drizzle build just for xtrabackup

I’ve already submitted merge requests to upstream xtrabackup containing the compiler fixes and added compiler warnings (they’ve also by now been merged into xtrabackup). Already my work has improved the quality of xtrabackup for everyone. Some of the warnings were fixed slightly differently in xtrabackup than in my Drizzle tree, but I plan to merge.

One issue was that the command line parsing library that xtrabackup uses – my_getopt which is part of mysys (the portability library inside MySQL) is long since gone from Drizzle. We currently use Boost::program_options. Thanks to the heroic efforts of Andrew Hutchings, xtrabackp in Drizzle is also using boost::program_options. This was a brilliant “hey, can you have a look at this conversion” followed by handing him a tree that did not even remotely compile, followed by a “I have to take the kids somewhere, here’s a tree – it may compile”. Amazingly enough, it pretty much did compile once I fixed the other issues.

An unresolved issue is how to deal with this going forward – my guess is that upstream xtrabackup doesn’t want to require Boost.

One solution could be just to factor out command line options into a sepfile that we can ignore for Drizzle and replace with our own. The other option could be to use a differnt command line option parsing library (perhaps from CCAN, as it’s then maintained by somebody else and doesn’t require heaps and heaps of other stuff).

Another issue I had to tackle is the patch to innobase that’s required to build xtrabackup.

I took a very minimal approach for the Drizzle patch. We are currently based on innobase 1.1.4 from MySQL 5.5 – so I mostly looked at the xtradb55 patch. I think it would be great if these were instead of one giant patch a series of patches to apply (a-la quilt) to a) make iteasier maintain and b) easier for myself to work out the exact reasoning of each bit (also, generating the patches with -p would help a fair bit too).

So how did I do it?

Step 0
was removing support for old innobase – we totally don’t need it for Drizzle.

Step 1
was creating a srv_read_only option for Drizzle’s innobase. This was fairly easy. The one thing I did have to change was adding a checkin os_file_lock() so that we don’t attempt to write lock the ibdatafiles when in read only (otherwise backups can’t be taken while drizzledis running). I’m a little surprised that this wasn’t hit in 5.5 at all.

Step 2
was implementing srv_fake_write. I’m pretty sure I’ve gotten this right in the Drizzle implementation, but the patch wasn’t as easy toread as I’d really like. I probably need to do a bit more of a code audit that this is actually correct (I may try and come up with anLD_PRELOAD library that will scream loudly if writes are made to files matching a pattern).

Step 3
was implemnting srv_apply_log_only. Pretty sure I have this right, again, more testing will be required. Why? Because I’m that paranoid about getting things very, very right.

Step 4
was to go through all the functions that xtrabackup needed to not be static. Instead of having prototypes for them inxtrabackup.cc, I instead added a xtrabackup_api.h header to Innobase and included it where needed (including in xtrabackup). I’d recommend this way going forward for xtrabackup too as it could be a lot less problematic to maintain (and makes xtrabackup source a bit easier to read)

Step 5
was fixing up a few skeleton functions that were needed to make our innobase happy. It may not be a bad idea to split out the skeleton functions into a sep source file so it’s a bit easier to track (and some #ifdefs around those not needed for certain releases).

I’m hoping to work with the upstream xtrabackup devs on the various points I’ve made above.
Another thought of mine is to port xtrabackup into HailDB where we can use much more neat API functions to create good tests for xtrabackup.

Thanks go out to all who’ve worked on xtrabackup. It honestly wasn’t too hardgetting it ported across to Drizzle – and with a bit of collaboration I think we can make it easy to keep up to date.

What’s the future for Xtrabackup in Drizzle? It’ll likely end up being a binary named drizzlebackup-innobase or similar (this means that there is a clear difference between xtrabackup for MySQL and what we have in Drizzle – which is more accurately defined as based on xtrabackup). We’ll also probably want a nice wrapper or integration with a backup tool to deal with everything Drizzle related. We shall also introduce a lot of testing; backups are important.

Xtrabackup is topical, check out the latest OurSQL podcast and the the Percona Xtrabackup website for more info!

Things I’ve done in Drizzle

When writing my Dropping ACID: Eating Data in a Web 2.0 Cloud World talk for LCA2011 I came to the realisation that I had forgotten a lot of the things I had worked on in MySQL and MySQL Cluster. So, as a bit of a retrospective as part of the Drizzle7 GA release, I thought I might try and write down a (incomplete) list of the various things I’ve worked on in Drizzle.

I noticed I did a lot of code removal, that’s all fine and dandy but maybe I won’t list all of that… except perhaps my first branch that was merged :)

2008

  • First ever branch that was merged: some mysys removal (use POSIX functions instead of wrappers that sometimes have different semantics than their POSIX functions), some removal of NETWARE, build scripts that weren’t helpful (i.e. weren’t what any build team ever used to build a release) and some other dead code removal.
  • Improve ‘make test’ time – transactions FTW! (this would be a theme for me over the years, I always want build and test to be faster)
  • Started moving functions out into their own plugins, removing the difference between UDFs (User Defined Functions) and builtin functions. One API to rule them all.
  • Ahhh compiler warnings (we now build with -Werror) and unchecked return codes from system calls (we now add this to some of our own methods, finding and fixing even more bugs). We did enable a lot of compiler warnings and OMG fix a lot of them.
  • Removal of FRM – use a protobuf message to describe the table. The first branch that implemented any of this was merged mid-November. It was pretty minimal, and we still had the FRM around for a little while yet – but it was the beginning of the end for features that couldn’t be implemented due to limitations in the FRM file format. I wrote a few blog entries about this.
  • A lot of test fixes for Drizzle
  • After relating the story of .test (hint: it broke the ability to check out the source tree on Solaris) to Anthony Baxter, he suggested trying something rather nasty… a unicode character above 2^16 – I chose 𝄢 – which at the time didn’t even render on Ubuntu – this was the test case you could only see correctly on MacOS X at the time (or some less broken Linux distro I guess). Some time later, I was actually able to view the test file on Ubuntu correctly.
  • I think it was November 1st when I started to work on Drizzle full time – this was rather awesome, although a really difficult decision as I did rather enjoy working with all the NDB guys.

2009:

  • January sparked the beginning of reading table information from the table protobuf message instead of the FRM file. The code around the FRM file was lovely and convoluted in places – to this day I’m surprised at the low bug count all that effort resulted in. One day I may write up a list of bugs and exploits probably available through the FRM code.
  • My hate for C++ fstream started in Feb 2009
  • In Feb I finally removed the code to work out (and store) in the FRM file how to display a set of fields for entering data into the table on a VT100 80×24 screen.
  • I filed my first GCC bug. The morning started off with a phone call and Brian asking me to look at some strange bug and ended with Intel processor manuals (mmm… the 387 is fun), the C language standard and (legitimately) finding it was a real compiler bug. Two hours and two minutes after filing the bug there was a patch to GCC fixing it – I was impressed.
  • By the end of Feb, the FRM was gone.
  • I spent a bit of time making Drizzle on linux-sparc work. We started building with compiler options that should be much friendlier to fast execution on SPARC processors (all to do with aligning things to word boundaries). -Wcast-align is an interesting gcc flag
  • Moved DDL commands to be in StorageEngine rather than handler for an instance of an open table (now known as Cursor)
  • MyISAM and CSV as temporary table only engines – this means we save a bunch of mucking about in the server.
  • Amazingly enough, sprintf is dangerous.
  • Moved to an API around what tables exist in a database so that the Storage Engines can own their own metadata.
  • Move to have Storage Engines deal with the protobuf table message directly.
  • Found out you should never assume that your process never calls fork() – if you use libuuid, it may. If it wasn’t for this and if we had param-build, my porting of mtr2 to Drizzle probably would have gone in.
  • There was this thing called pack_flag – its removal was especially painful.
  • Many improvements to the table protobuf message (FRM replacement) code and format – moving towards having a file format that could realistically be produced by code that wasn’t the Drizzle database kernel.
  • Many bug fixes, some acused by us, others exposed by us, others had always been there.

2010:

  • embedded_innodb storage engine (now HailDB). This spurred many bug and API fixes in the Storage Engine interface. This was an education in all the corner cases of the various interfaces, where the obvious way to do things was almost always not fully correct (but mostly worked).
  • Engine and Schema custom KEY=VALUE options.
  • Found a bug in the pthread_mutex implementation of the atomics<> template we had. That was fun tracking down.
  • started writing more test code for the storage engine interface and upper layer. With our (much improved) storage engine interface, this became relatively easy to implement a storage engine to test specific bits of the upper layer.
  • Wrote a CREATE TABLE query that would take over four minutes to run. Fixed the execution time too. This only existed because of a (hidden and undocumented) limitation in the FRM file format to do with ENUM columns.
  • Characters versus bytes is an important difference, and one that not all of the code really appreciated (or dealt with cleanly)
  • FOREIGN KEY information now stored in the table protobuf message (this was never stored in the FRM).
  • SHOW CREATE TABLE now uses a library that reads the table protobuf message (this will later be used in the replication code as well)
  • Started the HailDB project
  • Updated the innobase plugin to be based on the current InnoDB versions.
  • Amazingly, noticed that the READ_COMMITTED isolation level was never tested (even in the simplest way you would ever explain READ_COMMITTED).
  • Created a storage engine for testing called storage_engine_api_tester (or SEAPITester). It’s a dummy engine that checks we’re calling things correctly. It exposed even more bugs and strangeness that we weren’t aware of.
  • Fixed a lot of cases in the code where we were using a large stack frame in a function (greater than 32kb).
  • An initial patch using the internal InnoDB API to store the replication log

2011:

  • This can be detailed later, it’s still in progress :)
  • The big highlight: a release.

Persistent index statistics for InnoDB

In browsing the BZR tree for lp:mysql-server, I noticed some rather exciting code had been merged into the Innobase code.

You may be aware that InnoDB will do some index dives when opening a table to get some statistics about the indexes that can help the optimiser make good query plans.

The problem being that this is many disk seeks. It means that on server restart, you have to spend a whole bunch of time seeking around the disk reading index pages.

Not any more.

There is now code merged in to store the calculated statistics in a table inside InnoDB so that these index dives don’t have to happen on startup.

Originally, this looked like it was going to make it into InnoDB+. The good news is that it’s now in a public source tree. I look forward to when it hits a stable release.

(hopefully somebody other than me can beat me to it and write a nice description of the algorithms involved… the code is pretty easy to follow, so it shouldn’t be hard)

Drizzle gets InnoDB 1.0.9

My branch that updates the innobase plugin in Drizzle to be based on innodb_plugin 1.0.9 has been merged. For the next milestone, we’ll probably have 1.0.11 as well.

How’s the progress getting 1.1 and 1.2 in? Pretty good actually. We’ll have it for either this milestone or the next one.

and merging newer innodb into HailDB? It’s going well too, expect more news “soon”.

Second Drizzle Beta (and InnoDB update)

We just released the latest Drizzle tarball (2010-10-11 milestone). There are a whole bunch of bug fixes, but there are two things that are interesting from a storage engine point of view:

  • The Innobase plugin is now based on innodb_plugin 1.0.6
  • The embedded_innodb engine is now named HailDB and requires HailDB, it can no longer be built with embedded_innodb.

Those of you following Drizzle fairly closely have probably noticed that we’ve lagged behind in InnoDB versions. I’m actively working on fixing that – both for the innobase plugin and for the HailDB library.

If building the HailDB plugin (which is planned to replace the innobase plugin), you’ll need the latest HailDB release (which as of writing is 2.3.1). We’re making good additions to the HailDB API to enable the storage engine to have the same features as the Innobase plugin.