For backups, historically in the MySQL world you’ve had mysqldump (a SQL dump, means on restore you have to rebuild indexes), InnoDB Hot Backup (proprietary, but takes a copy of the InnoDB data files, so restore is much quicker), LVM snapshots (various scripts exist, does have larger IO impact, requires LVM) and more recently xtrabackup. Xtrabackup essentially does the same thing as InnoDB hot backup except that it’s free and open source software.
Many people have been using xtrabackup successfully for quite a while now.
In Drizzle7, our default storage engine is InnoDB. There have been a few changes, but it is totally InnoDB. This leaves us with the question of backup solutions. We have drizzledump (the Drizzle equivalent to MySQL dump – although with fewer gotchas), you could always use LVM snapshots and the probability of Oracle releasing InnoDB Hot Backup for Drizzle is rather minimal.
So enter xtrabackup as a possible solution… I had though of porting xtrabackup across for a while. Last weekend, while waiting for one of my iterations of catalog support to compile, I decided to give it a go. I wanted to see how far I could get with it also in that weekend.
I was successful – there’s a tree up at lp:~stewart/drizzle/xtrabackup thatproduces an xtrabackup binary that’s built for Drizzle (it’s not quite ready for merging yet, there are some obivous bugs around command line option parsing… but a backup and restore did work).
I wanted the following:
- build to be integrated with Drizzle, using the same innobase build that we use to build the server
- build with strict compiler warnings and -Werror (which we do forDrizzle)
- build with a C++ compiler (as we do with innobase in Drizzle)
- not re-add parts of mysys into the Drizzle build just for xtrabackup
I’ve already submitted merge requests to upstream xtrabackup containing the compiler fixes and added compiler warnings (they’ve also by now been merged into xtrabackup). Already my work has improved the quality of xtrabackup for everyone. Some of the warnings were fixed slightly differently in xtrabackup than in my Drizzle tree, but I plan to merge.
One issue was that the command line parsing library that xtrabackup uses – my_getopt which is part of mysys (the portability library inside MySQL) is long since gone from Drizzle. We currently use Boost::program_options. Thanks to the heroic efforts of Andrew Hutchings, xtrabackp in Drizzle is also using boost::program_options. This was a brilliant “hey, can you have a look at this conversion” followed by handing him a tree that did not even remotely compile, followed by a “I have to take the kids somewhere, here’s a tree – it may compile”. Amazingly enough, it pretty much did compile once I fixed the other issues.
An unresolved issue is how to deal with this going forward – my guess is that upstream xtrabackup doesn’t want to require Boost.
One solution could be just to factor out command line options into a sepfile that we can ignore for Drizzle and replace with our own. The other option could be to use a differnt command line option parsing library (perhaps from CCAN, as it’s then maintained by somebody else and doesn’t require heaps and heaps of other stuff).
Another issue I had to tackle is the patch to innobase that’s required to build xtrabackup.
I took a very minimal approach for the Drizzle patch. We are currently based on innobase 1.1.4 from MySQL 5.5 – so I mostly looked at the xtradb55 patch. I think it would be great if these were instead of one giant patch a series of patches to apply (a-la quilt) to a) make iteasier maintain and b) easier for myself to work out the exact reasoning of each bit (also, generating the patches with -p would help a fair bit too).
So how did I do it?
was removing support for old innobase – we totally don’t need it for Drizzle.
was creating a srv_read_only option for Drizzle’s innobase. This was fairly easy. The one thing I did have to change was adding a checkin os_file_lock() so that we don’t attempt to write lock the ibdatafiles when in read only (otherwise backups can’t be taken while drizzledis running). I’m a little surprised that this wasn’t hit in 5.5 at all.
was implementing srv_fake_write. I’m pretty sure I’ve gotten this right in the Drizzle implementation, but the patch wasn’t as easy toread as I’d really like. I probably need to do a bit more of a code audit that this is actually correct (I may try and come up with anLD_PRELOAD library that will scream loudly if writes are made to files matching a pattern).
was implemnting srv_apply_log_only. Pretty sure I have this right, again, more testing will be required. Why? Because I’m that paranoid about getting things very, very right.
was to go through all the functions that xtrabackup needed to not be static. Instead of having prototypes for them inxtrabackup.cc, I instead added a xtrabackup_api.h header to Innobase and included it where needed (including in xtrabackup). I’d recommend this way going forward for xtrabackup too as it could be a lot less problematic to maintain (and makes xtrabackup source a bit easier to read)
was fixing up a few skeleton functions that were needed to make our innobase happy. It may not be a bad idea to split out the skeleton functions into a sep source file so it’s a bit easier to track (and some #ifdefs around those not needed for certain releases).
I’m hoping to work with the upstream xtrabackup devs on the various points I’ve made above.
Another thought of mine is to port xtrabackup into HailDB where we can use much more neat API functions to create good tests for xtrabackup.
Thanks go out to all who’ve worked on xtrabackup. It honestly wasn’t too hardgetting it ported across to Drizzle – and with a bit of collaboration I think we can make it easy to keep up to date.
What’s the future for Xtrabackup in Drizzle? It’ll likely end up being a binary named drizzlebackup-innobase or similar (this means that there is a clear difference between xtrabackup for MySQL and what we have in Drizzle – which is more accurately defined as based on xtrabackup). We’ll also probably want a nice wrapper or integration with a backup tool to deal with everything Drizzle related. We shall also introduce a lot of testing; backups are important.