MySQL Community Portraits needs your $$

There’s an Indiegogo fundraising effort for Julian Cash to come to the Percona Live MySQL Conference and Expo this year to take your photo! I’ve thrown in a bunch of money to help make this happen. Why? It’s much much much cheaper than getting any professional photo shoot done, and it’s by the awesome Julian, who makes pretty awesome photos.

You’ll get full resolution images too! Basically, this is the cheapest way you’re going to get this quality of photos done of yourself outside of dating a professional portrait photographer. Since I’m not dating a professional portrait photographer, this is an excellent and affordable way to have some truly awesome and up to date photos that I can use.

I went and bought the Gold level sponsorship as I feel that $250 is a small price to pay (especially considering Julian’s previous excellent work). You can support it with more $ or less $ (even $0), I chose $250 as it meant I could claim, at least for a short time, that I’ve over half funded it :)

Where are they now: MySQL Storage Engines

There was once a big hoopla about the MySQL Storage Engine Architecture and how it was easy to just slot in some other method of storage instead of the provided ones. Over the years I’ve repeatedly mentioned how this wasn’t really the case and that it was remarkably non-trivial.

Over the years many storage engines have cropped up and then disappeared. So… where are they now?

  • ISAM
    This became MyISAM…. you know you’ve been around MySQL a long time if you’ve ever had to deal with an ISAM table.
  • Gemini
    This was the first big test of the GPL in court. Basically, you have to obey the GPL (see wikipedia for more info). The code was released as GPL and development stopped. This has been dead since ca 2002.
  • Amira – http://launchpad.net/amira
    Antony first mentioned this in 2008 on his blog. This was a continuation of the Gemini engine; you can actually go over to launchpad and get the code. This was one of the projects to have a transactional storage engine not owned by Oracle after Innobase Oy was acquired by them. It went nowhere special, as Netfrastructure was acquired and its engine became Falcon.
  • BDB
    Otherwise known as the BerkeleyDB engine. It was seldom used and never gained much of a userbase. It was unceremoniously dropped back in 2006 and both users didn’t really exist.
  • PBXT - http://pbxt.blogspot.com/
    I think we can credit PBXT with at least half of the features and performance improvements to InnoDB since it first emerged back in 2006. It got attention very quickly. Why? Because it was different. It had the very rare ability to outperform InnoDB in some places. You can still find PBXT in MariaDB, but sadly it can be hard to fund development of a MySQL storage engine, especially one as tied to MySQL as PBXT is, and it’s no longer under active development. Closely related was the Blob Streaming project which was way ahead of its time as an AlsoSQL access method. The good news is that the code was released under a BSD license in 2012 (was previously GPL). We even had PBXT in Drizzle for a while.
  • Blob Streaming (PBMS) - http://bpbdev.blogspot.com/
    This project was closely related to (but not depending exclusively on) PBXT. It embedded a HTTP server inside the database and could use it to read and write BLOBs. This was not only fairly cool but way ahead of its time. We owe the existence of both HandlerSocket and the memcached interface to InnoDB to PBMS (it was also an inspiration for the JSON server plugin for Drizzle, to address some of the use cases of the PBMS plugin).
  • Federated
    It’s still there… but is effectively unmaintained and dead. There’s even FederatedX in MariaDB which is an improvement, but still, the MySQL server really doesn’t lend itself kindly to this type of engine… it’s always been an oddity only suitable for very specific tasks.
  • Archive
    Although useful, effectively unmaintained. I kinda don’t want to say dead… but if it went away, I wouldn’t exactly be surprised.
  • CSV
    Currently used to access the log tables in MySQL… and hardly used otherwise. It’s odd that the same code doesn’t deal with SELECT INTO OUTFILE and LOAD DATA INFILE, and I doubt this will ever change. I’d say effectively niche/dead.
  • SolidDB
    Purchased by IBM, abandoned.
  • DB2
    Only ever on System i. Useful for very very few people… but you can still find it around if you’re one of them.
  • Infobright
    OMG it exists! This is probably because they’re largely just using the MySQL server as a way to implement the MySQL network protocol and all of the heavy lifting is done by their own code.
  • Xeround
    I’m quite surprised these guys are still around, as they’re a proprietary storage engine as a service, and initial testing wasn’t entirely promising.
  • TokuDB
    I cannot emphasize enough how much more interesting TokuDB would be if it were open source. It actually holds some promise… and with their recent work with mongo, perhaps this is a good way forward for them…
  • Maria/Aria
    Another “OMG Oracle just bought Innobase Oy” engine. This was a project to take MyISAM and turn it into a lean, mean, transactional storage engine machine. It’s still not there and I don’t think it ever will be.
  • Falcon
    This was the hot new thing. It came out of Netfrastructure, which MySQL AB acquired in order to help get a transactional storage engine after Innobase Oy was acquired by Oracle. If you’re keeping count, that’s three projects for a transactional storage engine. Falcon was the star though, receiving all the press and publicity (well before it was ready). There are many reasons why Falcon isn’t around today – the chief one probably being that Oracle bought Sun who had bought MySQL and thus a need for an “InnoDB replacement” instantly vanished. There was also immense management pressure for performance to be greater than InnoDB, without any allowance for or focus on correctness…. and this showed. This was quite disappointing as Falcon had a lot of good architectural things going for it.
  • BlitzDB - https://launchpad.net/blitzdb
    I had hoped we’d replace MyISAM with BlitzDB in Drizzle. It was a wrapper around Tokyo Cabinet to the storage engine API in Drizzle. Unfortunately, the ties to MyISAM are incredibly deep (see my recent post on internal temporary tables) and we never quite got there.

I think this is all the notable engines that were aimed at widespread adoption… what ones have I forgotten?

It’s interesting to note that only Archive, CSV, Xeround, TokuDB and Infobright can be gotten anywhere, and the latter two only in their own distribution (one proprietary) and Xeround only as a service.

30 configuration options and counting

While Domas may have rather effectively trolled the discussion with his post on how to configure table/user statistics (which gave me a good chuckle I do have to say), it’s at least incorrect for Percona Server as you have to enable the “userstat” server option :)

That being said, once enabled there are no extra configuration variables to think about. This is a huge advantage over configuring PERFORMANCE_SCHEMA – which has a total of THIRTY configuration options (31 if you include the global enable/disable option).

Some of these thirty odd configuration variables are only going to matter if you’re loading your own plugins, and even then, it’s probably only going to matter if they use the MySQL mutex implementations rather than, say, the standard pthread ones or even other synchronization primitives. It helps that the vast majority of non-InnoDB storage engines are dead. Go on – name one that’s in any form of usage (MyISAM doesn’t count – it’s effectively on death row).

This really makes me want to go and resurrect and finish the perf integration with Drizzle. The operating system provides a whole bunch of performance monitoring tools already, just expose them via SQL and be done with it.

Refactoring Internal temporary tables (another stab at it)

A few weekends ago, I started to again look at the code in Drizzle for producing internal temporary tables. Basically, we have a few types of tables:

  • Standard
  • Temporary (from CREATE TEMPORARY TABLE)
  • Temporary (from ALTER TABLE)
  • Internal temporary (to help with query execution)

If you’re lucky enough to be creating one of the first three types, you go through an increasingly lovely pile of code that constructs a nice protobuf message about what the table should look like and hands all responsibility over to the storage engine as to how to do that. The basic idea is that Drizzle gets the heck out of the way and lets the storage engine do its thing. This code path looks rather different than what we inherited from MySQL. For a start, we actually have a StorageEngine object rather than just lumping everything into the handler (which we correctly name a Cursor). However… the final part, the internal temporary table code is a bit closer to what we inherited from MySQL. There is a good reason for that, it’s ass.

For a start, the table::Singular object is still abused by Item_sum_distinct (see the setup() method) as a tuple (a table with no actual table). This is not ideal and just throws a spanner in the works for refactoring a bunch of code.

The second big problem is that create_tmp_table() doesn’t actually use any normal API calls, instead it manually sets up the table::Singular object. This includes setting up the fields for the table::Singular object in a slightly different way depending on which bit of code called create_tmp_table().

The third big problem is that it’s not storage engine agnostic. Instead of using any existing and sensible way to go and create a temporary table by using the storage engine API it instead creates a series of MI_COLUMNDEF structures which as you may be able to guess, are MyISAM specific and internal data structures.

The fourth big problem is that if we end up using HEAP (again, like MyISAM, hard coded) we don’t even call the create table method on the engine. The HEAP (or MEMORY engine as it’s now known) is magic in that it can create tables on open()!

All of these issues make it really, really hard to have another engine with the ability to handle internal temporary tables. You may recall that MariaDB does include the ability to use the Aria engine for internal temporary tables. No, they did not refactor any of this code, they just made a copy of the code and put in Aria where MyISAM was along with some #ifdef for the feature.

Over the past several years I’ve tried a few times to tease this code out and start the process of turning it into something that is palatable. Every one of those times I’ve either failed or gotten sufficiently frustrated that I’ve given up.

I now have a new strategy though. After looking at the code for a good few hours a few weekends ago, I think I have an idea of where to start…. (now just for a few more free weekends to implement it).

INNODB_SYS_FIELDS vs INNODB_SYS_COLUMNS

In MySQL 5.6 we have two new INFORMATION_SCHEMA tables for InnoDB that are likely going to cause confusion: INNODB_SYS_FIELDS and INNODB_SYS_COLUMNS. You may think these are likely to just be aliases of each other in order to make your life easier. However…

These are not the same thing. The INNODB_SYS_FIELDS table is all about key columns (fields) of InnoDB indexes, while INNODB_SYS_COLUMNS is about actual columns. This is even more confusing as within the MySQL source code, there is the Field set of objects that manipulate fields (columns) in a row.

Blegh. I’m glad it’s Friday.

diffstat of MySQL 5.6 versus 5.5

Yesterday I wrote about what the diffstat between MySQL 5.5 and MariaDB 5.5 was, and before that, about the MariaDB code size as reported by sloccount. Let’s look at MySQL 5.6.

A naive wc based “lines of code” for MySQL 5.6 sql/ directory is ~490kLOC which contrasts with MySQL 5.5 being ~375kLOC by the same measure. If we diffstat the sql/ directory like I did for MariaDB 5.5 we get:

 357 files changed, 172871 insertions(+), 67922 deletions(-)

Versus, as you remember from yesterday for MariaDB 5.5 over MySQL 5.5:

 250 files changed, 83639 insertions(+), 23090 deletions(-)

The MySQL 5.5 to 5.6 sql/ changes line up with what I found in my post MySQL modularity, are we there yet? in that the core server code for MySQL has grown by about 100,000 lines of code.

The jump from MySQL 5.5 to MariaDB 5.5 is a smaller one than jumping from MySQL 5.5 to MySQL 5.6, at least in terms of changed server code.
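To make that comparison concrete, here is a minimal Python sketch (not part of the original measurements) that turns the two diffstat summary lines quoted above into a net change in line count. The function name net_growth is made up for illustration; the only inputs are the summary lines themselves.

```python
import re


def net_growth(summary):
    """Return insertions minus deletions from a diffstat summary line."""
    match = re.search(r"(\d+) insertions?\(\+\), (\d+) deletions?\(-\)", summary)
    if match is None:
        raise ValueError(f"not a diffstat summary: {summary!r}")
    insertions, deletions = map(int, match.groups())
    return insertions - deletions


# Summary lines copied from the two diffstat runs quoted in this post.
mysql_56 = net_growth("357 files changed, 172871 insertions(+), 67922 deletions(-)")
mariadb_55 = net_growth("250 files changed, 83639 insertions(+), 23090 deletions(-)")

print(mysql_56)    # net change in sql/ for MySQL 5.5 -> MySQL 5.6
print(mariadb_55)  # net change in sql/ for MySQL 5.5 -> MariaDB 5.5
```

The net figure for MySQL 5.6 works out to 104,949 lines, which is consistent with the "about 100,000 lines" of growth, against 60,549 for MariaDB 5.5.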

A judgement call on whether a smaller diff is a safer jump or not will rest more on the quality of that code than anything else. As we’ve seen previously, modularity isn’t coming to the MySQL code base any time soon.

So what about the diffstat of MariaDB compared to MySQL?

So, I’ve looked at what sloccount says on the differences between Oracle MySQL over versions of itself and the various MySQL branches around. What I haven’t looked at is the diffstat. Firstly, let’s look at MariaDB.

I’m going to look at MariaDB 5.5.29 as compared to MySQL 5.5.29, both checked out from bzr. A naive diffstat would give us:

 5261 files changed, 1086165 insertions(+), 122751 deletions(-)

And this looks like an awful lot of code that has changed: about 1,086,165 lines! This actually includes a whole other copy of InnoDB in the form of XtraDB. If we take that into account we get:

 5032 files changed, 864997 insertions(+), 125099 deletions(-)

Which is still incredibly high. Let’s look at what’s changed though. We actually see a bunch of changes in the test suite, some of which are relatively harmless, while others, like the change to rpl_tests/rpl_innodb.test, have a “--replace_result MyISAM InnoDB” line added to them, which is awfully odd (possibly legitimate, but it stuck out).

In the end, I came up with this diff command which I think leaves us with the best diff for what is the code difference between MySQL 5.5 and MariaDB 5.5:

 diff -Nru --exclude=BUILD* --exclude=.bzr* --exclude=debian* \
--exclude=man* --exclude=mysql-test* --exclude=win* \
--exclude=unittest* --exclude=test* \
--exclude=support-files* --exclude=README \
--exclude=Docs --exclude=CMakeLists.txt \
--exclude=COPYING.LESSER --exclude=INSTALL* \
--exclude=KNOWN_BUGS.txt \
--exclude=cmake* mysql-5.5.29/ mariadb-5.5.29/

This is not to discount the build and test changes that MariaDB have made, but in this case I feel they distort the numbers a bit and I’ve previously just been counting C and C++ code, so it’s probably fairer this way.

We end up with a diffstat of:

 1156 files changed, 326081 insertions(+), 42751 deletions(-)

If we then exclude the copyright notice changes and any whitespace by changing the start of the diff command to this:

diff -NruiEbwB --ignore-matching-lines='Copyright.*Monty' \
--ignore-matching-lines='Copyright.*Oracle'

We end up with a diffstat of:

 1129 files changed, 322821 insertions(+), 39588 deletions(-)

Which is a little different to what I found in my previous post (MariaDB code size) that just used sloccount. There we found that MariaDB 5.5 was 187,000 more lines of code than MySQL 5.5 while here we find the difference to be 283,000 lines of code. I suspect these differences to be in how diff and sloccount count things. If you do a naive count of the number of lines in source files in the sql/ directory you get 375kLOC while sloccount says 256kLOC.

There is still some noise in this number as there’s some Copyright notices for some of the strings code that changes, but this doesn’t seem to be too much. What about server code though? If we just diffstat the sql/ directory (core server code), then we get:

 250 files changed, 83639 insertions(+), 23090 deletions(-)

Which is still nothing to sneeze at: sloccount tells me that MySQL 5.5.29 only has 256kLOC in the sql/ directory to begin with, and a naive wc count puts it at about 375kLOC.
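For anyone wanting to reproduce the "naive wc count" side of that comparison, a rough Python sketch follows. It only counts newline characters in source files, the way wc -l does, so comments and blank lines are included (which is why it comes out higher than sloccount). The function name and the set of file extensions are assumptions made for this example; the post does not say exactly which files were counted.

```python
from pathlib import Path

# Assumption: which suffixes count as "source" is a guess for illustration.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".h"}


def naive_loc(directory):
    """Count newline characters in source files under directory, like wc -l."""
    total = 0
    for path in Path(directory).rglob("*"):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            # Comments and blank lines are deliberately included.
            total += path.read_bytes().count(b"\n")
    return total
```

Pointing it at a checkout, for example naive_loc("mysql-5.5.29/sql"), gives a number comparable to the wc based figures above, though the exact value depends on which extensions are included.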

Which is bigger: MySQL or PostgreSQL?

From my previous posts, we have some numbers (excluding NDB) for the size of MySQL, so what about PostgreSQL? Here, I used PostgreSQL git trunk and classed things in the contrib/ directory as plugins. I put the number of lines of code in the src/backend/storage directory down as storage engines LoC but did not count it as non-kernel code.

Version           Total LoC  Plugin LoC  Storage Engines LoC  Remaining (kernel)
MySQL 5.5.30      858,441    2,706       171,009              684,726 (79% kernel)
MySQL 5.6.10      1,049,344  29,122      236,067              784,155 (74% kernel)
MariaDB 5.5       1,142,118  11,781      304,015              826,322 (72% kernel)
Drizzle trunk     334,810    31,150      130,727              172,933 (51% kernel)
PostgreSQL trunk  648,691    61,934      17,802               586,757 (90% kernel)

What we can see is that the PostgreSQL kernel size is actually smaller than any recent MySQL version (5.1 was slightly smaller). This is rather interesting as it is generally thought that PostgreSQL does more than MySQL. What’s more telling is that in total code size, PostgreSQL is only a bit over half the size of MySQL 5.6 or MariaDB 5.5 (roughly 57-62% going by the table). Only Drizzle ends up being smaller, which makes sense as it “does less”.
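The "Remaining (kernel)" column is just arithmetic on the other three, and a short Python sketch makes the method explicit. The numbers are copied from the table above; the function names are made up for this example, and the truncating integer percentage is an assumption chosen because it reproduces the percentages shown.

```python
# (total, plugin, storage engine) LoC, copied from the table above.
SIZES = {
    "MySQL 5.5.30": (858_441, 2_706, 171_009),
    "MySQL 5.6.10": (1_049_344, 29_122, 236_067),
    "MariaDB 5.5": (1_142_118, 11_781, 304_015),
    "Drizzle trunk": (334_810, 31_150, 130_727),
    # For PostgreSQL, src/backend/storage was NOT treated as non-kernel code,
    # so its storage figure is left out of the subtraction here.
    "PostgreSQL trunk": (648_691, 61_934, 0),
}


def kernel_loc(total, plugin, storage):
    """Everything that is neither a plugin nor a storage engine."""
    return total - plugin - storage


def kernel_percent(total, plugin, storage):
    """Kernel share of the total as a truncated integer percentage."""
    return 100 * kernel_loc(total, plugin, storage) // total


for name, sizes in SIZES.items():
    print(name, kernel_loc(*sizes), kernel_percent(*sizes))
```

Note the PostgreSQL special case: subtracting the 17,802 lines under src/backend/storage as well would lower its kernel figure slightly, but that is not how the table was built.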

Is MySQL bigger than Linux?

I’m going to take the numbers from my previous post, MySQL Modularity, Are We There Yet? for the “kernel” size of MySQL – that is, everything that isn’t a plugin or storage engine.

For the Linux kernel, I’m just going to use the a-bit-old git tree I have on my laptop. I’ve decided that the following directories are for “plugins” drivers/ arch/ sound/ firmware/ crypto/ usr/ virt/ tools/ scripts/ fs/*/* and everything else is core kernel code.

Version       Total LoC  Total Plugin LoC  Remaining (kernel)
MySQL 5.6.10  1,049,344  265,189           784,155 (74% kernel)
MariaDB 5.5   1,142,118  315,796           826,322 (72% kernel)
Linux         9,983,269  8,824,121         1,159,148 (11% kernel)

The scary thing is that it’s surprisingly close: MySQL/MariaDB core is roughly 68-71% the size of the Linux kernel. This is probably an unfairly large number for Linux too as there’s much more of Linux that is pluggable and modular… so I actually suspect they’re closer to exactly the same size.

If we look at the net/ directory in Linux, it’s a grand total of 493,000 lines of code, all of which is fairly modular and independent. You could, quite reasonably, claim that the core of Linux is in fact closer to half a million lines of code than a million, making MySQL significantly larger.
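The 68-71% range and the effect of setting net/ aside can be checked with a few lines of Python. This is only a sketch using the figures quoted above (the net/ number is the rounded 493,000); the names are made up for the example.

```python
# Kernel ("everything that is not a plugin") sizes from the table above.
MYSQL_56_KERNEL = 784_155
MARIADB_55_KERNEL = 826_322
LINUX_KERNEL = 1_159_148
LINUX_NET = 493_000  # the rounded size of net/ quoted above


def percent_of(part, whole):
    """Size of part relative to whole, as a rounded percentage."""
    return round(100 * part / whole)


mysql_share = percent_of(MYSQL_56_KERNEL, LINUX_KERNEL)
mariadb_share = percent_of(MARIADB_55_KERNEL, LINUX_KERNEL)

# Treat net/ as modular too and see what is left of the Linux "core".
linux_without_net = LINUX_KERNEL - LINUX_NET

print(mysql_share, mariadb_share)  # shares of the Linux core, in percent
print(linux_without_net)           # Linux core with net/ set aside
```

With net/ set aside the Linux core drops to 666,148 lines, which is smaller than the kernel figure for either MySQL 5.6 or MariaDB 5.5.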

So how many engineers are looking after each code base? We know there are over a thousand Linux kernel developers contributing to each release (e.g. https://lwn.net/Articles/395961/ for data back in 2010, and https://lwn.net/Articles/537110/ for Feb 2013).

I’m now going to fudge some things to attempt to work out how many “developers” are working on Linux core code rather than drivers and arch specific things. I work out there’s probably about 20-25% of Linux developers who work on things that are not drivers, filesystems or arch code. This is around 250-300 developers for each kernel release.

So… how many people have ever committed code to MySQL? This is fairly easy to find out: I simply looked at the entire bzr history, grepped out every committer and then uniqued the list (this required more than just sort -u as people used different email addresses and names). How many people have ever committed code to MySQL (i.e. their code can be found in the MySQL 5.6 bzr tree)? 312.

How many committers to MySQL 5.6 are there? 161. This is pretty amazing, that’s about half of what the total is. However, this number is misleading. For example, my name is there and the last commit to the MySQL tree from me was in 2008. You also see names such as Monty Taylor and Kristian Nielsen – all three of us not having worked for MySQL/Sun/Oracle for a great number of years. At the very least, there’s been a lot of code integration into MySQL 5.6 from many existing sources that were not previously in MySQL trunk.
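The "more than just sort -u" step for counting committers can be sketched in Python. This is an illustration only, not the script actually used: it assumes the history is available as bzr log text with "committer:" lines, and the sample log, the names in it and the alias table are all invented. In practice the alias table is the hand-built part.

```python
import re

# Matches the "committer:" lines printed by bzr log; merged revisions are
# indented, hence the leading \s*.
COMMITTER_RE = re.compile(r"^\s*committer:\s*(.+?)\s*$")


def unique_committers(log_lines, aliases):
    """Return the set of canonical committer identities found in the log."""
    found = set()
    for line in log_lines:
        match = COMMITTER_RE.match(line)
        if match is None:
            continue
        identity = match.group(1).lower()
        # Map known alternate names/addresses onto one canonical identity.
        found.add(aliases.get(identity, identity))
    return found


# Hypothetical log excerpt and alias table, invented for this example.
SAMPLE_LOG = [
    "revno: 3",
    "committer: Alice Example <alice@example.com>",
    "revno: 2",
    "committer: alice <alice@laptop.example.com>",
    "    committer: Bob Example <bob@example.com>",
]
ALIASES = {
    "alice <alice@laptop.example.com>": "alice example <alice@example.com>",
}

committers = unique_committers(SAMPLE_LOG, ALIASES)
print(len(committers))
```

A plain sort -u over the same three committer lines would report three people; folding the second address into the first brings it down to two.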

MySQL modularity, are we there yet?

MySQL is now over four times the size it was with MySQL 3.23. This has not come in the shape of plugins.

Have we improved modularity over time? I decided to take LoC count for plugins and storage engines (in the case of Drizzle, memory, myisam and innobase are storage engines and everything else comes under plugin). I’ve excluded NDB from these numbers as it is rather massive and is pretty much still a separate thing.

Version        Total LoC  Plugin LoC  Storage Engines LoC  Remaining (kernel)
MySQL 3.23.58  371,987    0 (0%)      176,276              195,711 (52% kernel)
MySQL 5.1.68   721,331    228         237,124              483,979 (67% kernel)
MySQL 5.5.30   858,441    2,706       171,009              684,726 (79% kernel)
MySQL 5.6.10   1,049,344  29,122      236,067              784,155 (74% kernel)
MariaDB 5.5    1,142,118  11,781      304,015              826,322 (72% kernel)
Drizzle trunk  334,810    31,150      130,727              172,933 (51% kernel)

I’ve used the non-plugin and non-storage engine code size to be the database “kernel” – i.e. the core of the database server.

What I find really interesting here is that yes, the amount of code that is to some degree modular has increased. The amount of code that is a MySQL plugin is still very small compared to the server size.

Drizzle is 20-25% of the size of a modern MySQL or MariaDB server and for many applications does largely or exactly the same thing.

Other MySQL branch code sizes

Continuing on from my previous posts, MySQL code size over releases and MariaDB code size I’ve decided to also look into some other code branches. I’ve used the same methodology as my previous few posts: sloccount for C and C++ code only.

There are also other branches around in pretty widespread use (if only within a single company). I grabbed the Google, Facebook and Twitter patches and examined them too, along with Percona Server 5.1 and 5.5.

Codebase                  LoC (C, C++)  +/- from MySQL
Google v4 patch 5.0.37    970,110       +26,378 (from MySQL 5.0.37)
MySQL@Facebook            1,087,715     +15,768 (from MySQL 5.1.52)
Twitter 5.5.29.t10        1,192,718     +3,624
Percona Server 5.1 trunk  1,066,418     +14,878 (from MySQL 5.1.66)
Percona Server 5.5 trunk  1,208,577     +19,483 (from MySQL 5.5.29), +142,159 (from PS 5.1)
Drizzle trunk             334,810

The Google patch has always had a reputation of being large, and with an extra 26kLOC of code, it certainly is the biggest of any of the more current branches. It’s actually a surprise to me that it adds this much code.

The Facebook and Percona Server 5.1 branches are amazingly similar in how much extra code they add, and they’re not carbon copies of each other. The Twitter patch is quite notable for how little extra code it adds.

For giggles, I included Drizzle – which is (even with all the plugins) less than a third of the size of MySQL 5.1.

It’s clear that the Percona Server and Facebook patches introduce much less code than MariaDB does, which does go with the general wisdom of them being closer to Oracle MySQL than MariaDB is.

If we look at Percona Server, we see that with Percona Server 5.5 there is indeed a bunch more code than was in Percona Server 5.1, with roughly 5,000 more lines of code than we’d expect from a simple port from MySQL 5.1 to MySQL 5.5. This feels about right, we’ve added new things to Percona Server 5.5 that weren’t in Percona Server 5.1.

MariaDB code size

Continuing on from my previous post, MySQL code size over releases.

I wanted to look at the different branches/patch sets of MySQL out there and work out how far from upstream they deviated. I’m just going to compare against whatever upstream version the most easily accessible version is based on (be it 5.0.x, 5.1.x or whatever).

For MariaDB versions, I removed innodb_plugin and replaced it with xtradb for stats purposes as the MariaDB innodb_plugin is essentially the same as upstream and I don’t want to artificially inflate the diff size.

The first three major versions of MariaDB were all based on MySQL 5.1. I used sloccount and only counted C and C++ code.

So, let’s look at some of the MySQL patch sets/branches that are around. Firstly, let’s look at MariaDB:

Codebase     LoC (C, C++)  +/- from MySQL             +/- from prev major version
MariaDB 5.1  1,210,168     +157,532                   0
MariaDB 5.2  1,227,434     +174,798                   +17,266 (since MariaDB 5.1)
MariaDB 5.3  1,264,995     +212,359                   +37,561 (since MariaDB 5.2)
MariaDB 5.5  1,377,405     +187,658 (from MySQL 5.5)  +112,410 (since MariaDB 5.3)

From my previous post on lines of code in MySQL versions, we learned that with MySQL 5.6 we saw a 354kLOC increase over MySQL 5.5. What is quite surprising is how close some of the MariaDB differences are to this. With MariaDB 5.5, we’re looking at a 187kLOC difference, which is a little over half that of MySQL 5.6. What’s also interesting is that each incremental MariaDB release has not added nearly as much code as the MySQL 5.1 to 5.5 and 5.5 to 5.6 jumps did.

[Figure: MariaDB LoC over major versions]

The MariaDB code size has also been increasing. If we look at the graph above, you can really see the jump in code size over the past few releases.

If we look at the delta between MariaDB and MySQL, the first MariaDB release (MariaDB 5.1) was certainly a large jump. Each incremental MariaDB release (5.2 and 5.3) has been a smaller delta than the initial one. With MariaDB 5.5 we actually decrease the delta from MySQL, which is something that’s interesting to look at.

If we were doing a straight port of MariaDB 5.3 to be based off MySQL 5.5, we’d expect the delta to be around 137kLOC (what MySQL 5.1 to 5.5 is) but it isn’t. The difference to MariaDB 5.5 from MariaDB 5.3 is only ~112kLOC, and on the whole the delta decreases.
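The reasoning about the shrinking delta is easy to lose in prose, so here it is as a small Python sketch. The MariaDB sizes come from the table above; the MySQL 5.1 and 5.5 sizes are the upstream totals implied by that table (each MariaDB size minus its delta from MySQL). The variable names are made up for this example.

```python
# sloccount totals (C and C++ only).
MYSQL = {"5.1": 1_052_636, "5.5": 1_189_747}    # implied upstream sizes
MARIADB = {"5.3": 1_264_995, "5.5": 1_377_405}  # from the table above

# How much each code base grew between the two versions.
upstream_growth = MYSQL["5.5"] - MYSQL["5.1"]
mariadb_growth = MARIADB["5.5"] - MARIADB["5.3"]

# How far each MariaDB release is from the MySQL it is based on.
delta_53 = MARIADB["5.3"] - MYSQL["5.1"]
delta_55 = MARIADB["5.5"] - MYSQL["5.5"]

print(upstream_growth, mariadb_growth)
print(delta_53, delta_55, delta_53 - delta_55)
```

MariaDB grew by 112,410 lines while upstream grew by 137,111, and the gap between those two numbers (24,701) is exactly how much the delta from MySQL shrank, from 212,359 to 187,658.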

But what makes up this big initial jump for MariaDB? Let’s look at some of the MariaDB 5.1 only modules and what’s left:

MariaDB 5.1 component  LoC (MariaDB 5.1)
PBXT                   45,107
FederatedX             3,076
IBM DB2i               13,486
Total                  61,669
Other                  95,863

So the MariaDB delta is not increased just because they included some existing modules; there’s more code in there, about as much as any major MySQL version bump.

Tomorrow we look at other MySQL branches, and we see that the MariaDB delta truly is significantly larger than any other MySQL branch.

MySQL code size over releases

As the start of a bit of a delve into the various MySQL branches and patch sets that have been around, let’s start looking at the history of MySQL itself. This is how big MySQL has been over all of the major releases since the beginning (where beginning=3.23). (edit: These numbers were all gathered using sloccount and only counting C++ and C source files.)

Codebase       LoC (C, C++)  +/- from previous MySQL
MySQL 3.23.58  371,987       0
MySQL 4.0.30   368,695       -3,292 (from MySQL 3.23)
MySQL 4.1.24   859,572       +490,877 (from MySQL 4.0); +174,352 excluding NDB
MySQL 5.0.96   916,667       +57,095 (from MySQL 4.1)
MySQL 5.1.68   1,052,636     +135,969 (from MySQL 5.0)
MySQL 5.5.30   1,189,747     +137,111 (from MySQL 5.1)
MySQL 5.6.10   1,544,202     +354,455 (from MySQL 5.5)
[Figure: MySQL code size over major versions (increase in MySQL source code size over version). Note the sharp increase in 5.6.]

[Figure: LoC delta for each major MySQL release]

We can see that MySQL has had some interesting code size changes over time, the big jump in 4.1 over 4.0 was mostly due to the introduction of MySQL Cluster, but even so, it was a big jump.

Setting aside the NDB code that arrived in 4.1, MySQL 5.6 is the largest MySQL code size increase in a MySQL version ever. The last time we saw anything like this was with the merging of MySQL Cluster in 4.1. At the very least, Oracle is paying people to write lines of code to an extent that nobody has before.
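A short Python sketch shows how the per-release deltas fall out of the sloccount totals, and why 5.6 only counts as the biggest jump once NDB is set aside for 4.1. The totals and the NDB-excluded figure are the ones given in this post; the names are made up for the example.

```python
# sloccount totals (C and C++ only) for each major release, in order.
RELEASES = [
    ("3.23.58", 371_987),
    ("4.0.30", 368_695),
    ("4.1.24", 859_572),
    ("5.0.96", 916_667),
    ("5.1.68", 1_052_636),
    ("5.5.30", 1_189_747),
    ("5.6.10", 1_544_202),
]

# The 4.1 increase with NDB (MySQL Cluster) excluded, as quoted in the post.
WITHOUT_NDB = {"4.1.24": 174_352}


def release_deltas(releases):
    """Map each release to its change in LoC from the previous one."""
    return {
        version: loc - previous_loc
        for (_, previous_loc), (version, loc) in zip(releases, releases[1:])
    }


raw = release_deltas(RELEASES)
adjusted = {**raw, **WITHOUT_NDB}

print(max(raw, key=raw.get))            # biggest jump, NDB included
print(max(adjusted, key=adjusted.get))  # biggest jump, NDB set aside
```

On the raw numbers the 4.1 release is the largest jump (+490,877); with the NDB merge taken out it drops to +174,352 and 5.6 (+354,455) takes the top spot.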

Sessions at the Percona Live MySQL Conference that interest me

For the past many years, there’s been a conference in April, at the Santa Clara Convention Centre where the topic has been MySQL and the surrounding ecosystem. The first year I went, I gave a talk on the new features in MySQL Cluster 5.1 to an overflowing room of attendees. For me, it’s an event that’s mixed with speaking about something I’ve been working on and talking to other attendees about everything from how a particular part of the server works to where we can escape to for nearby good vegan food.

So, I thought I’d share some of the sessions that I’m really looking forward to. My selection is probably atypical, but may be interesting to others. I’m not going to list the keynotes, although they are often of a lot of value. I’m also going to attempt to avoid listing a few really awesome well known speakers simply because there are other really interesting sessions that also need exposure!

  • Starring Sakila: Building Data Warehouses and BI solutions using MySQL and Pentaho
    I need to base decisions off data, not simply a gut feeling (I’m not Stephen Colbert after all). I ran into a bunch of stumbling blocks when trying to work with Pentaho a couple of weeks ago, and I’m really hoping that this session shines some light on how to use it to better and more easily make arguments based on evidence to others in the company.
  • Testing MySQL Databases: The State Of The Art
    I’ve worked with Patrick for several years now, and he’s currently a valuable member of my team at Percona. For those who are interested in the state of the art of open source database testing, this is the session to be in.
  • Getting InnoDB Compression Ready for Facebook Scale
    This session is on at the same time as I’m speaking, so I probably won’t be able to attend (people keep coming to my sessions so I usually can’t sneak out). I’m really interested in how they’ve modified the compression code to help with their (large) workload.
  • Backing Up Facebook
    I hear that Facebook has a couple of database servers, a few dozen users and a few floppy disks full of data. This should be a fun story :)
  • Introducing XtraBackup Manager
    Being responsible for XtraBackup development at Percona, the XtraBackup topics really interest me. Lachlan has been working on a simple backup manager for XtraBackup to help create something that is a more complete backup solution than a tool which simply creates a backup.
  • Extending Xtrabackup – A Point-In-Time System
    Another good case of using XtraBackup as part of a comprehensive backup strategy. I have to be honest, I’m looking for ways in which we can improve XtraBackup to better fit the needs of people. It may be that there are a few small things we can do to make it easier for people to deploy and use.
  • Getting Started with Drizzle 7.1
    We’re about to do the 7.1 release of Drizzle! If you’re interested in having a SQL database that is designed to be used in large scale web applications and cloud environments, come along to this talk.
  • MySQL Idiosyncrasies That Bite
    I have to admit, I’m interested in Ronald’s talk here to basically ensure we didn’t miss fixing anything in Drizzle. I do promise not to at any point yell out “Fixed in Drizzle” though.

Go here to register: http://www.percona.com/live/mysql-conference-2012/ (early bird pricing and discounted hotel rooms end March 12th, so you want to register sooner rather than later).

Puppet + Vagrant + Jenkins = automated bliss

I’m currently teaching myself how to do Puppet. Why? Well, at Percona we support a bunch of platforms for our software. This means we have to maintain a bunch of Jenkins slaves to build the software on. We want to add new machines and have (up until now) maintained a magic “apt-get install” command line in the Jenkins EC2 configuration. This isn’t an ideal situation and there’s been talk of getting Puppet to do the heavy lifting for a while.

So I sat down to do it.

Step 1: take the “apt-get install” line and convert it into puppet speak.

This was pretty easy. I started off with Vagrant starting an Ubuntu Lucid 32 VM (just like in the Vagrant getting started guide) and enabled the provision using puppet bit.

Step 2: find out you need to run “apt-get update”

There had been updates since the base VM I’m using was made, so I needed to make any package installation depend on running “apt-get update” to ensure I was both installing the latest version and that the repositories would have the files I was looking for.

This was pretty easy (once I knew how):

exec {"apt-update":
       command => "/usr/bin/apt-get update",
}
Exec["apt-update"] -> Package <| |>

This simply does two things: specify to run “apt-get update” and then specify that any package install depends on having run “apt-update” first.

I’ve also needed things such as:

case $operatingsystem {
     debian, ubuntu: { $libaiodev = "libaio-dev" }
     centos, redhat: { $libaiodev = "libaio-devel" }
     default: { fail("Unrecognised OS for libaio-dev") }
}
package { "libaio-dev":
          name => $libaiodev,
          ensure => latest,
}

The idea being that when I go and test all this stuff running on CentOS, it should mostly “just work” there too.

The next step? Setting up and running the Jenkins slave.

Information on Bug#12704861 (which doesn’t exist in any public bug tracker)

Some of you may be aware that MySQL is increasingly using an Oracle-internal bug tracker. You can see these large bug numbers mentioned alongside smaller public bug numbers in recent MySQL release notes. If you’re particularly unlucky, you just get a big Oracle-internal bug number. For one recently fixed bug, I dug further and posted what I found on the Percona blog: http://www.mysqlperformanceblog.com/2011/11/20/bug12704861/

Possibly interesting reading for those of you who are interested in InnoDB, MySQL, BLOBs and crash recovery.

Speaking on Tuesday: HailDB and Dropping ACID: Eating Data in a Web 2.0 Cloud World

I’m giving two talks tomorrow (Tuesday) at the MySQL Conference and Expo:

HailDB: A NoSQL API direct to InnoDB, 2:00pm, Ballroom D

Dropping ACID: Eating Data In A Web 2.0 Cloud World, 3:05pm, Ballroom G

The HailDB talk is all about a C API to embed an InnoDB based relational database engine into your application. Awesome stuff (also nice and technical).

The second talk, “Dropping ACID: Eating Data in a Web 2.0 Cloud World” is not only a joke that only database people get, but a humorous and serious look at data integrity and reliability as promised by the current hype. This was quite well received at linux.conf.au in January. So, if you weren’t in Australia in January this year, then certainly come along and see how you go heckling an Australian.

innodb and memcached

I had a quick look at the source tree (I haven’t compiled it, just read the source – that’s what I do. I challenge any C/C++ compiler to keep up with my brain!) that’s got a tarball up on labs.mysql.com for the memcached interface to innodb. A few quick thoughts:

  • Where’s the Bazaar tree on launchpad? I hate pulling tarballs, following the dev tree is much more interesting from a tech perspective (especially for early development releases). I note that the NDB memcached stuff is up on launchpad now, so yay there. I would love it if the InnoDB team in general was much more open with development, especially with having source trees up on launchpad.
  • It embeds a copy of the memcached server engines branch into the MySQL tree. This is probably the correct way to go. There is no real sense in re-implementing the protocol and network stack (this is about half what memcached is anyway).
  • The copy of the memcached engine branch seems to be a few months old.
  • The current documentation appears to be the source code.
  • The innodb_memcached plugin embeds a memcached server using an API to InnoDB inside the MySQL server process (basically so it can access the same instance of InnoDB as a running MySQL server).
  • There’s a bit of something that kind-of looks similar to the Embedded InnoDB (now HailDB) API being used to link InnoDB and memcached together. I can understand why they didn’t go through the MySQL handler interface… this would be bracing to say the least to get correct. InnoDB APIs, much more likely to have fewer bugs.
  • If this accepted JSON and spat it back out… how fast would MongoDB die? weeks? months?
  • The above dot point would be a lot more likely if adding a column to an InnoDB table didn’t involve epic amounts of IO.
  • I’ve been wanting a good memcached protocol inside Drizzle; we have, of course, focused on stability of what we do have first. That being said…. upgrade my flight home so I can open a laptop… probably be done well before I land….. (assuming I don’t get to it in the 15 other awesome things I want to hack on this week)

Drizzle online backup with xtrabackup

For backups, historically in the MySQL world you’ve had mysqldump (a SQL dump, means on restore you have to rebuild indexes), InnoDB Hot Backup (proprietary, but takes a copy of the InnoDB data files, so restore is much quicker), LVM snapshots (various scripts exist, does have larger IO impact, requires LVM) and more recently xtrabackup. Xtrabackup essentially does the same thing as InnoDB hot backup except that it’s free and open source software.

Many people have been using xtrabackup successfully for quite a while now.

In Drizzle7, our default storage engine is InnoDB. There have been a few changes, but it is totally InnoDB. This leaves us with the question of backup solutions. We have drizzledump (the Drizzle equivalent to mysqldump, although with fewer gotchas), you could always use LVM snapshots, and the probability of Oracle releasing InnoDB Hot Backup for Drizzle is rather minimal.

So enter xtrabackup as a possible solution… I had thought of porting xtrabackup across for a while. Last weekend, while waiting for one of my iterations of catalog support to compile, I decided to give it a go. I also wanted to see how far I could get with it in that one weekend.

I was successful – there’s a tree up at lp:~stewart/drizzle/xtrabackup that produces an xtrabackup binary that’s built for Drizzle (it’s not quite ready for merging yet, there are some obvious bugs around command line option parsing… but a backup and restore did work).

I wanted the following:

  • build to be integrated with Drizzle, using the same innobase build that we use to build the server
  • build with strict compiler warnings and -Werror (which we do for Drizzle)
  • build with a C++ compiler (as we do with innobase in Drizzle)
  • not re-add parts of mysys into the Drizzle build just for xtrabackup

I’ve already submitted merge requests to upstream xtrabackup containing the compiler fixes and added compiler warnings (they’ve also by now been merged into xtrabackup). Already my work has improved the quality of xtrabackup for everyone. Some of the warnings were fixed slightly differently in xtrabackup than in my Drizzle tree, but I plan to merge.

One issue was that the command line parsing library that xtrabackup uses – my_getopt, which is part of mysys (the portability library inside MySQL) – is long since gone from Drizzle. We currently use boost::program_options. Thanks to the heroic efforts of Andrew Hutchings, xtrabackup in Drizzle is also using boost::program_options. This was a brilliant “hey, can you have a look at this conversion” followed by handing him a tree that did not even remotely compile, followed by a “I have to take the kids somewhere, here’s a tree – it may compile”. Amazingly enough, it pretty much did compile once I fixed the other issues.

An unresolved issue is how to deal with this going forward – my guess is that upstream xtrabackup doesn’t want to require Boost.

One solution could be just to factor out command line options into a separate file that we can ignore for Drizzle and replace with our own. The other option could be to use a different command line option parsing library (perhaps from CCAN, as it’s then maintained by somebody else and doesn’t require heaps and heaps of other stuff).
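
To make the first option a bit more concrete, here is a minimal sketch in C of what such a split could look like. It is only an illustration and not code from either tree: the backup_options struct and the parse_backup_options() function are hypothetical names, and only three of xtrabackup’s options are shown. The idea is that the backup code only ever sees the struct, while the file containing the parser is the piece each project swaps out (this version uses getopt_long(); a Drizzle version could fill in the same struct from boost::program_options).

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the only view of the command line the backup core gets. */
struct backup_options {
    int backup;              /* --backup was given */
    int prepare;             /* --prepare was given */
    const char *target_dir;  /* value of --target-dir=DIR, or NULL */
};

/* Hypothetical interface. This getopt_long() version would live in its own
 * source file so another project can replace it with its own parser.
 * Returns 0 on success, -1 on an unknown or malformed option. */
int parse_backup_options(int argc, char **argv, struct backup_options *out)
{
    static const struct option longopts[] = {
        {"backup",     no_argument,       NULL, 'b'},
        {"prepare",    no_argument,       NULL, 'p'},
        {"target-dir", required_argument, NULL, 't'},
        {NULL, 0, NULL, 0}
    };
    int c;

    assert(out != NULL);
    memset(out, 0, sizeof *out);
    optind = 1;  /* start scanning at argv[1] */

    /* Empty short-option string: only the long options are accepted. */
    while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
        switch (c) {
        case 'b': out->backup = 1;          break;
        case 'p': out->prepare = 1;         break;
        case 't': out->target_dir = optarg; break;
        default:  return -1;
        }
    }
    return 0;
}
```

With this shape, nothing outside the one parser file needs to know which option parsing library is in use.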

Another issue I had to tackle is the patch to innobase that’s required to build xtrabackup.

I took a very minimal approach for the Drizzle patch. We are currently based on innobase 1.1.4 from MySQL 5.5 – so I mostly looked at the xtradb55 patch. I think it would be great if, instead of one giant patch, these were a series of patches to apply (a-la quilt) to a) make it easier to maintain and b) make it easier for me to work out the exact reasoning of each bit (also, generating the patches with -p would help a fair bit too).

So how did I do it?

Step 0
was removing support for old innobase – we totally don’t need it for Drizzle.

Step 1
was creating a srv_read_only option for Drizzle’s innobase. This was fairly easy. The one thing I did have to change was adding a check in os_file_lock() so that we don’t attempt to write lock the ibdata files when in read only (otherwise backups can’t be taken while drizzled is running). I’m a little surprised that this wasn’t hit in 5.5 at all.
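
The shape of that check can be sketched in a few lines of C. This is a simplified illustration rather than the actual innobase code: read_only_mode stands in for srv_read_only, lock_data_file() stands in for os_file_lock(), and it assumes the lock in question is an fcntl() write lock on the whole file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the srv_read_only flag. */
int read_only_mode = 0;

/* Hypothetical stand-in for os_file_lock(): take an exclusive advisory
 * lock on a data file. Returns 0 on success, -1 on failure. */
int lock_data_file(int fd, const char *name)
{
    struct flock lk;

    assert(name != NULL);

    /* Read-only (backup) mode: the running server already holds the
     * write lock, so asking for another one would always fail. */
    if (read_only_mode)
        return 0;

    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;     /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;            /* 0 means "to end of file": the whole file */

    if (fcntl(fd, F_SETLK, &lk) == -1) {
        fprintf(stderr, "Unable to lock %s: %s\n", name, strerror(errno));
        return -1;
    }
    return 0;
}
```

Without the early return, a second process opening the same files would fail at the fcntl() call for as long as the server holds its lock, which is exactly the backup-while-running case.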

Step 2
was implementing srv_fake_write. I’m pretty sure I’ve gotten this right in the Drizzle implementation, but the patch wasn’t as easy to read as I’d really like. I probably need to do a bit more of a code audit that this is actually correct (I may try and come up with an LD_PRELOAD library that will scream loudly if writes are made to files matching a pattern).
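
For the curious, the LD_PRELOAD idea could look something like the following minimal C sketch. It is an illustration only, not something that exists in the Drizzle tree: the WRITE_WATCH_PATTERN environment variable, the default pattern and the helper names are all made up for the example. It resolves file descriptors through /proc/self/fd (so Linux only) and it only wraps write(); a real audit tool would also need to wrap pwrite(), writev() and friends, since a storage engine is likely to do positioned writes.

```c
#define _GNU_SOURCE            /* for RTLD_NEXT; must come before includes */
#include <assert.h>
#include <dlfcn.h>
#include <fnmatch.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical knob: shell-style glob of files that must not be written. */
static const char *watch_pattern(void)
{
    const char *p = getenv("WRITE_WATCH_PATTERN");
    return p ? p : "*ibdata*";
}

/* Returns non-zero if the path matches the watch pattern. */
int path_is_watched(const char *path)
{
    assert(path != NULL);
    /* No FNM_PATHNAME, so '*' is allowed to match across '/'. */
    return fnmatch(watch_pattern(), path, 0) == 0;
}

/* Map the fd back to a path via /proc and complain loudly on a match. */
static void check_fd(int fd, const char *call)
{
    char link[64];
    char path[4096];
    ssize_t n;

    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    n = readlink(link, path, sizeof path - 1);
    if (n < 0)
        return;                /* cannot resolve the fd: stay quiet */
    path[n] = '\0';

    if (path_is_watched(path))
        fprintf(stderr, "WRITE-WATCH: %s() on %s\n", call, path);
}

/* Interposed write(): check the target, then call the real write(). */
ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);

    if (!real_write)
        *(void **)(&real_write) = dlsym(RTLD_NEXT, "write");

    check_fd(fd, "write");
    return real_write(fd, buf, count);
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc) and loaded by setting LD_PRELOAD when running the binary under test, any write() to a matching file shows up on stderr while the write itself still goes through.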

Step 3
was implementing srv_apply_log_only. Pretty sure I have this right; again, more testing will be required. Why? Because I’m that paranoid about getting things very, very right.

Step 4
was to go through all the functions that xtrabackup needed to not be static. Instead of having prototypes for them in xtrabackup.cc, I added an xtrabackup_api.h header to Innobase and included it where needed (including in xtrabackup). I’d recommend this way going forward for xtrabackup too, as it could be a lot less problematic to maintain (and makes the xtrabackup source a bit easier to read).

Step 5
was fixing up a few skeleton functions that were needed to make our innobase happy. It may not be a bad idea to split out the skeleton functions into a separate source file so it’s a bit easier to track (with some #ifdefs around those not needed for certain releases).

I’m hoping to work with the upstream xtrabackup devs on the various points I’ve made above.
Another thought of mine is to port xtrabackup into HailDB, where we can use much neater API functions to create good tests for xtrabackup.

Thanks go out to all who’ve worked on xtrabackup. It honestly wasn’t too hard getting it ported across to Drizzle – and with a bit of collaboration I think we can make it easy to keep up to date.

What’s the future for Xtrabackup in Drizzle? It’ll likely end up being a binary named drizzlebackup-innobase or similar (this means that there is a clear difference between xtrabackup for MySQL and what we have in Drizzle – which is more accurately defined as based on xtrabackup). We’ll also probably want a nice wrapper or integration with a backup tool to deal with everything Drizzle related. We shall also introduce a lot of testing; backups are important.

Xtrabackup is topical, check out the latest OurSQL podcast and the Percona Xtrabackup website for more info!