Run Backup, Run!

Over the past N weeks/couple of months, we’ve been making a number of improvements to how backups are done in MySQL Cluster.

Once you get to large data sets, you start to really care about how long a backup takes.

Traditionally, MySQL Cluster has been in-memory only. The way to back this up is to just write from memory to disk (rate limited), synchronised across the cluster. Since memory is really fast (compared to the rate we’re writing out to disk), we never had a problem.
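To make the "rate limited" part concrete, here is a minimal sketch of a throttled writer. It is not the NDB code. The function name `rate_limited_write`, its parameters, and the chunk size are all made up for illustration, and the clock and sleep functions are injectable only so the behaviour is easy to check.

```python
import io
import time

def rate_limited_write(out, data, bytes_per_sec, chunk_size=4096,
                       clock=time.monotonic, sleep=time.sleep):
    """Write `data` to `out` in chunks, never exceeding `bytes_per_sec` on average.

    Hypothetical helper for illustration only; not how NDB does it.
    """
    start = clock()
    written = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        out.write(chunk)
        written += len(chunk)
        # Earliest moment at which `written` bytes are allowed under the limit.
        earliest = start + written / bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return written

if __name__ == "__main__":
    buf = io.BytesIO()
    # 64 KiB at 1 MiB/s: takes roughly 1/16 of a second.
    rate_limited_write(buf, b"x" * 65536, bytes_per_sec=1024 * 1024)
```

The point is just that the source of the data (memory) can always keep up, so the only thing governing backup time is the write rate you allow.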

In MySQL 5.1 (and Cluster Carrier Grade Edition, CGE), disk-based attributes are supported. This means that a row has both in-memory and disk-based parts. As we all (should) know, disk seeks take a very long time. We don’t want to seek.

So, at some point recently we changed the scanning order from in-memory order (which previously made perfect sense) to on-disk order. Randomly seeking through RAM is much cheaper than all the disk seeks you get when scanning in in-memory order. This greatly improved backup performance.
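A toy illustration of why the scan order matters, assuming each row records where its disk part lives. The `Row` type, its fields, and the page numbers are invented for this example and are not NDB structures. It compares the total distance the disk head would travel (measured in pages) for the two scan orders.

```python
from typing import NamedTuple

class Row(NamedTuple):
    key: int          # position in the in-memory structure (hypothetical)
    disk_page: int    # page holding the row's disk-based part (hypothetical)
    disk_offset: int  # offset of the row within that page (hypothetical)

def seek_distance(rows):
    """Sum of absolute page jumps when visiting rows in the given order."""
    return sum(abs(b.disk_page - a.disk_page) for a, b in zip(rows, rows[1:]))

def on_disk_order(rows):
    """Order rows by where their disk part lives, so reads are sequential."""
    return sorted(rows, key=lambda r: (r.disk_page, r.disk_offset))

rows = [Row(0, 90, 0), Row(1, 3, 16), Row(2, 57, 8), Row(3, 3, 0), Row(4, 91, 0)]
memory_order = sorted(rows, key=lambda r: r.key)
disk_order = on_disk_order(rows)

if __name__ == "__main__":
    print("in-memory order:", seek_distance(memory_order), "pages travelled")
    print("on-disk order:  ", seek_distance(disk_order), "pages travelled")
```

In on-disk order the memory accesses become random instead, but that is the cheap kind of random.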

We also did some read-ahead work, which, again, greatly improved performance.

Today, I see mail from Jonas about changing the way we read tuples for backup (and LCP) to make it even more efficient (READ_PACKED). This should also reduce CPU usage for LCP/Backup… which is a casual issue. I should really take the time to look closely at this and review.

I also wrote a patch to the NDB code that writes files to disk, so that it writes a compressed gzio stream instead of an uncompressed one. The compression happens in a different thread, so it can potentially use one of those CPU cores that ndb wouldn’t otherwise use, and it also dramatically reduces the amount of data written to disk. This patch isn’t in any tree yet, and I’ve yet to try it with the READ_PACKED patch; the two together should work rather well.
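The general shape of that idea, sketched in Python rather than the actual NDB C++ code: the producer hands chunks to a queue, and a separate thread drains the queue into a gzip stream. The function names, the queue size, and the `None` sentinel are choices made for this sketch and say nothing about how the real patch is structured.

```python
import gzip
import queue
import threading

def writer_thread(path, chunks):
    """Drain chunks from the queue and append them to a gzip stream."""
    with gzip.open(path, "wb") as out:
        while True:
            chunk = chunks.get()
            if chunk is None:  # sentinel: producer has no more data
                break
            out.write(chunk)

def write_compressed(path, data_chunks):
    """Feed `data_chunks` to a background thread that compresses and writes them."""
    chunks = queue.Queue(maxsize=8)  # bounded, so the producer can't run far ahead
    t = threading.Thread(target=writer_thread, args=(path, chunks))
    t.start()
    for chunk in data_chunks:
        chunks.put(chunk)
    chunks.put(None)
    t.join()

if __name__ == "__main__":
    write_compressed("backup-sketch.gz", (b"row-data " * 1000 for _ in range(50)))
```

The producer only blocks when the queue is full, so the compression work stays off the thread that is generating the data.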

I also need to grab Brian at some point and find out why azio (as used by the ARCHIVE engine) doesn’t work the same way as gzio for basic stream writing…