Pluggable Metadata stores (or… the revenge of table discovery)

Users of the ARCHIVE or NDB storage engines in MySQL may be aware of a MySQL feature known as “table discovery”. For ARCHIVE, you can copy the archive data file around between servers and it magically works (you don’t need to copy the FRM). For MySQL Cluster (NDB), it works so that when you CREATE TABLE on one MySQL server, other MySQL servers can fetch the FRM for that table from the cluster.

With my work to replace the FRM with a protobuf structure in Drizzle and clean up parts of the API around it, this feature didn’t really survive in any working state.

Instead, I’m now doing things closer to the right way: pluggable metadata stores. The idea is that the whole “table proto on disk” code (in MySQL it’s the FRM, but in Drizzle we’re now using a protobuf structure) is pluggable and could be replaced by an implementation specific to an engine (e.g. the InnoDB or NDB data dictionaries) or by a different generic one.

Currently, the default plugin works the same way we’ve been doing it forever: a file per table on disk, in a directory per database. The API has a nasty bit right now (mmmm… table name encoding), but that’ll be fixed in the future.

The rest of this week will be dedicated to plugging this into all the bits in the server that manipulate the files manually.

With luck, I’ll have modified the ARCHIVE engine by then too so that there’ll just be the archive data file on disk with the table metadata stored in it.

Save the Devil: it’s what the cool kids are doing

Following on from linux.conf.au, Dreamhost is now doing a $50-discount-for-you, $50-to-the-Devil deal.

The money goes to real research – on an infectious cancer that is fatal to the Devils.

We managed to raise an amazing amount of money at linux.conf.au for the Devils (expect a press release with the final tallies real-soon-now, as the last of the pledges trickles into our bank account).

So save a cartoon character: if you haven’t already, head to tassiedevil.com.au to find out what you can do.

MySQL Storage Engine SLOCCount over releases

For a bit more info, what about the various storage engines over MySQL releases? Have they changed much? Here we’re looking at the storage/X/ directory for code, so for some engines this excludes the handler that interfaces with the MySQL Server.

You can view the data on the spreadsheet.

NDB Kernel size over releases

So Jonas pointed out that the NDB kernel hasn’t changed too much in size over releases. Let’s have a look:

In fact, the size went down slightly from 4.1 to 5.0. In this, 6.4 and 7.0 are the same thing but appear twice for completeness.

You can see the raw results in the spreadsheet here.

Size of Storage Engines

For whatever reason, let’s look at “Total Physical Source Lines of Code” from a recent mysql-6.0 tree (and PBXT from PBXT source repo):

See the spreadsheet here.

Raw data:

Engine          SLOC
Blackhole        336
CSV             1143
Archive         2960
MyISAM         34019
PBXT           41732
Maria          69019
InnoDB         82557
Falcon         91158
NDB           365272

NDB has a 100,000 line test suite.

PBXT supports MySQL and Drizzle.

Conclusions to draw? Err… none really.

Congratulations Sheeri on having the book out!

The MySQL Administrator’s Bible is out. Writing a book is not something you can just squeeze into a Sunday afternoon; it takes real dedication and more effort than you could possibly imagine.

So congrats on getting the book for MySQL DBAs (and I’d venture to say application devs should also be reading it) out and on Amazon, where people can buy it now.

Does Linux fallocate() zero-fill?

In an email discussion about pre-allocating binlogs for MySQL (something we’ll likely have to do for Drizzle and replication), Yoshinori brought up the excellent point that in some situations you don’t want to be doing zero-fill, as getting up and running quickly is the most important thing.

So what does Linux do? Does it zero-fill, or behave sensibly and pre-allocate quickly?

Let’s look at the kernel.

Inside the fallocate implementation (fs/open.c):

	if (inode->i_op->fallocate)
		ret = inode->i_op->fallocate(inode, mode, offset, len);
	else
		ret = -EOPNOTSUPP;

and for ext4:

	/*
	 * currently supporting (pre)allocate mode for extent-based
	 * files _only_
	 */
	if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
		return -EOPNOTSUPP;

XFS has always done fast pre-allocate, so it’s not a problem there, and the only other filesystems to currently support fallocate are btrfs and ocfs2 – which we don’t even have to start worrying too much about yet :)

But this is just kernel behaviour – I *think* libc ends up wrapping it, treating EOPNOTSUPP from the kernel as “let me zero-fill” (which might be useful to check). Anybody want to check libc for me?

This was all on a kernel slightly post 2.6.30-rc3 (git commit 8c9ed899b44c19e81859fbb0e9d659fe2f8630fc).

C++ STL bitset only useful for known-at-compile-time number of bits

Found in the libstdc++ docs:

Extremely weird solutions. If you have access to the compiler and linker at runtime, you can do something insane, like figuring out just how many bits you need, then writing a temporary source code file. That file contains an instantiation of bitset for the required number of bits, inside some wrapper functions with unchanging signatures. Have your program then call the compiler on that file using Position Independent Code, then open the newly-created object file and load those wrapper functions. You’ll have an instantiation of bitset for the exact N that you need at the time. Don’t forget to delete the temporary files. (Yes, this can be, and has been, done.)

Oh yeah – feel the love.

Brought to you by the stl-is-often-worse-for-you-than-meth dept.