Cursor states

Following on from my post yesterday on the various states of a Storage Engine, I said I’d have a go with the Cursor object too. A Cursor is used by the Drizzle kernel to get and set data in a table. There can be more than one cursor open at once, and more than one per thread. If your engine cannot cope with this, it is its responsibility to figure it out and return the appropriate errors.

Let’s look at a really simple operation, inserting a couple of rows and then reading them back via a full table scan.

Now, this graph is slightly incomplete as there is no doEndTableScan() call. But you can see in which order things are meant to happen. In this case, “store_lock()” means that store_lock() has been called, so when coming back from doInsertRecord() we do not call store_lock() again, rather, we’re just in a state where it has already been executed.

For MySQL handler, think ::write_row() for doInsertRecord() and ::rnd_init() for doStartTableScan().

This diagram was again auto-generated from my test engine.

Rackspace Rookie-O (in Hong Kong!)

I’d meant to finish writing this way back in July… but I failed at that. Now is a good time to talk about Rookie-O as my again new colleague Andrew Hutchings (Buy his and Sergei’s book on MySQL 5.1 Plugin Development!) just went through the same thing (but in London instead of Hong Kong) given by the same trainer (Hi Eddie!).

Rackspace is the second employer I’ve had that has some kind of new hire training (the first being Sun). I am, of course, not quite counting Salmiakki as new-hire training for MySQL (although I probably should). To quote from the Wikipedia article: “Although the rumor of the heart attack was a hoax, the drink may still cause harm. The strong flavor almost completely masks the presence of ethanol, and the drinker may not realize he is consuming a drink almost 40% alcohol by volume (80-proof), leading to possible alcohol poisoning.” A promising introduction to the company.

Monty, MÃ¥rten and Kaj with Salmiakki singing Helan GÃ¥r at the MySQL User Conference Japan in 2007

Monty, MÃ¥rten and Kaj with Salmiakki singing Helan GÃ¥r at the MySQL User Conference Japan in 2007

I could possibly say something about the Sun New-Hire training… but I’m just trying to find something positive to say – and I can’t. I got a bit of hacking done? Seriously.

Actually coordinating a time to attend a Rookie-O (Rookie Orientation, the Rackspace name for new hire training) was rather tricky. There was one right before the MySQL User Conference back in April (not the best of timing), one during an upcoming team meeting (again, not ideal) and one that got organised in the middle of everything for the office in Hong Kong. So, I headed to Hong Kong.

Hong Kong streetlife

The Hong Kong office is relatively new (late 2008) and there were people there who hadn’t gone through the standard Rackspace Rookie-O (Orientation).

Rackers walking Hong Kong at Night

It was rather cool to hang out with other people who worked for the company – and in totally different areas than I do. I did get a better understanding for how the rest of the company operates and the people involved. The training itself was useful and substantially less geared towards not-my-job than Sun’s was.

The good news is that Andrew thought it was useful too. Pretty impressed so far.

Storage Engine API state graph

Drizzle still has a number of quirks inherited from the MySQL Storage Engine API (e.g. BLOBs, row buffer, CREATE SELECT and lack of DDL transaction boundaries, key tuple format). One of the things we fixed a long time ago was to have proper methods for StorageEngines to be called for: startTransaction, startStatement, endStatement, commit and rollback.

If you’ve had to implement a transactional storage engine in MySQL you will be well aware of the pattern of “in every Storage Engine/handler call: if transaction doesn’t exist, begin.” We’ve tried to fix this in the Drizzle API for a number of reasons. I think having this obvious set of calls will make the API a lot easier to understand. I am also very interested in making things much easier to prove correct.

A while ago I spotted Bug 587772, which was the READ COMMITTED isolation level not working correctly with InnoDB. It turns out that the most basic example for READ COMMITTED failed. Hrrm… this is no good. It worked on MySQL, so this was certainly something that we broke. What was more worrying is that there wasn’t a test for this in the test suite (and at the time I couldn’t find one in the MySQL test suite either, so I think we inherited the missing test).

I recently started delving in, actually going to solve this. I noticed something worrying, endStatement wasn’t being called, which is where the innobase plugin would release the read view that it used for the statement. You’d think that it would grab a new one on startStatement, but because of the previous design of the API (remember “if txn isn’t started, start it!”) this also happened for getting the read view for the statement… so we instead got a REPEATABLE READ isolation level.

I wanted a test.

Previously, I’ve created a dummy storage engine (tableprototester) and used it to test the server code for reading the table protobuf message. I thought about doing a Storage Engine for this problem too, basically looking at the calls to the Storage Engine as transitions between states in a state machine.

A basic view of a transaction could be:

State transitions for a transaction. Transaction can be empty OR have one or more statementsThat is, a transaction starts and has zero or more statements before it commits or gets rolled back.

By coding up a data structure of allowable state transitions, a small function to assert() on invalid transitions and enough of the boilerplate to make the engine “work”, I was able to hit an assert() exactly where I’d expected it: at an invalid transition from START STATEMENT to COMMIT.

To fix the initial bug (READ COMMITTED not working), I filled in a few state transitions for the system as a whole that aren’t quite correct. From the diagram below, you can quite obviously see where the obvious bugs are (it helps that I’ve coloured them red):

There is absolutely no sense in going BEGIN -> END STATEMENT or immediately to COMMIT. These should be relatively easy to solve too, but are separate bugs.

I wish to expand this in the future to cover Cursor as well. It will also be useful to ensure that DDL can be wrapped in transactions. Not to mention the last few HTON flags that exist (and should likely go away).

To generate the diagrams, I just wrote a little utility to dump out the state transitions in dot, using it to generate the diagrams.

Video editing with Free Software

Way back when, for linux.conf.au coming to Melbourne in 2008, I edited together a promo video for it. IIRC the raw video was shot by Kelly on DV tape, imported in and I got a CD of some massive 400MB MPEG file of a bunch of questions. Using Cinelerra and some graphics package that I forget (very early Inkscape?), I managed to get this done in 2006. I understand things are a bit less segfaulty these days.

See it on YouTube or download the Ogg Theora video.

Amazingly enough, this is the last time I actually did any video editing.

You should also go to linux.conf.au 2011 in Brisbane this upcoming January.

Pogoplug as a NAS

A while ago (April) I bought a Pogoplug with the explicit idea of using it as a NAS device. I finally bought a new 2TB drive and plugged in the pogoplug. I pretty much instantly realised I was going to run Debian on it instead, if only because that’s what makes me comfortable: Debian, ssh and XFS.

Installing Debian was easy (google it) and incredible props for not only an attractive looking device, but an easily hackable one (the default software is probably quite good for non-experts… I just happen to want Debian).

I have to say, I’m so far rather happy.

Ubuntu 10.10 biggest mistake: shotwell

This is meant to replace f-spot.

It just isn’t ready.

I do not have what I would consider a large photo collection. It’s about 77GB on disk, maybe 30,000 images.

Importing from f-spot is horrendously slow for what is essentially a few INSERT..SELECT statements. It does not copy your photos anywhere, yet takes about that long.

It eats memory for breakfast. No, really. I bought a digital camera around New Year 2003. In just importing the photos from 2003… It’s now using 800MB of memory… sorry, 900MB now. At final count, at the end of 2003, it’s up to 1.6GB of memory used and an additional 300MB of disk space in ~/.shotwell/thumbs. How on earth is it going to cope when it gets to where I really start shooting? Now, the Shotwell website does state that there was a memory leaking bug that is now fixed in trunk. Note where it isn’t fixed – in Ubuntu.

Ubuntu 10.10 currently ships with an unusable photo manager.

f-spot is nowhere near perfect. Relegating it to universe instead of main (i.e. it’s now “not maintained by Canonical”) is just stupid.

Meanwhile, I still love darktable – it’s simply wonderful.

HailDB being built by default in Drizzle

It just it trunk – if you have HailDB installed when you build Drizzle, you will now get the HailDB plugin built. You can even run Drizzle with it (remove innobase plugin, load HailDB plugin). Previously, we had problems building both due to symbol conflicts between innobase and HailDB. We’ve fixed this thanks to the linker.

So, enjoy HailDB… welll, test it and report bugs that I can fix :)

New APIs in HailDB

In the current HailDB we have a couple of new API calls that you may like:

  • ib_status_get_all()
    Is very similar to ib_cfg_get_all(). This allows the library to add new status variables without applications having to know about them – because we return a list of what there are. For Drizzle, this means that the DATA_DICTIONARY.HAILDB_STATUS table will automatically have any new status variables we add to HailDB without a single extra line of code having to be written.
  • ib_set_panic_handler()
    Having a shared library call exit() is generally considered impolite. Previously, if HailDB hit corruption (or some other nasty conditions), it could call exit() and you’d never get a chance to display a sensible error message to your user (especially bad in a GUI app where the printed to console error message would be unseen). This call allows an application to specify a callback in the case of HailDB entering such a condition. We’ll still be unable to continue (and we strongly advise that you do in fact exit the process in your callback) but you’re at least now able to (for example) pop up a dialog box saying sorry.
  • ib_trx_set_client_data()
    This call lets you associate a void* with a transaction. HailDB keeps this pointer in its transaction data structure and in some callbacks (e.g. ib_set_trx_is_interrupted_handler(), see below) will pass this pointer back to you for you to use to help make a decision. In InnoDB in MySQL, this is the THD. In Drizzle, it’s the Session.
  • ib_set_trx_is_interrupted_handler()
    In various wait conditions (e.g. waiting for a row lock), HailDB will call the callback you set with this function with the client data (set with ib_trx_set_client_data()) to work out if the transaction has been cancelled. This enables an application to implement something like the MySQL/Drizzle KILL command to cancel a transaction in another thread.
  • ib_get_duplicate_key()
    If you just got a duplicate key error, this function will tell you what key it was. This allows you to implement a nicer error message.
  • ib_get_table_statistics()
    This function gives you access to some basic table statistics that HailDB maintains. This includes an approximate row count, clustered index size, total of secondary indexes as well as a “modified counter” which can give you a rough idea about how out of date these statistics are.

All of these are new to HailDB (and weren’t available in embedded_innodb), many in the new 2.3 development release. You can see usage examples both in the HailDB test suite and (for most of them) in the Drizzle HailDB Storage Engine.

Second Drizzle Beta (and InnoDB update)

We just released the latest Drizzle tarball (2010-10-11 milestone). There are a whole bunch of bug fixes, but there are two things that are interesting from a storage engine point of view:

  • The Innobase plugin is now based on innodb_plugin 1.0.6
  • The embedded_innodb engine is now named HailDB and requires HailDB, it can no longer be built with embedded_innodb.

Those of you following Drizzle fairly closely have probably noticed that we’ve lagged behind in InnoDB versions. I’m actively working on fixing that – both for the innobase plugin and for the HailDB library.

If building the HailDB plugin (which is planned to replace the innobase plugin), you’ll need the latest HailDB release (which as of writing is 2.3.1). We’re making good additions to the HailDB API to enable the storage engine to have the same features as the Innobase plugin.