InnoDB and memcached

I had a quick look at the source tree that’s up as a tarball on labs.mysql.com for the memcached interface to InnoDB (I haven’t compiled it, just read the source – that’s what I do. I challenge any C/C++ compiler to keep up with my brain!). A few quick thoughts:

  • Where’s the Bazaar tree on launchpad? I hate pulling tarballs, following the dev tree is much more interesting from a tech perspective (especially for early development releases). I note that the NDB memcached stuff is up on launchpad now, so yay there. I would love it if the InnoDB team in general was much more open with development, especially with having source trees up on launchpad.
  • It embeds a copy of the memcached server (the engines branch) into the MySQL tree. This is probably the correct way to go: there is no real sense in re-implementing the protocol and network stack (that’s about half of what memcached is anyway).
  • The copy of the memcached engine branch seems to be a few months old.
  • The current documentation appears to be the source code.
  • The innodb_memcached plugin embeds a memcached server using an API to InnoDB inside the MySQL server process (basically so it can access the same instance of InnoDB as a running MySQL server).
  • There’s a bit of something that looks kind of similar to the Embedded InnoDB (now HailDB) API being used to link InnoDB and memcached together. I can understand why they didn’t go through the MySQL handler interface… getting that correct would be bracing, to say the least. Going through InnoDB APIs directly is much more likely to result in fewer bugs.
  • If this accepted JSON and spat it back out… how fast would MongoDB die? weeks? months?
  • The above dot point would be a lot more likely if adding a column to an InnoDB table didn’t involve epic amounts of IO.
  • I’ve been wanting a good memcached protocol inside Drizzle, but we have, of course, focused on stability of what we do have first. That being said… upgrade my flight home so I can open a laptop… it’d probably be done well before I land… (assuming I don’t get to it in the 15 other awesome things I want to hack on this week)

Drizzle online backup with xtrabackup

For backups, historically in the MySQL world you’ve had mysqldump (a SQL dump, means on restore you have to rebuild indexes), InnoDB Hot Backup (proprietary, but takes a copy of the InnoDB data files, so restore is much quicker), LVM snapshots (various scripts exist, does have larger IO impact, requires LVM) and more recently xtrabackup. Xtrabackup essentially does the same thing as InnoDB hot backup except that it’s free and open source software.

Many people have been using xtrabackup successfully for quite a while now.

In Drizzle7, our default storage engine is InnoDB. There have been a few changes, but it is totally InnoDB. This leaves us with the question of backup solutions. We have drizzledump (the Drizzle equivalent of mysqldump, although with fewer gotchas), you could always use LVM snapshots, and the probability of Oracle releasing InnoDB Hot Backup for Drizzle is rather minimal.

So enter xtrabackup as a possible solution… I had thought of porting xtrabackup across for a while. Last weekend, while waiting for one of my iterations of catalog support to compile, I decided to give it a go. I also wanted to see how far I could get with it in that weekend.

I was successful – there’s a tree up at lp:~stewart/drizzle/xtrabackup that produces an xtrabackup binary built for Drizzle (it’s not quite ready for merging yet, there are some obvious bugs around command line option parsing… but a backup and restore did work).

I wanted the following:

  • build to be integrated with Drizzle, using the same innobase build that we use to build the server
  • build with strict compiler warnings and -Werror (which we do for Drizzle)
  • build with a C++ compiler (as we do with innobase in Drizzle)
  • not re-add parts of mysys into the Drizzle build just for xtrabackup

I’ve already submitted merge requests to upstream xtrabackup containing the compiler fixes and added compiler warnings (they’ve also by now been merged into xtrabackup). Already my work has improved the quality of xtrabackup for everyone. Some of the warnings were fixed slightly differently in xtrabackup than in my Drizzle tree, but I plan to merge.

One issue was that the command line parsing library that xtrabackup uses – my_getopt, which is part of mysys (the portability library inside MySQL) – is long since gone from Drizzle. We currently use Boost::program_options. Thanks to the heroic efforts of Andrew Hutchings, xtrabackup in Drizzle is also using boost::program_options. This was a brilliant “hey, can you have a look at this conversion” followed by handing him a tree that did not even remotely compile, followed by an “I have to take the kids somewhere, here’s a tree – it may compile”. Amazingly enough, it pretty much did compile once I fixed the other issues.
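For a taste of what the conversion looks like, here’s a minimal boost::program_options sketch (the option names are just examples, not xtrabackup’s real option table):

#include <boost/program_options.hpp>
#include <iostream>
#include <string>

namespace po= boost::program_options;

int main(int argc, char **argv)
{
  po::options_description opts("xtrabackup options");
  opts.add_options()
    ("help", "print this help")
    ("backup", po::bool_switch(), "take a backup")
    ("target-dir", po::value<std::string>()->default_value("./backup"),
     "destination directory");

  po::variables_map vm;
  po::store(po::parse_command_line(argc, argv, opts), vm);
  po::notify(vm);

  if (vm.count("help"))
    std::cout << opts << std::endl;
  return 0;
}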

An unresolved issue is how to deal with this going forward – my guess is that upstream xtrabackup doesn’t want to require Boost.

One solution could be just to factor out the command line options into a separate file that we can ignore for Drizzle and replace with our own. The other option could be to use a different command line option parsing library (perhaps from CCAN, as it’s then maintained by somebody else and doesn’t require heaps and heaps of other stuff).

Another issue I had to tackle is the patch to innobase that’s required to build xtrabackup.

I took a very minimal approach for the Drizzle patch. We are currently based on innobase 1.1.4 from MySQL 5.5 – so I mostly looked at the xtradb55 patch. I think it would be great if this were, instead of one giant patch, a series of patches to apply (à la quilt) to a) make it easier to maintain and b) make it easier for myself to work out the exact reasoning of each bit (also, generating the patches with -p would help a fair bit too).

So how did I do it?

Step 0
was removing support for old innobase – we totally don’t need it for Drizzle.

Step 1
was creating a srv_read_only option for Drizzle’s innobase. This was fairly easy. The one thing I did have to change was adding a check in os_file_lock() so that we don’t attempt to write lock the ibdata files when in read only mode (otherwise backups can’t be taken while drizzled is running). I’m a little surprised that this wasn’t hit in 5.5 at all.
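The shape of that check is something like this (a minimal sketch, not the actual InnoDB code):

#include <fcntl.h>
#include <unistd.h>

bool srv_read_only= false;  // the new setting from Step 1

static int os_file_lock(int fd)
{
  if (srv_read_only)
    return 0;  // don't even attempt the write lock; locks are advisory,
               // so a running drizzled holding one doesn't stop our reads

  struct flock lk;
  lk.l_type= F_WRLCK;
  lk.l_whence= SEEK_SET;
  lk.l_start= 0;
  lk.l_len= 0;  // lock the whole file
  return fcntl(fd, F_SETLK, &lk);
}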

Step 2
was implementing srv_fake_write. I’m pretty sure I’ve gotten this right in the Drizzle implementation, but the patch wasn’t as easy to read as I’d really like. I probably need to do a bit more of a code audit to confirm that this is actually correct (I may try and come up with an LD_PRELOAD library that will scream loudly if writes are made to files matching a pattern).
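That LD_PRELOAD library would look something like this (hypothetical – it doesn’t exist yet): intercept pwrite() and abort loudly if anything writes to a file whose path matches a watched pattern.

#define _GNU_SOURCE 1
#include <dlfcn.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

extern "C" ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset)
{
  // resolve the fd back to a path via /proc
  char link[64], path[4096];
  snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
  ssize_t len= readlink(link, path, sizeof(path) - 1);
  if (len > 0)
  {
    path[len]= '\0';
    const char *pattern= getenv("SCREAM_PATTERN");  // e.g. "ibdata"
    if (pattern && strstr(path, pattern))
    {
      fprintf(stderr, "AIEEE! pwrite() to %s during fake write mode!\n", path);
      abort();
    }
  }
  // hand off to the real pwrite
  typedef ssize_t (*pwrite_fn)(int, const void *, size_t, off_t);
  pwrite_fn real_pwrite= (pwrite_fn) dlsym(RTLD_NEXT, "pwrite");
  return real_pwrite(fd, buf, count, offset);
}

Build it with g++ -shared -fPIC -ldl and LD_PRELOAD it into the process under test (you’d want write() and pwritev() wrapped the same way).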

Step 3
was implementing srv_apply_log_only. Pretty sure I have this right, again, more testing will be required. Why? Because I’m that paranoid about getting things very, very right.

Step 4
was to go through all the functions that xtrabackup needed to not be static. Instead of having prototypes for them in xtrabackup.cc, I added an xtrabackup_api.h header to innobase and included it where needed (including in xtrabackup). I’d recommend this way going forward for xtrabackup too, as it could be a lot less problematic to maintain (and makes the xtrabackup source a bit easier to read).
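To give a flavour of it, xtrabackup_api.h looks roughly like this (the function names here are illustrative, not the actual list):

#ifndef XTRABACKUP_API_H
#define XTRABACKUP_API_H

// functions that used to be static inside their innobase source files
// now get one real set of prototypes, included by innobase itself and
// by xtrabackup:
int recv_scan_log_for_backup(unsigned long long start_lsn);
int fil_load_tablespaces_for_backup(void);

#endif /* XTRABACKUP_API_H */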

Step 5
was fixing up a few skeleton functions that were needed to make our innobase happy. It may not be a bad idea to split out the skeleton functions into a separate source file so it’s a bit easier to track (and some #ifdefs around those not needed for certain releases).
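Something like this is what I mean (names invented for illustration):

// innobase_stubs.cc: every do-nothing function that innobase expects
// the server to provide, gathered in one place.
extern "C" void thd_report_progress_stub() { }  // backup never reports to a session

#ifdef INNOBASE_NEEDS_LEGACY_STUBS
extern "C" void legacy_query_cache_stub() { }   // only older releases want this
#endif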

I’m hoping to work with the upstream xtrabackup devs on the various points I’ve made above.
Another thought of mine is to port xtrabackup into HailDB where we can use much more neat API functions to create good tests for xtrabackup.

Thanks go out to all who’ve worked on xtrabackup. It honestly wasn’t too hard getting it ported across to Drizzle – and with a bit of collaboration I think we can make it easy to keep up to date.

What’s the future for Xtrabackup in Drizzle? It’ll likely end up being a binary named drizzlebackup-innobase or similar (this means that there is a clear difference between xtrabackup for MySQL and what we have in Drizzle – which is more accurately defined as based on xtrabackup). We’ll also probably want a nice wrapper or integration with a backup tool to deal with everything Drizzle related. We shall also introduce a lot of testing; backups are important.

Xtrabackup is topical, check out the latest OurSQL podcast and the Percona Xtrabackup website for more info!

Multi-tenancy Drizzle

My previous post focused on some of the problems of doing multi-tenant MySQL.

One of the reasons why I started hacking on Drizzle was that the multi-tenancy options for MySQL just weren’t very good (this is also the reason why I run my blog in a VM and not a shared hosting solution).

What you really want is to be able to give your users access to a virtual database server. What you don’t want is to be administering a separate database server for each of your users. What you want are CATALOGs.

A CATALOG is a collection of SCHEMAs (which have TABLEs in them). Each CATALOG is isolated from all the others. Once you connect to a catalog, that’s it. They are entirely separate units. There are no cross-catalog queries or CHANGE CATALOG commands. It is as if each catalog is its own database server.

You can easily imagine a world where there are per-catalog resource limits too (max connections, max temp tables etc).

For the Drizzle7 release, we got in some preliminary support to ensure that the upgrade path would be easy. You’ll notice that all the schemas you create are in the ‘local’ catalog (you can even spot this on the file system in the datadir).

For the next Drizzle milestone, the intent has been to flesh out this support to enable a very elegant multi-tenant solution for Drizzle.

One of the things we worked on a little while ago now is introducing TableIdentifier and SchemaIdentifier objects into Drizzle. Historically (and still in the MySQL codebase) tables would be referenced by a string in the format “database/table_name” (except sometimes when it could be “database\table_name”). There were also various times when this was the name as entered by the user and times when this was converted into a safe form for storing on disk (and comparing to one another).

Everywhere in Drizzle where we have to deal with the path to a table, we call a method on a TableIdentifier for it. There is also a method for getting a series of bytes that can be used as a key in a data structure (e.g. table definition cache). It’s even used to work out what path on the file system to store the data file in.
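As a hand-wavy sketch, the idea looks like this (not the real Drizzle class, just its shape):

#include <string>

class TableIdentifier
{
  std::string schema_name;
  std::string table_name;
public:
  TableIdentifier(const std::string &schema, const std::string &table)
    : schema_name(schema), table_name(table) {}

  // the filesystem path for the table's data (the real code also
  // converts names into a filesystem-safe encoding)
  std::string getPath() const
  {
    return schema_name + "/" + table_name;
  }

  // bytes usable as a key in e.g. the table definition cache
  std::string getKeyBytes() const
  {
    std::string key(schema_name);
    key.push_back('\0');  // a separator that can't appear in a name
    key+= table_name;
    return key;
  }
};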

I would say that the use of TableIdentifier and SchemaIdentifier has prevented many bugs from creeping into the server during development. It took several aborted goes to get something like this into the codebase, just because of how the database name, table name and table path strings were being used and abused around the server.

So, with our cleaned up codebase using TableIdentifier and SchemaIdentifier, how hard would it really be to introduce the new level of CATALOG?

So along comes an epic hacking session last weekend. Basically I wanted to see how close I could get in such a short time. I introduced a CatalogIdentifier and then forced the SchemaIdentifier constructor to require one as a parameter….

The great benefit? Every place in the code that I needed to change was a compile error. The overwhelming majority of places I had to fix were shown to me by GCC. I cannot express how wonderful this is.
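The trick, roughly (hypothetical signatures, not the actual Drizzle declarations):

#include <string>

class CatalogIdentifier
{
  std::string name;
public:
  explicit CatalogIdentifier(const std::string &name_arg) : name(name_arg) {}
  const std::string &getName() const { return name; }
};

class SchemaIdentifier
{
  CatalogIdentifier catalog;
  std::string schema;
public:
  // the only constructor: you must say which catalog the schema lives in
  SchemaIdentifier(const CatalogIdentifier &catalog_arg,
                   const std::string &schema_arg)
    : catalog(catalog_arg), schema(schema_arg) {}
};

// old code: SchemaIdentifier("test")                             -> compile error
// new code: SchemaIdentifier(CatalogIdentifier("local"), "test") -> fine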

Anyway, by going through all of these places and fixing up a few things that also needed fixing (including just throwing in a few temporary hacks), I got it working!

I have a tree on launchpad (lp:~stewart/drizzle/multitenant) where you can create multiple catalogs, connect to any of the catalogs you’ve created (or local), create schemas and tables in them. Each catalog is separate. So you can have 1 server with 2 schemas called ‘test’ and 2 tables called ‘t1’.

This is a pretty early development tree… and it comes with some limitations (that will need to be fixed):

  • “local” is the default catalog (as it is in Drizzle7)
  • “local” must exist
  • I_S and DATA_DICTIONARY only show up in local (i.e. SHOW TABLES doesn’t even work yet)
  • HailDB is “local” only
  • testing equals approximately zero
  • currently the only protocol plugin that supports connecting to another catalog is the console plugin

But, as a result of only a weekend of hacking, pretty cool.

Paving the way for easy Database-as-a-Service.

The problems with multi-tenant MySQL

Just about every web host in the world gives people “mysql databases”. Usually one. Why? Because that’s how permissions work in MySQL and it’s a relatively simple way to set it up (each of your web host clients is a user with access to their one database).

This has a lot of limitations for the end user:

  • You only get one user (that can do anything). This means you can forget enhancing the security of your web app by not letting the front end DROP TABLE (or DROP SCHEMA). This is probably one of the main reasons that so few web apps use any of the access controls available in MySQL.
  • You only get one database. Typically this means you get to run one application. Or, you start to get applications that allow you to specify a table name prefix as a lovely hack around this kind of situation (ick).
  • If you get more than one database, it’s going to be all prefixed with your username – and it’s going to be via a web UI to create/drop them – not via the normal way.
  • You’ll never get replication.
  • If you need to scale, it means migrating hosting solutions; there isn’t an easy-to-use “now I need read-only slaves” button.
  • Backups are only ever going to be SQL dumps from the master (if you don’t completely trust your hosting provider)
  • If your provider does do replication, one other bad user in the system could introduce EPIC amounts of replication latency.

One solution is to give everybody their own MySQL instance. Ick. Why ick? Well… you now have a lot more MySQL servers to administer. That’s not the problem however: you’ve just screwed yourself on IOPs – each mysqld gets to compete to sync its data to disk.

The next step people go to is running MySQL inside a virtual machine. You are again screwing yourself on IOPs. Every VM on the box can now fight each other for a limited number of sync operations to make the data safe on disk. Forget if group_commit works for your MySQL version, having many VMs running MySQL on the same physical box will screw you much more than lack of group_commit ever will. You can probably kiss consistent performance and latency goodbye too (this will largely depend on how VMs are being run by your hosting provider).

The best way to get screwed is to get “free” extra CPU cycles and IOPs that are excess on the physical machine and then to suddenly switch to not getting any “free” ones and instead only the ones you pay for… wonder why your site is suddenly slow to respond when the number of visitors is the same and you’ve changed NOTHING?

Even running MySQL inside a VM that is the only VM on the box has a performance impact. You want to be using each physical machine to its fullest. If you’ve got a bunch of MySQL servers running inside VMs – you are not doing that.

(you can substitute just about any other database server for “MySQL” in all of the above… interestingly enough I have been told that a certain proprietary database server has a very low performance drop when run inside a VM)

Drizzle7

We’ve released Drizzle7! Not only that, we’re now calling it Generally Available – a GA release.

What does this mean? What does this GA label mean?

You could view the GA label as meaning “we’re pretty confident people aren’t going to en masse ask for our heads when they start using it”… which isn’t too bad a description. We also plan to maintain it: there could be future releases in this series that just include bug fixes – we won’t just immediately tell you to go and use the latest tarball or bzr tree. This release series is a good one to use.

Drizzle7 is something that can be packaged in Linux distros. It’s no longer something where the best bet is to add the PPA and upgrade every two weeks or build from source yourself. If you’re looking to deploy Drizzle (or develop against it) – you can rely on this release.

I’ll never use the words “production ready” to describe a release – it’s never up to me. It’s up to each person or organisation looking to deploy a piece of software to decide if that bit of software is production ready for them.

Personally, I’m looking forward to seeing how people can break it. While Drizzle is the best tested FOSS SQL RDBMS server, I’m sure there are new and interesting ways it can be broken now that we’re saying we’re ready for a much larger crowd to hammer on it.

Overall, I think we’ve managed to take the now defunct MySQL 6.0 tree (way back in 2008) and release something that can truly live up to the line “database for cloud”. Drizzle is modern, modular, rather solid and understandable. The future is bright, there is so much more to do to make the ultimate database for cloud. Drizzle7 is a great platform to build on – both for us (developers) and us (people who use relational databases).

Fixed in Drizzle: No more “GOTCHA’s”


O'Reilly MySQL Conference & Expo 2011

At the upcoming MySQL Conference and Expo, I’m going to give a Thursday afternoon (2pm) session entitled Fixed in Drizzle: No more “GOTCHA’s”. I plan to have a lot of fun with this session.

If you go back to the very start of when I started submitting code to Drizzle (June 2008) – I was going and fixing some of my favourite “gotcha’s” inside the code: BUILD/ scripts that didn’t build the way releases would, wrappers on POSIX functions with different (and inconsistent) semantics, NETWARE support, a non-thread-safe client lib, my_errno (different from errno), etc. I won’t really be talking about internals like this – it may give me a happy but really isn’t the latest awesome in databases.

I’ll instead be going over the way more awesome user- and DBA-visible things we’ve fixed/added/removed from Drizzle that make it a database with as few GOTCHA’s as possible.

Authentication (pluggable, LDAP), Logging (to syslog, gearman), DATA_DICTIONARY, INFORMATION_SCHEMA, engines owning their own metadata, STRICT mode by default, removing global mutexes, improving the Storage Engine API, improving the replication log, including code such as PBXT and PBMS Blob Streaming, filesystem engine (read files from disk like a table), pluggable protocol, UTF8 by default, ENUM data type, auto_increment behaviour.

All this and more is “Fixed in Drizzle”.

(oh, and there’s no 24-bit integer or a BLOB that can only be 255 bytes)

Undocumented ALTER TABLE that does *nothing* (useful)

(at least since MySQL 5.1.42)

alter table t1 force;

Pretty neat huh? In fact, in Drizzle this will end up doing a copying alter table. Not useful.

There’s an over four year old bug report in MySQL (Bug#24091).

I’m just going to remove that bit from the parser in Drizzle – it makes no sense.

SQL Oddity: ALTER TABLE and default values

So, the MySQL (and Drizzle) ALTER TABLE syntax allows you to easily change the default value of a column. For example:

CREATE TABLE t1 (answer int);
ALTER TABLE t1 ALTER answer SET DEFAULT 42;

So, say you created a TIMESTAMP column and forgot to set the default value to CURRENT_TIMESTAMP. Easy, just ALTER TABLE:

create table t1 (a timestamp);
alter table t1 alter a set default CURRENT_TIMESTAMP;

(This is left as another exercise for the reader as to what this will do – again, maybe not what you expect)

ALTER TABLE RENAME RENAME RENAME

Here’s a nice challenge for you. What does the following do (or error out on)?

CREATE TABLE t1 (a int);
CREATE TABLE t2 (b int);
ALTER TABLE t1 RENAME t3, RENAME t2, RENAME t4;

I’d be interested to know what a) you think it does and then b) if you were surprised when you went and typed it into your RDBMS of choice.

Timing queries in the 21st century (with LD_PRELOAD and sed)

So… Baron blogged about wanting higher precision timers from the mysql binary and that running sed on the binary wasn’t cutting it. However… I am not one to give up that easily!

This is what LD_PRELOAD was made for! Evil nasty hacks to make your life easier!

By looking at the mysql.cc source code, I can easily work out how this works… I just have to override two calls: sysconf() (to fake how many ticks per second there are) and times() (to return a much higher precision number).
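A minimal sketch of those two overrides (the real version is in the junkcode link below):

#define _GNU_SOURCE 1
#include <dlfcn.h>
#include <unistd.h>
#include <time.h>
#include <sys/times.h>

// __THROW matches glibc's own declarations so g++ accepts the override
extern "C" long sysconf(int name) __THROW
{
  if (name == _SC_CLK_TCK)
    return 1000000L;  // lie: a "tick" is now a microsecond
  long (*real_sysconf)(int)= (long (*)(int)) dlsym(RTLD_NEXT, "sysconf");
  return real_sysconf(name);
}

extern "C" clock_t times(struct tms *buf) __THROW
{
  (void) buf;  // mysql only uses the return value for its timer
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (clock_t)(ts.tv_sec * 1000000LL + ts.tv_nsec / 1000);
}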

Combined with the sed hack on the binary to change the sprintf call to print out the higher precision number, we have:

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|   710720 |
+----------+
1 row in set (1.080110 sec)

Get it from my junkcode: http://www.flamingspork.com/junkcode/high_precision_mysql_timer.c

(or you can get it from http://paste.drizzle.org/show/364/)

Implicit COMMIT considered harmful.

If you execute the following, what does your RDBMS do?

CREATE TABLE t1 (a int);
START TRANSACTION;
INSERT INTO t1 (a) VALUES (1);
START TRANSACTION;
INSERT INTO t1 (a) VALUES (2);
ROLLBACK;
SELECT * FROM t1;

The answer may surprise you.

No implicit commit (on the road to transactional DDL)

A long time ago, in a time that can only serve to make some feel old and others older, MySQL didn’t support transactions. Each statement was executed as it went, there was no ROLLBACK (or COMMIT or crash recovery etc). Then there were transactions. Other RDBMSs implement auto_commit functionality, but for MySQL users, we think of it as the magic compatibility mode that (mostly) makes applications written for MyISAM magically work on InnoDB (okay, and making “you should use transactions” a really easy consulting gig :)

I’m currently working on finishing up a patch that removes the implicit COMMIT from DDL operations in Drizzle. Instead, you get an error message saying that transactional DDL is not currently supported. I see a future where we have one of two situations (possibly depending on the storage engine): support for DDL within normal transactions, or DDL-only transactions (which cannot mix with DML). The latter (DDL-only transactions) I see as the option for InnoDB/HailDB.

Is your Storage Engine buggy or the database server?

If your storage engine returns an error from rnd_init (or doStartTableScan as it’s named in Drizzle) and does not save this error and return it in any subsequent calls to rnd_next, your engine is buggy. Namely it is buggy in that a) an error may not be reported back to the user and b) everything may explode horribly when rnd_next is called after rnd_init returned an error.

Unless it is running on MariaDB 5.2 or (soon, when the patch hits the tree) Drizzle.
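If your engine has to survive on servers without the fix, the usual defensive pattern looks something like this (illustrative, Drizzle-style names, not a real Cursor subclass):

class DefensiveCursor
{
  int scan_error;  // saved result of starting the scan
public:
  DefensiveCursor() : scan_error(0) {}

  int doStartTableScan(bool scan)
  {
    scan_error= engine_begin_scan(scan);  // hypothetical engine internals
    return scan_error;
  }

  int rnd_next(unsigned char *buf)
  {
    if (scan_error != 0)
      return scan_error;  // keep reporting the failure instead of exploding
    return engine_read_next(buf);  // hypothetical engine internals
  }

private:
  // stand-ins for the real engine:
  int engine_begin_scan(bool) { return 0; }
  int engine_read_next(unsigned char *) { return 0; }
};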

Monty (Widenius, not Taylor) wrote a patch for MariaDB based on my bug report that addressed that problem. It uses the compiler feature that warns when the result of a function isn’t checked, to make sure that all places that call rnd_init are checking for an error from the engine.
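In GCC terms, that’s:

int doStartTableScan(bool scan) __attribute__((warn_unused_result));

Any caller that ignores the return value now gets a warning, and building with -Werror turns that into a broken build.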

Today I (finally) pulled that into Drizzle as well.

So… if your engine does the logical thing and goes “oh look, this method returns an error… I’ll return my error” it will exhibit bugs in MySQL but not MariaDB 5.2 or Drizzle (when the patch hits).

Which is buggy, the server or the engine?

The MySQL bug number is 54166, filed in June 2010.

MySQL 5.5 is GA and 5.5.8 missing from launchpad…

While it’s great that MySQL 5.5 is GA with the 5.5.8 release (you can download it here), I’m rather disappointed that the bzr repositories on launchpad aren’t being kept up to date. At the time of writing, it looked like this:

Yep – nothing for five weeks in the 5.5 repo – nothing since the 5.5.7 release :(

There haven’t been zero changes either – the changelog has a decent number of fixes.

Persistent index statistics for InnoDB

In browsing the BZR tree for lp:mysql-server, I noticed some rather exciting code had been merged into the Innobase code.

You may be aware that InnoDB will do some index dives when opening a table to get some statistics about the indexes that can help the optimiser make good query plans.

The problem is that this means many disk seeks: on server restart, you have to spend a whole bunch of time seeking around the disk reading index pages.

Not any more.

There is now code merged in to store the calculated statistics in a table inside InnoDB so that these index dives don’t have to happen on startup.

Originally, this looked like it was going to make it into InnoDB+. The good news is that it’s now in a public source tree. I look forward to when it hits a stable release.

(hopefully somebody other than me can beat me to it and write a nice description of the algorithms involved… the code is pretty easy to follow, so it shouldn’t be hard)

Replication log inside InnoDB

The MySQL replication system has always had the replication log (“binlog”) as a separate set of files on disk. Originally, this really didn’t matter as, well, MyISAM wasn’t transactional or crash safe so the binlog didn’t need to be either. If you crashed on a busy write workload, your replication was just going to be hosed anyway.

So then came a time where everybody used InnoDB. Transactional, crash-safe and all the goodies. Then, a bit later, came storing the master replication log position in the InnoDB log, and XA between InnoDB and the binlog. So, a rather long time after MySQL first had replication, you could pull the power cord on the master with a decent amount of certainty that things would be okay when you turned it on again.

I am, of course, totally ignoring the slave state and if it’s safe to do that on slaves.

Using XA to make the binlog and InnoDB consistent does have a cost. That cost is fsync()s. You have to do a lot more of them (it’s two phase commit: prepare in the InnoDB log, sync the binlog, then commit).

As you may be aware, at a (much) earlier point in Drizzle we completely ripped out the replication code. Why? A lot of it was very much still geared to support statement based replication – something we certainly didn’t want to support. We also did not really want to keep the legacy binlog format. We wanted it to be very, very pluggable.

So the initial implementation is a transaction log file. Basically, we write out the replication messages to a file. A slave reads this and applies the operations. Pretty simple and foolproof to implement.

But it’s pluggable.

What if we stored the transaction log inside InnoDB? Not only that, what if we wrote it as part of the transaction that was doing the changes? That way, no XA is needed – everything is consistent with a COMMIT. This would greatly reduce the number of fsync()s needed to be consistent.

Now… the first thing people will say is “arrggh! You’re writing the data *four* times now”. First the transaction data goes into the log, then the replication log goes into the log, and then both of these are written back to the data file. It turns out that this is much cheaper than doing the additional fsync()s – the extra writes are sequential log writes, and it’s the number of syncs, not the number of bytes, that hurts.

In one of our tests, the file based transaction log: ~300tps. Transaction log in InnoDB: ~1200tps.

I think that’s an acceptable trade-off.

We’ve just merged the first bit of this work into Drizzle.

Props go to Joe Daly, Brian and myself for making it work.

A more complete look at Storage Engine API

Okay… So I’ve blogged many times before about the Storage Engine API in Drizzle. This API is somewhat inherited from MySQL. We have very much attempted to make it a much cleaner interface. Our goals in making changes include: make it much easier to write and maintain a storage engine, make the upper layer code obviously correct and clear in what it’s doing and being able to more easily introduce optimisations.

I’ve recently added a Storage Engine that is only used in testing: storage_engine_api_tester. I’ve blogged before on it producing call graphs (really state transition graphs), both for Storage Engine and for Cursor.

I’ve been expanding the test. My test engine is now a wrapper around a real engine instead of just a fake one. This lets us run real queries (and test cases) while testing what’s going on. At some point in the near future I plan to make it so that it will be able to log what calls go on to the engine and produce a graph just of those.

I added a lot more to the Storage Engine part of the wrapper. Below is the current graph:

I’ve coded what I consider to be bugs as red and what I consider suspect as blue.

Also for the Cursor (colours mean the same):

As you can see, there are currently some wacky possibilities. I’m investigating exactly what’s going on here – either I’m somehow missing some calls that I should be wrapping (I don’t think so) or we are really doing some dumb-ass things in the upper layer.

Also, please do not be under any impression that any of this means that we’re going to have a stable API. We’re not. To stabilise on this would just be insane – way too much of it still doesn’t make much sense.

Cursor states

Following on from my post yesterday on the various states of a Storage Engine, I said I’d have a go with the Cursor object too. A Cursor is used by the Drizzle kernel to get and set data in a table. There can be more than one cursor open at once, and more than one per thread. If your engine cannot cope with this, it is its responsibility to figure it out and return the appropriate errors.

Let’s look at a really simple operation, inserting a couple of rows and then reading them back via a full table scan.

Now, this graph is slightly incomplete as there is no doEndTableScan() call. But you can see in which order things are meant to happen. In this case, “store_lock()” means that store_lock() has been called, so when coming back from doInsertRecord() we do not call store_lock() again, rather, we’re just in a state where it has already been executed.

For MySQL handler, think ::write_row() for doInsertRecord() and ::rnd_init() for doStartTableScan().

This diagram was again auto-generated from my test engine.

Rackspace Rookie-O (in Hong Kong!)

I’d meant to finish writing this way back in July… but I failed at that. Now is a good time to talk about Rookie-O, as my once-again-new colleague Andrew Hutchings (Buy his and Sergei’s book on MySQL 5.1 Plugin Development!) just went through the same thing (but in London instead of Hong Kong) given by the same trainer (Hi Eddie!).

Rackspace is the second employer I’ve had that has some kind of new hire training (the first being Sun). I am, of course, not quite counting Salmiakki as new-hire training for MySQL (although I probably should). To quote from the Wikipedia article: “Although the rumor of the heart attack was a hoax, the drink may still cause harm. The strong flavor almost completely masks the presence of ethanol, and the drinker may not realize he is consuming a drink almost 40% alcohol by volume (80-proof), leading to possible alcohol poisoning.” A promising introduction to the company.

Monty, Mårten and Kaj with Salmiakki singing Helan Går at the MySQL User Conference Japan in 2007


I could possibly say something about the Sun New-Hire training… but I’m just trying to find something positive to say – and I can’t. I got a bit of hacking done? Seriously.

Actually coordinating a time to attend a Rookie-O (Rookie Orientation, the Rackspace name for new hire training) was rather tricky. There was one right before the MySQL User Conference back in April (not the best of timing), one during an upcoming team meeting (again, not ideal) and one that got organised in the middle of everything for the office in Hong Kong. So, I headed to Hong Kong.

Hong Kong streetlife

The Hong Kong office is relatively new (late 2008) and there were people there who hadn’t gone through the standard Rackspace Rookie-O (Orientation).

Rackers walking Hong Kong at Night

It was rather cool to hang out with other people who worked for the company – and in totally different areas than I do. I did get a better understanding for how the rest of the company operates and the people involved. The training itself was useful and substantially less geared towards not-my-job than Sun’s was.

The good news is that Andrew thought it was useful too. Pretty impressed so far.