Storing the table message in Embedded InnoDB

One of the exciting things[1] about working on a storage engine in Drizzle is that you get to manage your own metadata. When the database engine you’re writing the storage engine interface for has a pretty complete data dictionary (e.g. Embedded InnoDB) you could just directly use it. At some point I plan to do this for the embedded_innodb engine for Drizzle so that you could just point Drizzle at an existing Embedded InnoDB database and run SQL queries on it.

The Drizzle table message does have some things in it that aren’t in the InnoDB data dictionary though (e.g. table and column comments). We want to preserve these (and also things like there may be several data types in Drizzle that map to the same data type in InnoDB). Since the Embedded InnoDB API allows us to do things within the DDL transaction (such as insert a row into a table), we store the serialized table message in a table as part of the DDL transaction. This means we can have fully crash safe DDL! There is no way the table definition can get out of sync with what is in InnoDB; we are manipulating them both in the same transaction!

The table structure we’re using is pretty simple. There is two columns: table_name VARCHAR(IB_MAX_TABLE_NAME_LEN) and message BLOB.

The operations we need are:

  • store the table message in doCreateTable (INSERT)
  • rename the table message in doRenameTable (UPDATE the table_name column)
  • delete the table message in doDropTable (DELETE)
  • list tables in a database (SELECT with prefix)
  • get table message (SELECT using key lookup)

All of which are pretty easy to implement using the Embedded InnoDB API.

[1] Maybe I need to get out more….

The Table protobuf message format

If you’ve ever opened up drizzled/message/table.proto in the Drizzle source tree you will have seen what’s in the table message: the structure that describes a database table in Drizzle. Previously I’ve talked about the Table message more generally, giving a fair bit of history of the FRM file and how we’ve replaced it with both the Table protobuf message and an infrastructure inside Drizzle so that Storage Engines own their own metadata.

Yesterday I talked about the Schema protobuf message format in more detail, and this time I’m talking about the Table protobuf message in a similar amount.

The first time we were loading (then only part of) the table definition out of a protobuf message was way back in January 2009 (I blogged about it too). It was an adventure untangling all sorts of things to get to a much nicer place (where we are now). The code in the server is not perfect… I’ll be the first to admit that some of it is rather strange, but that’s mostly all behind the scenes for people interested in the protobuf Table message!

The Table message has several embedded messages in it too. We need to have information on the Storage Engine, Fields and Indexes (and each of those can have other properties). It is much more complex than the simple Schema message.

Let’s have a look at the basic structure of the Table message:

message Table {

    /* *SNIP* (Here goes the definitions for TableType, StorageEngine, Field, Index, ForeignKeyConstrain, TableOptions and TableStats) */

  required string name = 1;
  required TableType type = 5;
  required StorageEngine engine = 2;
  repeated Field field = 3;
  repeated Index indexes = 4;

  repeated ForeignKeyConstraint fk_constraint = 8;
  optional TableOptions options = 9;
  optional TableStats stats = 10;

(We’ve skipped the definitions for the embedded messages for now)

This seems all pretty logical; a table has a name, a type, is in a Storage Engine, has Fields, may have Indexes, may have foreign key constraints, it has some options and statistics (the statistics may go away at some point “soon”).

Let’s have a look at the TableType message definition:

  enum TableType {
    STANDARD = 0;
    TEMPORARY = 1;
    INTERNAL = 2;

It’s pretty simple, the table type is either a standard table (what you get from CREATE TABLE), a temporary table (what you get from CREATE TEMPORARY TABLE) or an INTERNAL table (what you get when Drizzle uses a temporary table during query execution).

Next, the StorageEngine message:

  message StorageEngine {

    message EngineOption {
      enum EngineOptionType {
        BOOL = 0;
        INTEGER = 1;
        STRING = 2;

      required string option_name = 1;
      required string option_value = 2;
      required EngineOptionType option_type = 3;

    required string name = 1;
    repeated EngineOption option = 2;

The main part is the “name” member, which is just the name of the storage engine (e.g. “PBXT”,  “INNODB”, “ARCHIVE”). We do however have support specified in the StorageEngine message for engine specific options (in key value form). Expect these to be used more in the near future.

Specifying Fields is probably the most complex part of the table message. The Field message looks like this (with many embedded messages):

message Field {
    required string name = 1;
    required FieldType type = 2;
    optional FieldFormatType format = 3;
    optional FieldOptions options = 4;
    optional FieldConstraints constraints = 5;
    optional NumericFieldOptions numeric_options = 6;
    optional StringFieldOptions string_options = 7;

    optional string comment = 16; /* Reserve 0-15 for frequently accessed attributes */
    optional SetFieldOptions set_options = 17;
    optional TimestampFieldOptions timestamp_options = 18;

So… what does this all mean? Well, Fields have a type, they’re stored in a format, there’s options attached to them, there may be constraints as well as field type specific options.

The different field types should be fairly familiar by now:

    enum FieldType {
      DOUBLE = 0;
      VARCHAR = 1;
      BLOB = 2;
      ENUM = 3;
      INTEGER = 4;
      BIGINT = 5;
      DECIMAL = 6;
      DATE = 7;
      TIME = 8;
      TIMESTAMP = 9;
      DATETIME = 10;

We also allow fields in different formats. Currently, these are default, fixed and dynamic. The idea is you can tell the engine (or the engine can tell you) how it’s storing the field. This is currently here as a nicety and the users for this are few and far between.

    enum FieldFormatType {
      DefaultFormat= 0;
      FixedFormat= 1;
      DynamicFormat= 2;

The FieldOptions get interesting though:

    message FieldOptions {
      optional string default_value = 1;
      optional string update_value = 2;
      optional bool default_null = 3 [default = false];
      optional bytes default_bin_value = 4;

You’ll no doubt be intrigued by the existence of both “default_value” and “default_bin_value”. Ordinarily, using a string to contain a textual representation of the default value (e.g. “foo” or “42”) is fine. However, for BLOB columns, you can have defaults that aren’t representable in a text string, you need binary data (e.g. the default value contains ‘\0’).

For TIMESTAMP columns, we continue to support DEFAULT NOW() and the ability to update the timestamp column on UPDATE. How is this represented in the table message? Well… default_value will be “NOW()” and update_value will be “NOW()”. It is intended that in the future it will be possible to have arbitrary SQL expressions for these. This does, of course, require support in the Drizzle server.

The default_null bool should be rather obvious :)

Well… that’s enough for today. Next time: more of the Field message!