Where are they now: MySQL Storage Engines

There was once a big hoopla about the MySQL Storage Engine Architecture and how it was easy to just slot in some other method of storage instead of the provided ones. Over the years I’ve repeatedly mentioned how this wasn’t really the case and that it was remarkably non-trivial.

Over the years, many storage engines have cropped up and then disappeared. So… where are they now?

  • ISAM
    This became MyISAM… you know you’ve been around MySQL a long time if you’ve ever had to deal with an ISAM table.
  • Gemini
    This was the first big test of the GPL in court. Basically, you have to obey the GPL (see Wikipedia for more info). The code was released as GPL and development stopped. This has been dead since ca. 2002.
  • Amira – http://launchpad.net/amira
    Antony first mentioned this in 2008 on his blog. This was a continuation of the Gemini engine; you can actually go over to Launchpad and get the code. This was one of the projects to have a transactional storage engine not owned by Oracle after Innobase Oy was acquired by them. It went nowhere special, as Netfrastructure was acquired and that work became Falcon.
  • BDB
    Otherwise known as the BerkeleyDB engine. It was seldom used and never gained much of a user base. It was unceremoniously dropped back in 2006, and both users didn’t really exist.
  • PBXT - http://pbxt.blogspot.com/
    I think we can credit PBXT with at least half of the features and performance improvements to InnoDB since it first emerged back in 2006. It got attention very quickly. Why? Because it was different. It had the very rare ability to outperform InnoDB in some places. You can still find PBXT in MariaDB, but sadly it can be hard to fund development of a MySQL storage engine, especially one as tied to MySQL as PBXT is, and it’s no longer under active development. Closely related was the Blob Streaming project which was way ahead of its time as an AlsoSQL access method. The good news is that the code was released under a BSD license in 2012 (was previously GPL). We even had PBXT in Drizzle for a while.
  • Blob Streaming (PBMS) - http://bpbdev.blogspot.com/
    This project was closely related to (but not depending exclusively on) PBXT. It embedded an HTTP server inside the database and could use it to read and write BLOBs. This was not only fairly cool but way ahead of its time. We owe the existence of both HandlerSocket and the memcached interface to InnoDB to PBMS (it was also an inspiration for the JSON server plugin for Drizzle, to address some of the use cases of the PBMS plugin).
  • Federated
    It’s still there… but is effectively unmaintained and dead. There’s even FederatedX in MariaDB which is an improvement, but still, the MySQL server really doesn’t lend itself kindly to this type of engine… it’s always been an oddity only suitable for very specific tasks.
  • Archive
    Although useful, effectively unmaintained. I kinda don’t want to say dead… but if it went away, I wouldn’t exactly be surprised.
  • CSV
    Currently used to access the log tables in MySQL… and hardly used otherwise. It’s odd that the same code doesn’t deal with SELECT INTO OUTFILE and LOAD DATA INFILE, and I doubt this will ever change. I’d say effectively niche/dead.
  • SolidDB
    Purchased by IBM, abandoned.
  • DB2
    Only ever on System i. Useful for very very few people… but you can still find it around if you’re one of them.
  • Infobright
    OMG it exists! This is probably because they’re largely just using the MySQL server as a way to implement the MySQL network protocol and all of the heavy lifting is done by their own code.
  • Xeround
    I’m quite surprised these guys are still around, as they’re a proprietary storage engine as a service, and initial testing wasn’t entirely promising.
  • TokuDB
    I cannot emphasize enough how much more interesting TokuDB would be if it were open source. It actually holds some promise… and with their recent work with mongo, perhaps this is a good way forward for them…
  • Maria/Aria
    Another “OMG Oracle just bought Innobase Oy” engine. This was a project to take MyISAM and turn it into a lean, mean, transactional storage engine machine. It’s still not there and I don’t think it ever will be.
  • Falcon
    This was the hot new thing. It came out of Netfrastructure, which MySQL AB acquired in order to help get a transactional storage engine after Innobase Oy was acquired by Oracle. If you’re keeping count, that’s three projects for a transactional storage engine. Falcon was the star though, receiving all the press and publicity (well before it was ready). There are many reasons why Falcon isn’t around today – the chief one probably being that Oracle bought Sun, who had bought MySQL, and thus the need for an “InnoDB replacement” instantly vanished. There was also immense management pressure for performance to be greater than InnoDB’s, without any allowance for or focus on correctness… and this showed. This was quite disappointing as Falcon had a lot of good architectural things going for it.
  • BlitzDB - https://launchpad.net/blitzdb
    I had hoped we’d replace MyISAM with BlitzDB in Drizzle. It was a wrapper around Tokyo Cabinet to the storage engine API in Drizzle. Unfortunately, the ties to MyISAM are incredibly deep (see my recent post on internal temporary tables) and we never quite got there.

I think this is all the notable engines that were aimed at widespread adoption… what ones have I forgotten?

It’s interesting to note that only Archive, CSV, Xeround, TokuDB and Infobright can be gotten anywhere, and the latter two only in their own distribution (one proprietary) and Xeround only as a service.

51 thoughts on “Where are they now: MySQL Storage Engines”

  1. pretty much. I wonder how long that’ll last though – the InnoDB memcached plugin could have used the storage engine interface after all…

  2. You forgot KFDB, the storage engine used by Kickfire. Some Kickfire appliances were shipped to customers, so it isn’t a phantom storage engine. Kickfire was purchased by Teradata and the appliance business was terminated.

  3. Oh yeah… there was Kickfire… although that was all dependent on specific hardware, so it’s kinda not mass-market for everybody :)

  4. Three projects to replace InnoDB. Why was not a single one of them a fork of InnoDB?

  5. Bill,

    Spider just got a shiny new 3.0 release and is working on MariaDB integration. They added the ability to connect to remote Oracle tables too.

  6. MySQL AB would want to sell commercial licenses and they’d only have GPL license to InnoDB code, so they wouldn’t have been able to.

    That being said, the whole libmysqld thing was always a mess… the *CORRECT* way would be to have libmysql be able to fork() and exec() a mysqld…

  7. I assume you intentionally omitted NDBCLUSTER since it is still going strong. MyISAM_MERGE engine could have been listed though as all of its functionality has been replaced by partitioning.

  8. You are becoming the MySQL historian :-) I like it!

    BDB was also acquired by Oracle about the same time as they bought InnoDB. Makes sense as with that Oracle bought both of the existing transactional engines.

    PBXT was secretly sponsored by MySQL/Sun, so really it was the secret fourth InnoDB replacement project. They say Sun would have acquired it if Oracle hadn’t bought Sun the day before. An interesting parallel universe to think about!

    The list is missing InfiniDB, another columnar storage engine. They have a poor approach to handling communications with the community, so no wonder you forget them.

    Also missing several one developer engines like S3 engine, wormhole engine.

  9. Oh, one more thing: I seem to remember that DB2 engine is still supported by Zend, and Percona helps Zend do that. You should know!

  10. My only experience with the Falcon engine is seeing it dump core as soon as I started any benchmark test..

    Rgds,
    -Dimitri

  11. There were more — NitroDB, for example.

    Don’t even mention all the community engines for some-specific-task-you-don’t-really-need-but-8-people-in-the-world-need.

  12. You did not mention MEMORY, which had a couple of local hybrids.

    You also forgot BlackHole, which is relatively widely used.

    You are as forgetful as when you were with us …. ;-)

  13. Pingback: The MERGE storage engine: not dead, just resting…. or forgotten. | Ramblings

  14. Pingback: The new CONNECT Storage Engine with MariaDB 10.0.2 « Serge Frezefond 's blog

  15. Pingback: TokuDB | Ramblings

  16. Pingback: MariaDB CONNECT Storage Engine vs FEDERATED(X) « Serge Frezefond 's blog

  17. I have been compiling them for some time for the Spanish Wikipedia. So far I have found:

    2.1 Archive
    2.2 Aria
    2.3 AWSS3
    2.4 BDB
    2.5 Blackhole
    2.6 Cassandra SE
    2.7 ClouSE
    2.8 Connect
    2.9 DDE-GAN
    2.10 CSV
    2.11 Example
    2.12 Federated
    2.13 Federated/X
    2.14 IBMDB2I
    2.15 InfiniDB
    2.16 Infobright
    2.17 InnoDB
    2.18 Mdbtools
    2.19 MemcacheDB
    2.20 Memory
    2.21 Merge
    2.22 Mroonga
    2.23 MyBS
    2.24 MyISAM
    2.25 NDB
    2.26 OQGraph
    2.27 PBXT
    2.28 Q4M
    2.29 RitmarkFS
    2.30 ScaleDB
    2.31 SphinxSE
    2.32 Spider
    2.33 TokuDB
    2.34 XtraDB

    Some are discontinued, some exotic and some mainstream…

  18. Re Archive, it was based on a false premise, namely that disk I/O was the bottleneck. I proved with my insert test tool that the actual bottleneck is CPU first, even when you clean up the parser overhead compared to 5.0 (as Mark Callaghan did).
    So that makes ARCHIVE irrelevant in terms of writes.
    Of course the data will take up less disk space, which might be relevant for very large datasets, and table scans for reads are faster than MyISAM in that table size range (because there’s less disk I/O).

    In a nutshell, once you take out the false “reduce write I/O” hypothesis, the actual use case is very limited.

  19. Pingback: The MySQL Cluster storage engine | Ramblings
