Ghosts of MySQL Past: Part 3

See Part 1 and Part 2.

We rejoin our story with a lawsuit. While MySQL suing Progress NuSphere is not perhaps the first GPL lawsuit that comes to mind, it was the first time that the GPL was tested in court. Basically, the GEMINI storage engine was a proprietary storage engine bundled with a copy of MySQL. Guess what? The GPL was found to be valid and GEMINI was eventually GPLed, and it didn’t really go anywhere after that. Why? Probably some business reasons and also, InnoDB was actually rather good and there wasn’t a lawsuit to enforce the GPL there, making business relationships remarkably easier.

In 2003 there was a second round of VC funding. The development team increased in size. One thing that MySQL AB did was invest heavily in technology. I think this is what gave the company a lot of value, you need to spend money developing technology if you wish to be seen as a giant and if you wish to be able to provide a high level of quality service to customers.

MySQL 4.0 went GA in March 2003 while at the same time there were 4.1 and 5.0 development trees. Three concurrent development trees may seem too many – and of course, it was. But these were heady days of working on features that MySQL was missing and ever wanting to gain users and market share. Would all these extra features be able to be added to MySQL? Time would tell…

The big news of 2003 for MySQL? A partnership with SAP. There was this idea: “run SAP on MySQL” which would push the MySQL Server in a bit of an odd direction. For a start, the bootstrap SQL script for SAP created something like 10,000 tables and loaded gigabytes of data – before you even started setting it up. In 2003, on MySQL 4.0, this didn’t go so well. Why was SAP interested? Well, then you’d be able to run SAP without paying Oracle licenses!

Past, Present and future of MySQL and variants Part 1: Ghosts of MySQL Past

You can watch the video of my linux.conf.au 2014 talk here: http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/28-Past_Present_and_future_of_MySQL_and_variants_-_Stewart_Smith.mp4

But let’s talk about things in blog form rather than video form :)

Back in 1979, there was UNIREG. A text UI to records (rows) in a database (err, table). The reason I mention UNIREG is that it had FoRMs which as you may have guessed by my capitalization there is where the FRM file comes from.

In 1986, UNIREG came to UNIX. That’s right kids, the 80×24 VT100 interface to ISAM (Index Sequential Access Method – basically rows are written in insert order and indexes point to them) came to UNIX. There was no generic query language, just FoRMs and reports. In fact, to this day, that 80×24 text interface is stored in the FRM file by MySQL and never ever used (I’ve written about this before).

Then there was this mSQL thing around the 1990s, which was a small SQL server (with source) but not FOSS. Originally, Monty W plugged in his ISAM engine but it wasn’t quite the right fit… so in 1995, we had MySQL 1.0 and MySQL AB was founded.

Fast forward a bit and in 1996 we had MySQL 3.19 and development continued. It managed to gain features, performance, ports to different operating systems and CPU architectures and, of course, stability.

It wasn’t until the year 2000 that MySQL adopted the GPL. This turned out to be a huge step in the right direction for increased adoption. At the time, this was a huge risk for the company, essentially risking all the revenue of the company on making the software more free.

This was the birth of the dual licensing business model. You see, the client library (libmysql) was also GPL, which meant it was easy to use if your application was also GPL, but if you were going to distribute your application and it wasn’t under a GPL compatible license (there was also a FOSS exception so that things like PHP could use it) then you needed a license.

Revenue from licensing was to be significant throughout the entire history of MySQL AB.

(We’ll continue this in part 2 tomorrow)

What was InnoDB+?

Yes, I said InnoDB+ with a plus sign at the end (also see the first comment here).

Please note that this blog post is only based on public information. It has absolutely nothing in it that I only could have learned from back when I worked at Sun or MySQL AB. Everything has links or pointers to where you can find the information out on the Internet and all thoughts are based on stringing these things together.

There was a lot of talk around the acquisition of Sun Microsystems by Oracle about MySQL (MySQL AB was bought by Sun). Some of the talk centred around Oracle and their ability to make a closed source version of MySQL with added bits that wouldn’t be released as GPL. They’ve since proved that they’re quite willing to do this to an open source project (see OpenSolaris).

Relatively recently, a bunch of history from the old InnoDB SVN trees was imported into the MySQL source tree. You can pull the revision of the SVN tree as of InnoDB Plugin 1.0.6 release by using revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/zip:6263  from the MySQL repository – or just use a branch I’ve put up on launchpad for it (lp:~stewart/haildb/innodb-1.0.6-from-svn).

The first revision from the SVN tree was created on 2005-10-27, which you may remember was not too long after Oracle acquired Innobase on the 7th of October that year. The next two revisions were importing the 5.0 innodb code base, and then the 5.1 code base. Previous history can be found according to this blog post on Transactions on InnoDB.

According to Monty in the comment on the Pythian blog:

Oracle did work on a closed source version of InnoDB, codename InnoDB+, but they never released it, probably because our contract with them stopped them.

and from Eben Moglen’s letter to the EU Commission (via Baron Schwartz’s blog post):

Innobase could therefore have provided an enhanced version of InnoDB, like Oracle’s current InnoDB+, under non-GPL license

Most tellingly is a lot of references in the revision history to “branches/innodb+” as well as this commit:

revno: 0.5.148
revision-id: svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6329
parent: svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6322
committer: vasil
timestamp: Thu 2009-12-17 11:00:17 +0000
message:
branches/innodb+: change name and version
Change name from “InnoDB Plugin” to “InnoDB+” and
version from 1.0.5 to 1.0.0.

So, from the revision history I’ve managed to work out that it likely was going to have the following features:

  • innodb_change_buffering (for values other than inserts)
    See revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/zip:4061
    Or, more tellingly revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:4053
    The latter tells about the merge of change buffering for delete-mark and delete in addition to the default of inserts.
  • Possibly compressed tables.
    revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:2316 seems to show that it may have been copied across: “branches/innodb+: Copy from branches/zip r2315” in the comment.  There’s a lot of other merges of branches/zip as well
  • Something named FTS
    There is “branches/fts” in revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:2325 and revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:2324  (there’s an import of a red-black tree implementation)
    If you also look at revid: svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6776
    you’ll see references to a innofts+ branch with ha_innodb.cc in it.
    So between a red-black tree and handler changes, this is surely something interesting.
  • Persistent statistics (also revid: svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6776)
  • Metrics Table (also revid: svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6776)
  • posix_fadvise() hints to temp files used in creating indexes (revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:2342 )
  • Improved recovery performance
    See revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:2989
    Talks about using the red-black tree for sorted insertion into the flush_list
  • native linux aio (revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:3913 )
  • group commit (revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:3923 )
  • New mutex to protect flush_list (revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6330)

and finally, in revid:svn-v4:16c675df-0fcb-4bc9-8058-dcc011a37293:branches/innodb%2B:6819 you can see the change from “InnoDB+” back to “InnoDB” for being the built in default for MySQL 5.5