MySQL modularity, are we there yet?

MySQL is now over four times the size it was in MySQL 3.23. That growth has not come in the shape of plugins.

Have we improved modularity over time? I decided to take the LoC count for plugins and storage engines (in the case of Drizzle, memory, myisam and innobase are counted as storage engines and everything else comes under plugin). I’ve excluded NDB from these numbers as it is rather massive and is pretty much still a separate thing.
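For anyone wanting to repeat this kind of measurement, here is a minimal sketch of one way to bucket line counts by directory. It is not the tool used for the numbers in this post: the directory names (`plugin`, `storage`, `storage/ndb`), the list of file suffixes and the function names are all assumptions for illustration, and it counts raw lines the way `wc -l` does, which may differ from other LoC counters.

```python
from pathlib import Path

# Hypothetical layout: top-level directories holding plugins and storage
# engines. These names are assumptions; adjust them to the tree being measured.
PLUGIN_DIRS = ("plugin",)
ENGINE_DIRS = ("storage",)
EXCLUDED = ("storage/ndb",)  # NDB is left out, as in the post
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".h", ".hpp", ".yy"}


def count_lines(path):
    """Count newline characters in a file, like `wc -l`."""
    with open(path, "rb") as f:
        return sum(chunk.count(b"\n") for chunk in iter(lambda: f.read(1 << 16), b""))


def classify(rel_path):
    """Return 'excluded', 'plugin', 'engine' or 'kernel' for a relative POSIX path."""
    if any(rel_path == e or rel_path.startswith(e + "/") for e in EXCLUDED):
        return "excluded"
    top = rel_path.split("/", 1)[0]
    if top in PLUGIN_DIRS:
        return "plugin"
    if top in ENGINE_DIRS:
        return "engine"
    return "kernel"


def loc_by_category(root):
    """Sum raw line counts of source files under root, grouped by category."""
    totals = {"plugin": 0, "engine": 0, "kernel": 0}
    root = Path(root)
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        category = classify(path.relative_to(root).as_posix())
        if category != "excluded":
            totals[category] += count_lines(path)
    return totals
```

Calling `loc_by_category("path/to/source/tree")` returns a dictionary with one total per category. For a tree like Drizzle, where the storage engines live under the plugin directory, `classify` would need extra rules.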

Version        | Total LoC | Plugin LoC | Storage Engines LoC | Remaining (kernel)
MySQL 3.23.58  |   371,987 | 0 (0%)     |             176,276 | 195,711 (52% kernel)
MySQL 5.1.68   |   721,331 | 228        |             237,124 | 483,979 (67% kernel)
MySQL 5.5.30   |   858,441 | 2,706      |             171,009 | 684,726 (79% kernel)
MySQL 5.6.10   | 1,049,344 | 29,122     |             236,067 | 784,155 (74% kernel)
MariaDB 5.5    | 1,142,118 | 11,781     |             304,015 | 826,322 (72% kernel)
Drizzle trunk  |   334,810 | 31,150     |             130,727 | 172,933 (51% kernel)

I’ve used the non-plugin and non-storage engine code size to be the database “kernel” – i.e. the core of the database server.
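The arithmetic behind the last column can be sketched as below. The figures are copied from the table above; the function names are made up for illustration, and the use of integer division reflects the observation that the table's percentages appear to be rounded down rather than to the nearest whole number.

```python
# (total, plugin, storage engine) LoC, copied from the table above.
COUNTS = {
    "MySQL 3.23.58": (371_987, 0, 176_276),
    "MySQL 5.1.68": (721_331, 228, 237_124),
    "MySQL 5.5.30": (858_441, 2_706, 171_009),
    "MySQL 5.6.10": (1_049_344, 29_122, 236_067),
    "MariaDB 5.5": (1_142_118, 11_781, 304_015),
    "Drizzle trunk": (334_810, 31_150, 130_727),
}


def kernel_loc(total, plugin, engines):
    """Kernel = everything that is neither plugin nor storage engine."""
    return total - plugin - engines


def kernel_share(total, plugin, engines):
    """Kernel share of the total as a whole percentage, rounded down."""
    return kernel_loc(total, plugin, engines) * 100 // total


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(f"{name}: {kernel_loc(*counts):,} ({kernel_share(*counts)}% kernel)")
```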

What I find really interesting here is that yes, the amount of code that is to some degree modular has increased. However, the amount of code that is a MySQL plugin is still very small compared to the server size.

Drizzle is 20-25% of the size of a modern MySQL or MariaDB server and for many applications does largely or exactly the same thing.

15 thoughts on “MySQL modularity, are we there yet?”

  1. First, this is a very primitive way to measure modularity. The first thing that comes to my mind that is a de-facto module and was not counted is:
    5.5$ wc -l strings/ctype-*.c
    274400 total

    274 KLoC – this is just as large as all the storage engines together. Technically, a charset is not a plugin – you can’t load one dynamically, etc. I don’t remember anybody requiring that charsets be pluggable, though.

  2. Second: I wasn’t following Drizzle very closely, but in MySQL pluginization was not a big success:
    – the authentication plugin interface has only a couple of users: the PAM authentication plugin, and some hello-world-type authenticator. Nobody has bothered to develop anything else.
    – other things that look like they could be plugins – GIS, Fulltext, Thread Pool – ended up not being plugins, because each of them requires its own unique hooks, which kills the point of being a plugin.

    The only working example of a plugin interface I have seen is the Storage Engine Interface – there are third-party storage engines, like TokuDB; MariaDB ships with both InnoDB and XtraDB, and the user can choose which one to use, etc.

    Does Drizzle have a plugin interface other than the Storage Engine Interface, for which there are multiple non-trivial plugins?

  3. That’s completely unfair, because Drizzle is MEANT to be modular. And there are so many differences between Drizzle and Maria/MySQL that you might as well compare the codebase of sqlite if you want something that “for many applications does largely or exactly the same thing.”

    Drizzle is NOT a drop-in replacement for MySQL/Maria, so again, you might as well compare with sqlite. If you’re going to rewrite your schema/application anyway….

  4. I think the failure of MySQL and plugins has more to do with the approach taken: rather than refactoring some existing code out into a plugin (say, using the audit interface to implement the slow query log, or the auth plugin interface to implement native auth), the interfaces were just thrown out there for “other” uses, with APIs that are not good enough to implement the existing functionality.

  5. Drizzle does have many plugin interfaces, yes. There’s the protocol interface, for which we have mysql_protocol, drizzle_protocol, console and json_server (an HTTP server). We also have functions (like UDFs, but with the same API as writing one in the server). We have storage engines (of course). We have table functions (which is what you use to implement I_S tables), and all our INFORMATION_SCHEMA and DATA_DICTIONARY tables are implemented using them. You can also have plugins that manage schemas (the default one makes directories on the filesystem, but there’s also one that tells the server of the existence of I_S and D_D). The replication system is pluggable too, and so is, of course, authentication. Oh, we also have scheduler plugins.

    The interfaces with only a single implementation are storage of the replication stream and the scheduler (currently only multi_thread). These are, however, easily replaceable with completely different implementations, and this is the big difference compared to MySQL plugins.

  6. Sheeri, I think you’re right in that too much comparison between MySQL and Drizzle isn’t really the right thing to do. However, the DML compatibility with MySQL in strict_mode is rather high, making it a small-ish step rather than a leap from MySQL.

    One thing to think about is: at what point is MariaDB a separate database that is just able to import data from MySQL?

  7. “One thing to think about is: at what point is MariaDB a separate database that is just able to import data from MySQL?”

    If you consider the MySQL vs Drizzle difference to be negligible, then you should not be able to tell the difference between MySQL and MariaDB at all :-)

    If we use the Lines-of-Code argument that was explored in this blog in the last few posts, we get this:
    – MariaDB is different from MySQL just as much as one major release in MySQL from the other.
    – MariaDB’s difference from MySQL peaked in MariaDB 5.3 and decreased in 5.5 revisions :-)

    I’m curious to learn what logic has led you to the conclusion that “mariadb is a separate database that is just able to import data”?

    There isn’t even an import process… You’re expected to start MariaDB on MySQL’s datadir.

  8. Since MariaDB->MySQL merges don’t happen and I’m assuming that MariaDB continues to grow and accept patches, that diff is going to get larger again (as although I haven’t investigated the cause of the drop, my guess is that it’s something to do with MySQL also merging and fixing up some of the same old code that previously wasn’t ready for primetime).

    Also, with features such as dynamic columns it can become very much a one way trip.
