Ghosts of MySQL Past, part 11: Why are you happy about this?

This is part 11 in what’s shaping up to be the best part of a 6 week series (Part 1, 2, 3, 4, 5, 6, 7, 7.1, 8, 8.1, 9 and 10) on various history bits of MySQL, somewhat following my LCA2014 talk (video here).

One of my favorite MySQL stories is the introduction of the Random Query Generator – or randgen. While I’ve talked for a while now about the quality problems that were plaguing MySQL, there were positive steps going on. The introduction of continuous regression testing with pushbuild, increasing test case coverage and a focus on bug fixing were all helping. However… a database server is a complex beast that can accept arbitrary queries and transactions and at the very least is expected not to segfault. How on earth do you test it beyond simple manual testing (or automated manual testing: running the same set of static queries)?

Well, at some point in development, David Axmark (one of the founders of MySQL AB) decided that if they were going to be a real database, they should have a test suite – so he wrote one. It was comprised of two parts: a small binary named mysqltest which basically read a file of SQL, ran it against the server and then displayed the results (and also had a couple of small bits of language such as variables and loops) and the other half was a shell script that started a mysqld and ran mysqltest for each .test file in the mysql-test/t/ directory and diffed the output against the mysql-test/r/*.result files.

This system still exists today, although the shell script has been replaced by a perl script which has been replaced with mysql-test-run.pl version 2 which dealt with replication, starting MySQL Cluster daemons, running tests in parallel and dealt with Windows a lot better. The mysqltest binary is largely unchanged however, and so the now MUCH larger suite of SQL based tests would be familiar a dozen years ago, when the mysql-test suite was in infancy.

But how do you ensure you’re testing enough of the server? Well… you can do code coverage analysis and write more tests manually, but wouldn’t it be much better if you could automatically generate tests? Wouldn’t it be great if you could automatically find a set of queries that caused the server to crash? Or a set of queries where the results differed between versions or other database servers (which would strongly indicate that somebody was buggy)?

Well, it turns out that Microsoft had a similar problem back in the 1990s with Microsoft SQL Server and a system called RAGS was used in the development of Microsoft SQL Server 7.0 (released in 1998).

So, fast forward about ten years and a relatively new hire of MySQL AB, Philip Stoev was working on something called “RQG” or Random Query Generator for MySQL AB. The basic idea was to be able to supply a grammar of what type of queries you would like it to generate, point it at a MySQL server and see what happens.

It turns out that what happened was that after a while, the MySQL server would segfault. To a room of MySQL Server developers, this was a pretty amazing advancement in testing the server – an automated way to find new bugs!

Throwing the Random Query Generator at the in-progress MySQL 6.0 optimizer caused a new innovation: automatic stack trace deduplication and test case minimization. RQG would automatically pair down the list of generated queries to the minimal set required to crash the server, so that you could import that set of queries into the existing mysqltest based regression test suite.

The dataset that RQG would work against was rather fixed, it was about 10 already prepared tables with some easily generated data. In MySQL Cluster, we had a set of C++ classes that ran a multithreaded application that would use integer columns as a checksum for the whole row, so updates would also update the checksum and at the end of your test (which may cause node failures in various scenarios) you can check that each row is intact and hasn’t been corrupted. So, IIRC it may have been me (or Jonas, I honestly cannot remember) who asked if such a feature would be added to RQG and the reply was “when I can no longer crash the server in 4 or 5 queries… I’ll think about it”.

So, developers expressed happiness, and Philip expressed this: “Why are you happy about this? We are about 10 years behind Microsoft in QA, why are you possibly happy about this?”

So with this great new tool, the true state of the current server version, the in-development server version (6.0) and all the optimizer enhancements it was bringing, the new Falcon storage engine and the in progress Maria (now Aria) storage engine would be exposed.

The news was not good.

(although, in the long run, the news was really good, and it’s because of RQG that we now have a really solid MySQL Server)

Further reading:

The joy of Unicode

So, back in late 2008, rather soon after we got to start working on Drizzle full time, someone discovered unicodesnowmanforyou.com, or:

☃

Since we had decided that Drizzle was going to be UTF-8 everywhere,(after seeing for years how hard it was for people to get character sets correct in MySQL) we soon added ☃.test to the tree, which tried a few interesting things:

CREATE TABLE ☃; CREATE DATABASE ☃; etc etc

Because what better to show off UTF-8 than using odd Unicode characters for table names, database names and file names. Well… it turns out we were all good except if you attempted to check out the source tree on Solaris. It was some combination of Python, Bazaar and Solaris that meant you just got python stacktraces and no source tree. So, if you look now it’s actually snowman.test and has been since the end of 2008, because Solaris 10.

A little while later, I was talking to Anthony Baxter at OSDC in Sydney and he mentioned Unicode above 2^16 in UTF-8…. so, we had clef.test (we’d learned since ☃ and we were not going to tall it 𝄢.test).

Fast forward a few years to, well, this week, and I was talking to Jeremy Kerr about petitboot and telling the tail of snowman.test. So out came the crazy Unicode characters:

  • U+1F4A9 PILE OF POO 💩
  • U+1F435 MONKEY FACE 🐵
  • U+1F431 CAT FACE 🐱
  • U+1F602 FACE WITH TEARS OF JOY 😂
  • U+1F639 CAT FACE WITH TEARS OF JOY 😹

But guess what, there is no MONKEY FACE WITH TEARS OF JOY! I know, this is just unacceptable – TEARS OF JOY should be a modifier, because you may need U+1F6B9 MENS SYMBOL 🚹 with a TEARS OF JOY modifier at some point in your life.

Anyway, another place with tests involving odd Unicode characters is good for everyone, but still lacking if you need to boot an Operating System that’s MONKEY FACE WITH TEARS OF JOY.

Ghosts of MySQL past, part 10

At the end of 2007, the first alpha of MySQL 6.0 was released. Alpha is the key word here, 5.0 was in decent shape by this stage, 5.1 was not ready and there were at least rumors of a MySQL 6.1 tree. I’ll go into MySQL 6.0 later, but since you’re no doubt not running it, you can probably guess the end of that story.

The big news of 2008 was that at the last MySQL AB all company meeting, the first and last one to occur while I was with the company, in January 2008 in Orlando, Florida, it was announced that Sun Microsystems was going to acquire MySQL AB for the sum of ONE BILLION DOLLARS. Unfortunately, there were no Dr Evil costumes at the event. However, much to the amusement of the Sun employees who were present at the announcement, there was Vodka.

So, while the storage engine project that was to be the savior of and the future of the company, Falcon, had the same name as a Swedish beer (that I’ve actually not seen outside Sweden), the code name for the Sun acquisition was Heineken, and there were a few going around in celebration after the acquisition was announced.

IMG_20140310_111449I cannot remember where I got this badge (or button if you speak American), but they were going around for a little while, and it was certainly around the Sun era, possibly either at a User Conference or around the Sun acquisition.

Sun was very interested in Falcon. With the new T1000 SPARC processor, with eleventy billion threads over several cores and squarely aimed at internet/web companies, it was also thought there could be some.. err.. synergy there.

So what happened with the Sun acquisition?

Well, we soon discovered that although Sun made a lot of noise as to its commitment to free and open source software, its employment contract, IP agreements and processes were sorely not what we expected. There is, in fact, probably a good book in the process of MySQL AB engineers joining Sun, but the end result ended up being rather good. We changed Sun and made it a whole lot easier for anybody inside Sun to contribute to open source projects.

There was some great people inside Sun that we were able to work with, yet there was also a good dose of WTF every so often as well. There was a great “benchmark” that showed what kind of performance you could get out of a T2000 with MySQL once – you just had to run 32 instances of mysqld for it to scale!

One thing Sun had worked out how to do was release software that worked relatively well. Solaris was known for many things: Slowaris, ancient desktop environment, a userspace more suited to 1994 than 2004 but one thing it was not known for was awful quality of releases.

Solaris required code to work before it was merged into the main branch and if code wasn’t ready to be released, it wouldn’t be merged, missing the train. While this is something that would seem obvious today, it was kind of foreign to MySQL – and it would take a while for MySQL to change its development process.

The biggest problem with Sun was that it was not making money. If you want to know what was most wrong with the company, talk to one of the MySQL sales people who went across to Sun. It was basically near impossible to actually sell something to someone. Even once you had someone saying “I want to give you money for these goods and/or services”, actually getting it through the process Sun had was insane. At the same time I ran a little experiment: how easy was it to buy support for Red Hat Enterprise Linux versus Solaris… you could guess which one had a “buy” button on redhat.com and which one was impossible to find on sun.com.

We (of course) got extra tools, resources and motivation to have MySQL work well on Solaris, especially OpenSolaris which was going to be the next big thing in Solaris land.

I should note, the acquisition did not make millionaires out of all the developers. It did make some VCs some money, as well as executives – but if you look around all the developers at MySQL AB, not a single one has subsequently called in rich.

Ghosts of MySQL Past, Part 9: BEST. Team. Name. EVER.

(This is part 9 in a series, part 8 is here – because reverse chronological order totally makes sense here)

So, back around 2007, somebody noticed that an awful lot of the downloads of MySQL and associated utilities from mysql.com were for Windows. Of course, it’s then immediately pointed out that the vast majority of Linux users will not be heading to mysql.com to download MySQL, instead using the packages from their distribution.

However, the number of people working on MySQL who had ever even attempted to compile the MySQL server on Windows when given the number of mysql users on Windows was… well…. rather embarrassing. This is very common with free and open source software – especially historically.

If you look back 10 years, Linux on the desktop was “next year will be the year of Linux on the desktop”, you know, just like how it is now. Except that while today, everything “just works” on modern Linux distros and it is Windows that is an absolute arse to get all the drivers for your hardware going, it was not that long ago that it was the other way around. Also, back then it was much more common for it to be hard to get FOSS into companies… because of a variety of reasons that weren’t valid, but were something we had to spend time explaining back then.

I’m trying to remember the MySQL developers at the time who I knew attempted to do all their MySQL Server development on Windows…. I can think of Reggie along with Vlad and Iggy (both of who joined later). It was quite rightly pointed out that the MySQL experience on Windows was very much one of “UNIX application ported to Windows” rather than “Windows server application, that also comes in a UNIX version.”

MySQL basically did absolutely nothing the way you’d expect a piece of Windows server software to do it. So, a team was put together. Well, more of a task force.

The Windows Task Force (yes, WTF) was born. It is also the best team name in the history of team names and gives me a good chuckle to this day.

Many things came out of this, including (IIRC) an Installer that we actually could find the source code for and didn’t require Delphi to build, the ability to build the server on Windows from a bzr source tree (and not need Linux/UNIX around to build some parts of the server) and making the server behave a bit more like what you’d expect if you were a Windows administrator. There were also a bunch of limits lifted due to the way that MySQL was ported to Windows, which I won’t go into here.

Ghosts of MySQL past, part 8.1: Five Years

With many apologies to David Bowie, come 2009 it was my 5 year anniversary with Sun (well, MySQL AB and then Sun). Companies tend to like to talk about how they like to retain employees and that many employees stay with the company for a long time. It is very, very expensive to hire the right people, so this is largely a good plan.

In 2009, it was five years since I joined MySQL AB – something I didn’t quite originally expect (let’s face it, in the modern tech industry you’re always surprised when you’re somewhere for more than a few years).

The whole process of having been with sun for 5 years seemed rather impersonal… it largely felt like a form letter automatically sent out with a certificate and small badge (below). You also got to go onto a web site and choose from a variety of gifts. So, what did I choose? The one thing that would be useful at Burning Man: a camelbak.

5 year badge for Sun Microsystems

5 year badge for Sun Microsystems

Ghosts of MySQL Past, Part 8: The First Fork.

This is the 8th installment in the rather long series that started with Part 1 about a month ago.

Back in 2006, we were in the situation where MySQL 5.0 had taken forever, and the first “GA” release was not suitable for production. Looking towards MySQL 5.1, it was also unlikely to be out any time soon. The MySQL Cluster team had customers that needed new features in a stable release. The majority of users didn’t use the MySQL server at all, they directly used the C++ NDB API for the vast majority of queries – so the vast majority of release blocker bugs in the MySQL server would not affect the production readiness of MySQL Cluster for these customers.

So, the decision was wisely made to do separate releases from a separate tree for MySQL Cluster. This was named MySQL Cluster: Carrier Grade Edition and exists to this day.

The main use case for MySQL Cluster at this time was running telephone networks, specifically the Home Location Registry databases of GSM phone networks. Basically, you need to keep a database of which tower each subscriber is associated with so when you go to make a phone call (or SMS) the network can properly route the call. This means there’s some realtime response requirements and hardcore availabilty which demands a special type of database.

NDB has a long history (some of it detailed in a previous post), but for those kind of interested in internals, I’ll quote Frazer Clement (now a long time MySQL Cluster developer although I completely forget which year he joined the team, which is just slightly embarrassing):

…Erlang and Ndb Cluster share some Plex heritage, which can still be seen in their architectures today. Since Plex, Erlang has mated with Prolog, and Ndb Cluster was involved in a car crash with C++.

The customers for MySQL Cluster were not the ones who bought the $5000 support option… Typically, paying for the addition of a relatively major feature was considered quite okay and even “normal”. After all, the effort that goes into constructing the entire software stack of a large cell phone network is rather large, and MySQL Cluster would end up being a very small part of that.

Many major features went into MySQL Cluster first, sometimes years before they made the general MySQL Server: row based replication, circular replication with conflict resolution and online DDL (add/drop index and column). In fact, it kind of incredibly frustrates me that we solved online add column in NDB so many years ago and you still can’t add a column to an InnoDB table without some serious planning.

The key thing about MySQL Cluster releases? They happened. They were also regular, addressing customer issues and new major versions brought features that worked for the use cases of those who needed them.

Interestingly enough, you may recognize the person who ran the MySQL Cluster team as the same person who has overseen the now regular release cycle of the MySQL Server itself (and has now been in the MySQL world for over ten years).

Ghosts of MySQL Past, Part 7.1: Hey look, I found an old business card

Digging through piles of old stuff in the house, I came across a nice little artifact of MySQL history, an old business card. One benefit of a small company is that you can tend to get your first name @company.com, which of course, I had.

IMG_20140310_111500I wonder which other MySQL AB employees still have a business card laying around somewhere?