Fun with Coverity-found bugs (Episode 1)

Taking inspiration from the great series of blog posts “Fun with Bugs” (and not http://funwithbugs.com/ which is about both caring for and eating bugs), and since I recently went and ran Coverity against Drizzle, I thought I’d have a small series of posts on bugs that it has found (and that I’ve fixed).

An idea that has been pervasive in the Drizzle project (and one that I rather subscribe to) is that there are two types of correct: correct and obviously correct. Being obviously correct is much, much better than merely being correct.

The first category of problems that Coverity found was kind of interesting: a warning that data_file_name and index_file_name in class ha_myisam weren’t initialized in the ha_myisam constructor, nor in any function that it calls. It turns out this was largely because the code wasn’t exactly optimal and these variables were used rather oddly. In fact, in writing this blog post I went back and found that there’s a bunch of extra dead code, and that these variables should just be removed along with the code that “used” them.

The historical use for data_file_name and index_file_name was that (in MySQL) you could specify different paths for the MyISAM data and index files, so that the FRM ended up in the server datadir, the data file ended up some other place, and the index file was off behind the sofa. Since MyISAM is used only for temporary tables in Drizzle, this is entirely unneeded.
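
For anyone who never used it, the MySQL syntax for that looked roughly like the following (the paths here are made up, naturally):

    CREATE TABLE t1 (id INT) ENGINE=MyISAM
      DATA DIRECTORY = '/mnt/some_other_disk/mysql_data'
      INDEX DIRECTORY = '/behind/the/sofa';

The .frm stays in the server datadir, while the data and index files go wherever the two options point.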

Another place where a similar bug was found by Coverity was in the SQLExecutor class of the json_server plugin. The _err variable wasn’t initialized in the constructor. After some careful auditing, I think this was actually a false positive as it was set to something before being used, but it was pretty simple to prevent future bugs by initializing it.

Two instances of the same warning: one turned up a bunch of code to delete (rather useful), and the other is a rather minor fix that may help someone in the future.

Coming up next: total embarrassment bugs.

Coverity scan for Drizzle

Coverity is a static analysis tool which, although proprietary itself, offers a free scanning service for free and open source software (which is great, by the way; I totally owe the people who make that happen a frosty beverage).

Prompted by someone volunteering to get MariaDB into the Coverity Scan, I realized that I hadn’t actually followed through with this for Drizzle. So, I went and submitted Drizzle. As a quick overview, here’s how many problems of each severity each project got back:

Severity    MariaDB    Drizzle
High            178         96
Medium         1020        495
Low              47         52

I don’t know what the numbers for MySQL may be, but it’d be great to see them out in the open too.

Being highly irresponsible (or, HOWTO DoS nearly all RDBMSs)

In my linux.conf.au 2013 talk, I had a big slide telling the audience how to do a simple Denial of Service attack against a MySQL server (post login). This is only one of many examples I could give, but I think it’s the simplest, and it only requires the mysql command line tool and a single command. FYI, this also applies to PostgreSQL but I’ll leave the specifics up to somebody else to write.

There is a fundamental flaw in just about all MVCC databases that leaves a giant Denial of Service attack hole. It is the following: START TRANSACTION WITH CONSISTENT SNAPSHOT followed by a bunch of waiting. Since the database server has to maintain this read view, InnoDB will continue to grow UNDO until it has to extend the ibdata1 file (system tablespace).
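
To make that concrete, here’s the entire attack as a sketch for the mysql command line tool (the SLEEP duration is an arbitrary example; simply leaving the session connected and idle works just as well):

    START TRANSACTION WITH CONSISTENT SNAPSHOT;
    -- ...and now do nothing. While this read view is open, purge cannot
    -- clean up old row versions, so UNDO keeps growing for every write
    -- the rest of the server does.
    DO SLEEP(86400);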

It’s important to remember that you cannot shrink the system tablespace (unlike with file-per-table, where you can just do an ALTER TABLE rebuild and any individual table suddenly finds itself a lot smaller).
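
For contrast, with innodb_file_per_table a bloated table can be rebuilt and its space handed back to the filesystem with something like this (some_table being a stand-in for whichever table has grown fat):

    ALTER TABLE some_table ENGINE=InnoDB;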

As UNDO grows, InnoDB will faithfully expand the system tablespace until it hits ENOSPC, and then everything will fall in a heap.

In theory, you could have a system tablespace that doesn’t auto-extend, but then you’re relying on code paths erroring out gracefully that I can pretty much bet are completely untested.

The only real way to avoid this is to do both of the following:

  1. Use the kill-idle-transactions feature from Percona Server.
  2. Have a script that checks for long-running transactions and just kills them (see the sketch below).
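
For the second item, here’s a minimal sketch of the check using INFORMATION_SCHEMA (available in MySQL 5.5 and later; the 30 minute threshold is just an example, tune to taste):

    -- Transactions open for more than 30 minutes, with the connection id
    -- you would pass to KILL:
    SELECT trx_mysql_thread_id, trx_started, trx_state
      FROM INFORMATION_SCHEMA.INNODB_TRX
     WHERE trx_started < NOW() - INTERVAL 30 MINUTE;

    -- Then, for each offender:
    -- KILL <trx_mysql_thread_id>;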

Similar issues affect just about any MVCC database system, and you’ll see much the same behaviour with file system and volume manager snapshots.

So is it highly irresponsible to point this out? Of course it isn’t; this should be pretty well known to most DBAs already, and so should a whole bunch of other things. Remember all the things you saw in production and then went to hit your developers over the head for? Well, they’re all in this same category.

Go run a giant UPDATE, DELETE or ALTER TABLE on a huge table in a replication setup and you’ll pretty much DoS your app, as nothing can get up-to-date read-only information from the slaves.
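
For example (table and column names invented for illustration), a single statement like this one becomes one enormous transaction that every slave has to apply in full before it can serve anything newer:

    UPDATE giant_table SET some_flag = 0;  -- rewrites every row in one go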

Considering that this is merely the tip of the iceberg of ways to DoS a database server, keeping post-authentication crashing bugs secret just seems… well… futile, even if you do accept security through obscurity as valid.

Further reading: