Are MariaDB tests adding anything extra over Oracle MySQL tests?

I grabbed all the tests introduced in MariaDB 5.5.32 (i.e. “bzr diff -rtag:mariadb-5.5.31..mariadb-5.5.32 mysql-test/” and some foo) and threw them in their own test file. I only kept tests for crashing bugs and ignored those that required plugins (there were two or three, but nothing major). So now I have a test file that should crash MariaDB 5.5.31 and probably before. But, the question is: does this crash Percona Server or MySQL?

While it is excellent to see the MariaDB guys including tests for their crashing bugs, are these MariaDB specific or do they affect other MySQL flavours?

I built a release build of top of trunk Percona Server and ran the test against it. I got no crashes. In a debug build, I got two. One was to do with REPAIR on an ARCHIVE table and the other was “SELECT UNIX_TIMESTAMP(STR_TO_DATE(‘2020′,’%Y’));”. I found the same thing for a debug build of top of tree MySQL.

All the other tests for crashing bugs, of which there were 14 – were MariaDB specific. So, out of 16 total, only 2 applied to Percona Server and MySQL.

Record autotest numbers for NDB

So, with a bunch of recent tests I added (and some bugs that have been fixed) we’re now consistently getting 203 or 204 passing tests. We’ve got typically around 8 or 9 that often fail – often because the test may be broken or not quite deterministic. Or there’s a bug… :)

(all numbers for the daily-basic list of tests for various 5.1 branches).

It would be great to hit 300 by this time next year… which means a lot of test cases… hrrm… anybody want to volunteer?

Code size of an engine versus test suite

If you count the lines of code in the MySQL Cluster (NDB) test suite (mysql-5.1/storage/ndb/test – and exclude the old ODBC stuff) you come up with about 104000 lines of code. This is in contrast to the approximate other 350,000 lines of code for the NDB engine (excluding the handler, which is an additional 12,000 lines – this isn’t tested much by the NDB test suite… is meant to take care of a lot of that).

If you go and check the MyISAM tree, it’s only 40545 lines of code – for the entire engine. That’s right, the MySQL Cluster test suite is about 2.5 times the size of MyISAM.

If you look at tests, which are just lists of SQL commands with static data, it comes up at 250,000 lines (that excludes result files). The NDB tests do things programmatically – so can generate large amounts of data and different loads quite easily.

The architecture of the NDB tests (commonly referred to as autotest, ATRT or HUGO framework) is very different from – it easily allows you to write a test that is high on concurrency, high on load and high on amount of data. It also is modular, so that when you get an issue from a customer (or need to do some benchmarking on a speficic type of schema) you can use the utility programs to help you (e.g. there’s one that does random PK updates to tables, one that does scans, one that does index operations etc).

There’s this whole bunch of things you just cannot do with

Then we get to fault injection… MySQL Cluster is a distributed system that is designed to withstand failure. Without testing this, we can never say it’s remotely HA. So we test it. A lot. We inject failures into nodes to check our node failure handling, using the utility programs and some basic shell it’s possible to do custom tests (such as multi-node failure)  where our test suite doesn’t have the best coverage yet.

Again, either not possible or extremely hard with

mysql_slap is the hint of a nice utility to help in testing… but using it in scripts in a verifyable way (i.e. check what came out is what went in, using a variety of access methods – full table scans, pk scans, index scans, stored procs, cursors, views, joins etc) is tricky at best (but really impossible).

Yes, I’m really pining for a better test suite infrastructure for the MySQL Server – it can only lead to better quality software…. almost somebody just rewriting a bunch of the hugo classes to use the MySQL C API would be useful.

Disk allocation, XFS, NDB Disk Data and more…

I’ve talked about disk space allocation previously, mainly revolving around XFS (namely because it’s what I use, a sensible choice for large file systems and large files and has a nice suite of tools for digging into what’s going on).Most people write software that just calls write(2) (or libc things like fwrite or fprintf) to do file IO – including space allocation. Probably 99% of file io is fine to do like this and the allocators for your file system get it mostly right (some more right than others). Remember, disk seeks are really really expensive so the less you have to do, the better (i.e. fragmentation==bad).

I recently (finally) wrote my patch to use the xfsctl to get better allocation for NDB disk data files (datafiles and undofiles).
patch at:

This actually ends up giving us a rather nice speed boost in some of the test suite runs.

The problem is:
– two cluster nodes on 1 host (in the case of the mysql-test-run script)
– each node has a complete copy of the database
– ALTER TABLESPACE ADD DATAFILE / ALTER LOGFILEGROUP ADD UNDOFILE creates files on *both* nodes. We want to zero these out.
– files are opened with O_SYNC (IIRC)

The patch I committed uses XFS_IOC_RESVSP64 to allocate (unwritten) extents and then posix_fallocate to zero out the file (the glibc implementation of this call just writes zeros out).

Now, ideally it would be beneficial (and probably faster) to have XFS do this in kernel. Asynchronously would be pretty cool too.. but hey :)

The reason we don’t want unwritten extents is that NDB has some realtime properties, and futzing about with extents and the like in the FS during transactions isn’t such a good idea.

So, this would lead me to try XFS_IOC_ALLOCSP64 – which doesn’t have the “unwritten extents” warning that RESVSP64 does. However, with the two processes writing the files out, I get heavy fragmentation. Even with a RESVSP followed by ALLOCSP I get the same result.

So it seems that ALLOCSP re-allocates extents (even if it doesn’t have to) and really doesn’t give you much (didn’t do too much timing to see if it was any quicker).

I’ve asked if this is expected behaviour on the XFS list… we’ll see what the response is (i haven’t had time yet to go read the code… i should though).

So what improvement does this patch make? well, i’ll quote my commit comments:

BUG#24143 Heavy file fragmentation with multiple ndbd on single fs

If we have the XFS headers (at build time) we can use XFS specific ioctls
(once testing the file is on XFS) to better allocate space.

This dramatically improves performance of mysql-test-run cases as well:

number of extents for ndb_dd_basic tablespaces and log files
BEFORE this patch: 57, 13, 212, 95, 17, 113
WITH this patch  :  ALL 1 or 2 extents

(results are consistent over multiple runs. BEFORE always has several files
with lots of extents).

As for timing of test run:
ndb_dd_basic                   [ pass ]         107727
real    3m2.683s
user    0m1.360s
sys     0m1.192s

ndb_dd_basic                   [ pass ]          70060
real    2m30.822s
user    0m1.220s
sys     0m1.404s

(results are again consistent over various runs)

similar for other tests (BEFORE and AFTER):
ndb_dd_alter                   [ pass ]         245360
ndb_dd_alter                   [ pass ]         211632

So what about the patch? It’s actually really tiny:

--- 1.388/	2006-11-01 23:25:56 +11:00
+++ 1.389/	2006-11-10 01:08:33 +11:00
@@ -697,6 +697,8 @@
sys/ioctl.h malloc.h sys/malloc.h sys/ipc.h sys/shm.h linux/config.h \
sys/resource.h sys/param.h)

# Check for system libraries. Adds the library to $LIBS
# and defines HAVE_LIBM etc

--- 1.36/storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp	2006-11-03 02:18:41 +11:00
+++ 1.37/storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp	2006-11-10 01:08:33 +11:00
@@ -18,6 +18,10 @@

+#ifdef HAVE_XFS_XFS_H
 #include "AsyncFile.hpp"

@@ -459,6 +463,18 @@
Uint32 index = 0;
Uint32 block = refToBlock(request->theUserReference);

+#ifdef HAVE_XFS_XFS_H
+    if(platform_test_xfs_fd(theFd))
+    {
+      ndbout_c("Using xfsctl(XFS_IOC_RESVSP64) to allocate disk space");
+      xfs_flock64_t fl;
+      fl.l_whence= 0;
+      fl.l_start= 0;
+      fl.l_len= (off64_t)sz;
+      if(xfsctl(NULL, theFd, XFS_IOC_RESVSP64, &fl) < 0)
+        ndbout_c("failed to optimally allocate disk space");
+    }
posix_fallocate(theFd, 0, sz);

So get building your MySQL Cluster with the XFS headers installed and run on XFS for sweet, sweet disk allocation.

DaveM on Ingo’s SMP lock validator

DaveM talks about Ingo’s new SMP lock validator for linux kernel

A note reminding me to go take a look and see what can be ripped out and placed into various bits of MySQL and NDB. Ideally, of course, it could be turned into a LD_PRELOAD for pthread mutexes.

Anybody who wants to look deeper into it before I wake up again is welcome to (and tell me what they find)