linux.conf.au 2016 Kernel miniconf CFP

Why yes, it’s another long URL thanks to Google Docs: https://docs.google.com/forms/d/148SieC6vmAxJZ3R5Lz5e1Mb0IM06LNSCt6WNVEwYFcs/viewform

Got a kernel topic you want to talk about? Got a kernel topic you want to start discussion on? Or a Q&A? Submit NOW! We’re going for part sessions, part unconference.

Questions? Contact me at stewart@linux.vnet.ibm.com

PAPR spec publicly available to download

PAPR is the Power Architecture Platform Reference document. It’s a short read at only 890 pages and defines the virtualised environment that guests run in on PowerKVM and PowerVM (i.e. what is referred to as the ‘pseries’ platform in the Linux kernel).

https://members.openpowerfoundation.org/document/dl/469

As part of the OpenPower Foundation, we’re looking at ensuring this is up to date and documents KVM specific things, as well as splitting out the bits that are common to OPAL and PAPR into their own documents.

OPAL firmware specification, conformance and documentation

Now that we have an increasing amount of things that run on top of OPAL:

  1. Linux
  2. hello_world (in skiboot tree)
  3. ppc64le_hello (as I wrote about yesterday)
  4. FreeBSD

and that the OpenPower ecosystem is rapidly growing (especially around people building OpenPower machines), the need for more formal specification, conformance testing and documentation for OPAL is increasing rapidly.

If you had looked at the documentation in the skiboot tree late last year, you’d have noticed a grand total of seven text files. Now, we’re a lot better (although far from complete).

I’m proud to say that I won’t merge new code that adds/modifies an OPAL API call or anything in the device tree that doesn’t come with accompanying documentation, and this has meant that although it may not be perfect, we have something that is a decent starting point.

We’re in the interesting situation of starting with a working system, with mainline Linux kernels now for over a year (maybe even 18 months) being able to be booted by skiboot and run on powernv hardware (the more modern the kernel the better though).

So…. if anyone loves going through deeply technical documentation… do I have a project you can contribute to!

gcov code coverage for OpenPower firmware

For skiboot (which provides the OPAL boot and runtime firmware for OpenPower machines), I’ve been pretty interested in getting some automated code coverage data for booting on real hardware (as well as in a simulator). Why? Well, it’s useful to see that various test suites are actually testing what you think they are, and it helps you define more tests to increase what you’re covering.

The typical way to do code coverage is to make GCC build your program with GCOV, which is pretty simple if you’re a userspace program. You build with gcov, run program, and at the end you’re left with files on disk that contain all the coverage information for a tool such as lcov to consume. For the Linux kernel, you can also do this, and then extract the GCOV data out of debugfs and get code coverage for all/part of your kernel. It’s a little bit more involved for the kernel, but not too much so.

To achieve this, the kernel has to implement a bunch of stub functions itself rather than link to the gcov library as well as parse the GCOV data structures that GCC generates and emit the gcda files in debugfs when read. Basically, you replace the part of the GCC generated code that writes the files out. This works really nicely as Linux has fancy things like a VFS and debugfs.

For skiboot, we have no such things. We are firmware, we don’t have a damn file system interface. So, what do we do? Write a userspace utility to parse a dump of the appropriate region of memory, easy! That’s exactly what I did, a (relatively) simple user space app to parse out the gcov gcda files from a skiboot memory image – something we can easily dump out of the simulator, relatively easily (albeit slower) from the FSP on an IBM POWER system and even just directly out of a running system (if you boot a linux kernel with the appropriate config).

So, we can now get a (mostly automated) code coverage report simply for the act of booting to petitboot: https://open-power.github.io/skiboot/boot-coverage-report/ along with our old coverage report which was just for the unit tests (https://open-power.github.io/skiboot/coverage-report/). My current boot-coverage-report is just on POWER7 and POWER8 IBM FSP based systems – but you can see that a decent amount of code both is (and isn’t) touched simply from the act of booting to the bootloader.

The numbers we get are only approximate for any code run on more than one CPU as GCC just generates code that does a load/add/store rather than using an atomic increment.
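To see why the counts end up approximate, here is a minimal sketch that replays a counter increment as separate load, add and store steps on two simulated CPUs. It is not GCC’s generated code, and the names (sim_cpu, run_serialised, run_interleaved) are made up for illustration. When both CPUs load before either stores, one of the two increments is lost.

```c
#include <stdint.h>

/* One simulated CPU; reg stands in for the register used by "counter++". */
struct sim_cpu {
    uint64_t reg;
};

/* The three steps a plain (non-atomic) increment is made of. */
static void step_load(struct sim_cpu *cpu, const uint64_t *counter) { cpu->reg = *counter; }
static void step_add(struct sim_cpu *cpu) { cpu->reg += 1; }
static void step_store(const struct sim_cpu *cpu, uint64_t *counter) { *counter = cpu->reg; }

/* Each CPU finishes its whole increment before the other starts: nothing is lost. */
uint64_t run_serialised(uint64_t rounds)
{
    uint64_t counter = 0;
    struct sim_cpu a = { 0 }, b = { 0 };

    for (uint64_t i = 0; i < rounds; i++) {
        step_load(&a, &counter); step_add(&a); step_store(&a, &counter);
        step_load(&b, &counter); step_add(&b); step_store(&b, &counter);
    }
    return counter;
}

/* Both CPUs load the same old value before either stores: one increment per round is lost. */
uint64_t run_interleaved(uint64_t rounds)
{
    uint64_t counter = 0;
    struct sim_cpu a = { 0 }, b = { 0 };

    for (uint64_t i = 0; i < rounds; i++) {
        step_load(&a, &counter);
        step_load(&b, &counter);
        step_add(&a);
        step_add(&b);
        step_store(&a, &counter);
        step_store(&b, &counter);
    }
    return counter;
}
```

With 1000 rounds, run_serialised() counts 2000 increments while run_interleaved() only counts 1000. On real hardware the interleaving is a matter of timing rather than being forced like this, so the loss is usually far smaller.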

One interesting observation was that (at least on smaller systems, which are still quite large by many people’s standards), boot time was not really noticeably increased.

For more information on running with gcov, see the in-tree documentation: https://github.com/open-power/skiboot/blob/master/doc/gcov.txt

Going beyond 1.3 MILLION SQL Queries/second

So, on a large IBM POWER8 system I was recently running the newly coined “yesmark” benchmark, which is best translated as this:

Benchmark (N for concurrency): for i in {1..N}; do yes "DO 0;" | mysql > /dev/null & done
Live results: mysqladmin -ri 1 extended-status | grep Questions

Which sounds all fun until you realize that it’s *amazingly* close in results to a sysbench point select benchmark these days (well, with MySQL 5.7.7).

Since yesmark doesn’t use InnoDB though, MariaDB is back in the game.

I don’t think it matters between MariaDB and MySQL at this point for yesmark. With MySQL in a KVM guest on a shared 2 socket POWER8 I could get 754kQPS and on a larger system, I could get 1.3 million / sec.

1.3 Million queries / sec is probably the highest number anybody has ever seen out of MySQL or MariaDB, so that’s fairly impressive in itself.

What’s also impressive is that on this workload, mysqld was still only using 50% of CPU in the system. The mysql command line client was a really heavy user.

Other users are: 8% completely idle, another 12% in linux scheduler (alarmingly high really). So out of all execution time, only about 44% spent in mysqld, 29% in mysql client.

It seems that the current issues scaling to two socket POWER8 machines are the same as with scaling to other large systems: when we go beyond about 20 POWER8 cores (SMT8), we start to find new and interesting challenges.

MySQL 5.7.5 on POWER – thread priority

Good news everyone!

MySQL 5.7.5 is out with a bunch more patches for running well on POWER in the tree. I haven’t yet gone and tried it all out, but since I’m me, I look at the bugs database and git/bzr history first.

On Intel CPUs, when you’re spinning on a spin lock, you’re meant to execute the PAUSE CPU instruction. This tells the CPU that other execution threads in the same core should be given priority as you are currently not doing anything productive. Without this, you’re likely going to hurt on hyperthreaded CPUs.

In MySQL, there are custom spinlocks in order to do interesting adaptive mutex things to attempt to squeeze the most performance possible out of modern systems.

One of the (not 100% ready, but close) bugs with patches I submitted against MySQL 5.7 was for using the equivalent of the PAUSE instruction for POWER CPUs. On POWER, we’re a bit different: you can actually set priorities of threads (which may matter more, as POWER8 CPUs can be in SMT8 mode – where there are *eight* executing threads per core).

So, the good news is that in MySQL 5.7.5, the magic instructions for setting thread priority are in! This should mean great things for performance on POWER systems with any of the SMT modes enabled.
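As a rough illustration of the idea (this is not the MySQL patch, and every demo_ name here is hypothetical), a spin lock can lower its hardware thread priority while it waits on POWER and execute PAUSE on x86. The sketch assumes GCC or clang; the "or 1,1,1" and "or 2,2,2" forms are the low and medium priority nops that the Linux kernel uses for HMT_low() and HMT_medium().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Entering a spin: on POWER, drop this hardware thread to low priority. */
static inline void demo_spin_wait_begin(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("or 1,1,1" ::: "memory");
#endif
}

/* Leaving a spin: on POWER, return to medium (normal) priority. */
static inline void demo_spin_wait_end(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("or 2,2,2" ::: "memory");
#endif
}

/* One turn of the spin loop: on x86, tell the core we are only waiting. */
static inline void demo_spin_wait_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

typedef struct {
    atomic_flag held;
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was free and is now ours. */
bool demo_spin_trylock(demo_spinlock *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire);
}

void demo_spin_lock(demo_spinlock *lock)
{
    if (demo_spin_trylock(lock))
        return;                 /* uncontended: never touch the priority */

    demo_spin_wait_begin();
    while (!demo_spin_trylock(lock))
        demo_spin_wait_hint();
    demo_spin_wait_end();
}

void demo_spin_unlock(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

On any other architecture the hints compile to nothing and this is just a plain test-and-set lock. A real adaptive mutex would also give up spinning after a while and block instead.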

The next interesting part of this is how it interacts with other KVM guests on a system. At least on POWER (and on x86 as well, although I won’t go into details here) there’s a hypervisor call that a guest can make saying “hey, I’m spinning here, perhaps you want to make sure other vcpus execute so that at some point I can continue”. On POWER, this is the H_CONFER hcall, where you can basically do a directed yield to another vcpu (the one that holds the lock you’re trying to get is a good idea).

Generally though, it’s only the guest kernel that does this, not userspace. You can see the H_CONFER call in __spin_yield(arch_spinlock_t*) and __rw_yield(arch_rwlock_t*) in arch/powerpc/lib/locks.c in the kernel.

It would be interesting to see what extra we could get out of a system running multiple guests with MySQL servers if InnoDB/MySQL could properly yield to the right vcpu (well, thread I guess).

Tyan OpenPower

Good news everyone! Tyan has announced the availability of their first OpenPOWER system! They call this a Customer Reference System, which means it’s an excellent machine to start poking at OpenPower and POWER8 (or deploying applications on).

Because it’s an OpenPower machine, it runs the open source Open Power firmware (all up on github) and will happily run Linux (feel free to port your other operating system kernels). I’ll be writing more on the OpenPower firmware soon as, well, technical details are fun!

Ubuntu 14.10 is listed as recommended as not only have they been building for POWER8 but have spent some time ensuring things work fairly well out-of-the-box (both as a KVM guest and running native on the bare metal). Or, you can always just boot whatever the mainline kernel is at – build for the POWERNV (POWER non-virtualized) platform (be sure to include all the required drivers) and have fun!

OpenPower firmware up on github!

With the whole OpenPower thing, a lot of low level firmware is being open sourced, which is really exciting for the platform – the less proprietary code sitting in memory the better in my books.

If you go to https://github.com/open-power you’ll see code for a bunch of the low level firmware for OpenPower and POWER8.

Hostboot is the bit of code that brings up the CPU and skiboot both sets up hardware and provides runtime services to Linux (such as talking to the service processor, if one is present).

Patches to https://github.com/open-power/skiboot/blob/master/doc/overview.txt are (of course) really quite welcome. It shouldn’t be too hard to get your head around the basics.

To see the Linux side of the OPAL interface, go check out linux/arch/powerpc/platforms/powernv – there you can see how we ask OPAL to do things for us.

If you buy a POWER8 system from IBM running PowerKVM you’re running this code.

ZFS: could have been the future of UNIX Filesystems

There was a point a few years ago where Sun could have had the next generation UNIX filesystem. It was in Solaris (and people were excited), there was a port to MacOS X (that was quite exciting for people) and there were a couple of ways to run it on Linux (and people were excited). So… instead of the fractured landscape of ext3, HFS+ and (the various variations of) UFS we could have had one file system that was common between all of the commonly used UNIX-like variants. Think of being able to use a file system on a removable drive that isn’t FAT and being able to take it from machine to machine (well… Windows would be a problem, but it always is).

There was some really great work done in OpenSolaris with integration between the file manager and ZFS snapshots (a slider bar to browse the history of a directory, an idea I’ve championed for over a decade now, although the Sun implementation was likely completely independently developed). The integration with the package manager was also completely awesome, crash safe upgrades!

However, all this is pretty much moot. Solaris is used by fewer people than ever, it’s out of OS X and BTRFS is going to take the place that ZFS could have held in the Linux world. So, unfortunately, ZFS is essentially dead. This is a shame…. it could have been something huge.

Does linux fallocate() zero-fill?

In an email discussion about pre-allocating binlogs for MySQL (something we’ll likely have to do for Drizzle and replication), Yoshinori brought up the excellent point that in some situations you don’t want to be doing zero-fill, as getting up and running quickly is the most important thing.

So what does Linux do? Does it zero-fill, or behave sensibly and pre-allocate quickly?

Let’s look at the kernel:

Inside the fallocate implementation (fs/open.c):

    if (inode->i_op->fallocate)
            ret = inode->i_op->fallocate(inode, mode, offset, len);
    else
            ret = -EOPNOTSUPP;

and for ext4:

    /*
     * currently supporting (pre)allocate mode for extent-based
     * files _only_
     */
    if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
            return -EOPNOTSUPP;

XFS has always done fast pre-allocate, so it’s not a problem there and the only other filesystems to currently support fallocate are btrfs and ocfs2 – which we don’t even have to start worrying too much about yet :)

But this is just kernel behaviour – I *think* libc ends up wrapping it, treating an EOPNOTSUPP from the kernel as “let me zero-fill” (which might be useful to check). Anybody want to check libc for me?
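One way to poke at the libc side from userspace is a small sketch like the one below. It assumes Linux with glibc and a writable /tmp; the struct and function names are made up for this example. It calls posix_fallocate(3), the libc wrapper, on a scratch file and reports the resulting size and block count. As far as I can tell from the posix_fallocate(3) man page, glibc tries the fallocate system call first and falls back to writing into each block when the filesystem reports EOPNOTSUPP, so from the caller’s point of view both paths look the same apart from how long they take.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct prealloc_result {
    int error;          /* 0 on success, otherwise an errno value */
    off_t size;         /* st_size after preallocation */
    long long blocks;   /* st_blocks (512-byte units) after preallocation */
};

/* Preallocate len bytes in a scratch file under /tmp and report what we got. */
struct prealloc_result preallocate_temp_file(off_t len)
{
    struct prealloc_result r = { 0, 0, 0 };
    char path[] = "/tmp/prealloc-demo-XXXXXX";
    struct stat st;
    int fd = mkstemp(path);

    if (fd < 0) {
        r.error = errno;
        return r;
    }
    unlink(path);       /* the file goes away once fd is closed */

    /* posix_fallocate returns the error number directly; it does not set errno. */
    r.error = posix_fallocate(fd, 0, len);
    if (r.error == 0) {
        if (fstat(fd, &st) == 0) {
            r.size = st.st_size;
            r.blocks = (long long)st.st_blocks;
        } else {
            r.error = errno;
        }
    }
    close(fd);
    return r;
}
```

Timing a call such as preallocate_temp_file(1 << 30) on different filesystems should make it fairly obvious which ones are taking the zero-fill path.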

This was all on slightly post 2.6.30-rc3 (from git: 8c9ed899b44c19e81859fbb0e9d659fe2f8630fc)

default filesystem and disk parameters are for wusses

I can’t remember the last time i used default mkfs or mount options… oh yeah, that’s right – by accident.

Anyway… I did a little experiment today.

The filesystem is my laptop /home – XFS, 100GB, 95% used (so 5-6GB free), rather aged. This is where a lot of my MySQL development is done. Mkfs options: 128MB log, version2 log. Mount options: logbufs=8, logbsize=256k. All of this geared towards increasing metadata performance.

Why metadata performance? well… source code trees are a lot of metadata :)

So, let’s try some things: cloning a repository and then removing the repository.

Two variables are being tested: mounting the file system with nobarrier (or barrier, the default). Write barriers tell the disk to ensure write order to the platter when write cache is in use. Also testing disabling (or enabling, the default) the disk write cache.

[Figure: cloneperf1.png (repository clone timings)]

[Figure: rmperf1.png (repository removal timings)]

NOTE: the last option which has the write cache enabled and write barriers disabled is NOT SAFE. If your machine crashes, you lose data, and potentially your file system ends up corrupted.

So I’m now disabling my disk write cache and mounting with nobarrier.

If you use real disk arrays – e.g. battery backed write cache RAID boxes, the story is likely very different!

Larger inodes make for (some) happy apps

Mikal talks about Ted talking about Tridge talking about how larger inodes can improve samba4 performance. Well, not just Samba4. Beagle and SELinux are also common heavy users of extended attributes which can often be stored inside the inode (e.g. on XFS).

There used to be the case where the Fedora installer would run mkfs.xfs with the default options and enable SELinux. Turns out this setup is great for systems without SELinux but the xattrs were large enough to require more space than what could fit in the inode, causing an extra block per inode to be allocated for xattrs. Not exactly space efficient.

The same can happen with Beagle. So if you’re using SELinux and/or Beagle and/or Samba4 – large inodes are probably going to be a winner for you.

I think we’re getting to the time where xattrs are popping up here there and everywhere for all sorts of applications and we’re going to have to find good (and efficient) ways of storing them.

I’m increasingly warming up to the idea of variable sized inodes. For a FS like XFS this could be done per group of inodes (XFS typically will, when more inodes are needed, create 64 inodes at once). File systems such as ext3 don’t really have this option as there is an inode table that is fixed and created at mkfs time. Although Ted has some interesting ideas for ext4 in this regard.

I’m sure Val Henson would have some interesting ideas for ChunkFS too…  a very interesting concept that I’ve been thinking about the possibility of retrofitting into existing systems (which I don’t think is that silly).

It would be great for some of the XFS dudes to write about the parallelising of checking an XFS file system.

CREATE, INSERT, SELECT, DROP benchmark

Inspired by PeterZ’s Opening Tables scalability post, I decided to try a little benchmark. This benchmark involved the following:

  • Create 50,000 tables
  • CREATE TABLE t{$i} (i int primary key)
  • Insert one row into each table
  • select * from each table
  • drop each table

    I wanted to test file system impact on this benchmark. So, I created a new LVM volume, 10GB in size. I extracted a ‘make bin-dist’ of a recent MySQL 5.1 tree, did a “mysql-test-run.pl --start-and-exit” and ran my script, timing real time with time.
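The script itself isn’t shown here, so the following is only a sketch of what a workload generator for the steps above could look like. The function names, the inserted value and the phase-by-phase ordering (all CREATEs, then all INSERTs and so on) are assumptions, not the original script.

```c
#include <stdio.h>

enum bench_phase { PHASE_CREATE, PHASE_INSERT, PHASE_SELECT, PHASE_DROP };

/* Format the statement for table number i in the given phase.
 * Returns the statement length as reported by snprintf, or -1 for an unknown phase. */
int format_statement(char *buf, size_t len, enum bench_phase phase, int i)
{
    switch (phase) {
    case PHASE_CREATE:
        return snprintf(buf, len, "CREATE TABLE t%d (i int primary key);", i);
    case PHASE_INSERT:
        /* The value inserted is an assumption; the benchmark only says "one row". */
        return snprintf(buf, len, "INSERT INTO t%d VALUES (1);", i);
    case PHASE_SELECT:
        return snprintf(buf, len, "SELECT * FROM t%d;", i);
    case PHASE_DROP:
        return snprintf(buf, len, "DROP TABLE t%d;", i);
    }
    return -1;
}

/* Write the whole workload to out, one phase at a time.
 * Returns the number of statements written, or -1 on error. */
long emit_workload(FILE *out, int ntables)
{
    static const enum bench_phase phases[] = {
        PHASE_CREATE, PHASE_INSERT, PHASE_SELECT, PHASE_DROP
    };
    char stmt[128];
    long emitted = 0;

    for (size_t p = 0; p < sizeof(phases) / sizeof(phases[0]); p++) {
        for (int i = 0; i < ntables; i++) {
            if (format_statement(stmt, sizeof(stmt), phases[p], i) < 0)
                return -1;
            fprintf(out, "%s\n", stmt);
            emitted++;
        }
    }
    return emitted;
}
```

Calling emit_workload(stdout, 50000) and piping the output into the mysql client under time would approximate the run described, although timing each phase separately gives the per-phase rates.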

    For a default ext3 file system creating MyISAM tables, the test took 15min 8sec.

    For a default xfs file system creating MyISAM tables, the test took 7min 20sec.

    For an XFS file system with a 100MB Version 2 log creating MyISAM tables, the test took 7min 32sec – which is within repeatability of the default XFS file system. So log size and version made no real difference.

    For a default reiserfs (v3) file system creating MyISAM tables, the test took 9m 44sec.

    For a ext3 file system with the dir_index option enabled creating MyISAM tables, the test took 14min 21sec.

    For an approximate measure of the CREATE performance…. ext3 and reiserfs averaged about 100 tables/second (although after the 20,000 mark, reiserfs seemed to speed up a little). XFS  averaged about 333 tables/second. I credit this to the check for if the files exist being performed by a b-tree lookup in XFS once the directory reached a certain size.

    Interestingly, DROPPING the tables was amazingly fast on ext3 – about 2500/sec. XFS about 1000/sec. So ext3 can destroy easier than it can create while XFS keeps up to speed with itself.

    What about InnoDB tables? Well…

    ext3(default): 21m 11s

    xfs(default): 12m 48s

    ext3(dir_index): 21m 11s

    Interestingly the create rate for XFS was around 500 tables/second – half that of MyISAM tables.

    These are interesting results for those who use a lot of temporary tables or do lots of create/drop tables as part of daily life.

    All tests performed on a Western Digital 250GB 7200rpm drive in a 2.8GHz 800MHz FSB P4 with 2GB memory running Ubuntu 6.10 with HT enabled.

    At the end of the test, the ibdata1 file had grown to a little over 800MB – still enough to fit in memory. If we increased this to maybe 200,000 tables (presumably about a 3.2GB file) that wouldn’t fit in cache, then the extents of XFS would probably make it perform better when doing INSERT and SELECT queries as opposed to the list of blocks that ext3 uses. This is because the Linux kernel caches the mapping of in memory block to disk block lookup making the efficiency of this in the file system irrelevant for data sets less than memory size.
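The 3.2GB guess is plain linear scaling from the observed file size. A tiny sketch of the arithmetic (helper names are made up, and 800MB is rounded down from “a little over 800MB”):

```c
#include <stdint.h>

/* Average tablespace bytes consumed per table in the observed run. */
uint64_t bytes_per_table(uint64_t observed_bytes, uint64_t observed_tables)
{
    if (observed_tables == 0)
        return 0;
    return observed_bytes / observed_tables;
}

/* Linear projection of the tablespace size for a larger number of tables. */
uint64_t projected_bytes(uint64_t observed_bytes, uint64_t observed_tables,
                         uint64_t target_tables)
{
    return bytes_per_table(observed_bytes, observed_tables) * target_tables;
}
```

With 800,000,000 bytes over 50,000 tables that is 16,000 bytes per table, and 200,000 tables projects to 3,200,000,000 bytes. This assumes growth stays linear, which the test above did not check.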

    So go tell your friends: XFS is still the coolest kid on the block.

    Disk allocation, XFS, NDB Disk Data and more…

    I’ve talked about disk space allocation previously, mainly revolving around XFS (namely because it’s what I use, a sensible choice for large file systems and large files and has a nice suite of tools for digging into what’s going on).

    Most people write software that just calls write(2) (or libc things like fwrite or fprintf) to do file IO – including space allocation. Probably 99% of file IO is fine to do like this and the allocators for your file system get it mostly right (some more right than others). Remember, disk seeks are really really expensive so the less you have to do, the better (i.e. fragmentation==bad).

    I recently (finally) wrote my patch to use the xfsctl to get better allocation for NDB disk data files (datafiles and undofiles).
    patch at:
    http://lists.mysql.com/commits/15088

    This actually ends up giving us a rather nice speed boost in some of the test suite runs.

    The problem is:
    – two cluster nodes on 1 host (in the case of the mysql-test-run script)
    – each node has a complete copy of the database
    – ALTER TABLESPACE ADD DATAFILE / ALTER LOGFILEGROUP ADD UNDOFILE creates files on *both* nodes. We want to zero these out.
    – files are opened with O_SYNC (IIRC)

    The patch I committed uses XFS_IOC_RESVSP64 to allocate (unwritten) extents and then posix_fallocate to zero out the file (the glibc implementation of this call just writes zeros out).

    Now, ideally it would be beneficial (and probably faster) to have XFS do this in kernel. Asynchronously would be pretty cool too.. but hey :)

    The reason we don’t want unwritten extents is that NDB has some realtime properties, and futzing about with extents and the like in the FS during transactions isn’t such a good idea.

    So, this would lead me to try XFS_IOC_ALLOCSP64 – which doesn’t have the “unwritten extents” warning that RESVSP64 does. However, with the two processes writing the files out, I get heavy fragmentation. Even with a RESVSP followed by ALLOCSP I get the same result.

    So it seems that ALLOCSP re-allocates extents (even if it doesn’t have to) and really doesn’t give you much (didn’t do too much timing to see if it was any quicker).

    I’ve asked if this is expected behaviour on the XFS list… we’ll see what the response is (i haven’t had time yet to go read the code… i should though).

    So what improvement does this patch make? well, i’ll quote my commit comments:

    BUG#24143 Heavy file fragmentation with multiple ndbd on single fs
    
    If we have the XFS headers (at build time) we can use XFS specific ioctls
    (once testing the file is on XFS) to better allocate space.
    
    This dramatically improves performance of mysql-test-run cases as well:
    
    e.g.
    number of extents for ndb_dd_basic tablespaces and log files
    BEFORE this patch: 57, 13, 212, 95, 17, 113
    WITH this patch  :  ALL 1 or 2 extents
    
    (results are consistent over multiple runs. BEFORE always has several files
    with lots of extents).
    
    As for timing of test run:
    BEFORE
    ndb_dd_basic                   [ pass ]         107727
    real    3m2.683s
    user    0m1.360s
    sys     0m1.192s
    
    AFTER
    ndb_dd_basic                   [ pass ]          70060
    real    2m30.822s
    user    0m1.220s
    sys     0m1.404s
    
    (results are again consistent over various runs)
    
    similar for other tests (BEFORE and AFTER):
    ndb_dd_alter                   [ pass ]         245360
    ndb_dd_alter                   [ pass ]         211632

    So what about the patch? It’s actually really tiny:

    
    --- 1.388/configure.in	2006-11-01 23:25:56 +11:00
    +++ 1.389/configure.in	2006-11-10 01:08:33 +11:00
    @@ -697,6 +697,8 @@
    sys/ioctl.h malloc.h sys/malloc.h sys/ipc.h sys/shm.h linux/config.h \
    sys/resource.h sys/param.h)
    
    +AC_CHECK_HEADERS([xfs/xfs.h])
    +
     #--------------------------------------------------------------------
    # Check for system libraries. Adds the library to $LIBS
    # and defines HAVE_LIBM etc
    
    --- 1.36/storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp	2006-11-03 02:18:41 +11:00
    +++ 1.37/storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp	2006-11-10 01:08:33 +11:00
    @@ -18,6 +18,10 @@
    #include
    #include
    
    +#ifdef HAVE_XFS_XFS_H
    +#include <xfs/xfs.h>
    +#endif
    +
     #include "AsyncFile.hpp"
    
    #include
    @@ -459,6 +463,18 @@
    Uint32 index = 0;
    Uint32 block = refToBlock(request->theUserReference);
    
    +#ifdef HAVE_XFS_XFS_H
    +    if(platform_test_xfs_fd(theFd))
    +    {
    +      ndbout_c("Using xfsctl(XFS_IOC_RESVSP64) to allocate disk space");
    +      xfs_flock64_t fl;
    +      fl.l_whence= 0;
    +      fl.l_start= 0;
    +      fl.l_len= (off64_t)sz;
    +      if(xfsctl(NULL, theFd, XFS_IOC_RESVSP64, &fl) < 0)
    +        ndbout_c("failed to optimally allocate disk space");
    +    }
    +#endif
     #ifdef HAVE_POSIX_FALLOCATE
    posix_fallocate(theFd, 0, sz);
    #endif

    So get building your MySQL Cluster with the XFS headers installed and run on XFS for sweet, sweet disk allocation.

    Twinhan USB DTV dongle not working :(

    So after doing some research (read: using search engines with linux + product name), I came to the conclusion that a Twinhan USB2.0 DVB dongle would be the dongle for me. Yes – it’s small, compact and does digital tv without requiring a non-existent free PCI slot in my Shuttle MythTV box.

    Having had great success with my last bit of new hardware (a really cheap Logitech QuickCam Express or something) – plug it in and it “just works”. Oh Linux how you are better than Microsoft Windows for hardware usability!

    But this was not to be. It uses a vp7045 chipset, which has drivers both in Ubuntu 6.06 “Dapper” and in the latest v4l-dvb hg tree.

    But for the life of me I couldn’t get it to tune into any TV stations (for those of you who like using hardware and not just having expensive boxes around, you will appreciate how tuning into a TV station is rather important functionality for a TV card). So I started having a look around the interweb for possible answers.

    The best I could come up with was “are you sure you have all the cables plugged in” – yes, I was.

    So seeing as this is the first digital TV dongle in this house, I wondered if the signal just wasn’t getting here. I got a friend to bring around a spare digital set top box. It worked fine. Brilliantly in fact – it even worked with the shitty small antenna that came with the dongle. So it wasn’t an inability to receive.

    I then came across this post to the linux-dvb list titled “New VP7045 with TDA10046 instead of MT352 (was: VP7045 tuner doesn’t work)”. Which really does hint at the problem!

    I could be one of the lucky ones with a new revision that uses the TDA10046 instead of the MT352! (after getting some debug info from the card out of the driver – it was reporting itself as v1.02, so quite possible).

    Maybe time to hack the dvb driver for it? Things seem pretty modular, so it couldn’t be too hard, right?

    Well, the vp7045-fe.c file is the front end (well, what it assumes is the front end) for the vp7045.c dongle. So all I really need to do is to get it to use the tda10046 frontend (under frontends/tda1004x.c) instead of the vp7045-fe.c fe code.

    Well, it seems as though the tda10046 is an i2c device while the vp7045-fe isn’t. Hrrm… I’ve never really done much with i2c, so this’ll be fun!

    I’ve currently managed to hack the driver so that we do some things to do with the tda chip – although I haven’t gotten it detecting the i2c adapter – which means we’re never going to get a front end! (in fact, when you plug in the device with my modified driver you get a “no frontend detected” message from the kernel).

    I’ve tried poking on the #linuxtv channel on freenode to no avail – so it seems like I’m on my own for a bit.

    A good way to spend midnight until 3am though :)

    I’ll probably end up doing the same tonight. Why? Because it’s just so much fun.

    Oh, and if anybody has any pointers – it would be appreciated.

    I am, of course, assuming the hardware itself isn’t faulty. I have no MS Windows system around to test on.

    Arjen’s MySQL Community Journal – HyperThreading? Not on a MySQL server…

    I blame the Linux Process Scheduler. At least it’s better than the earlier 2.6 days where things would get shunted a lot from one “cpu” to the other “cpu” for no real reason.

    Newer kernel versions are probably better… but don’t even think of HT and pre-2.6 – that would be funny.

    DaveM on Ingo’s SMP lock validator

    DaveM talks about Ingo’s new SMP lock validator for linux kernel

    A note reminding me to go take a look and see what can be ripped out and placed into various bits of MySQL and NDB. Ideally, of course, it could be turned into a LD_PRELOAD for pthread mutexes.
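To make the idea concrete for myself, here is a toy sketch of lock-order checking. It says nothing about how Ingo’s validator is actually implemented; every name here is made up. It remembers which lock classes were held when another was taken, and complains when a later acquisition contradicts that order. An LD_PRELOAD shim would call something like checker_acquire() and checker_release() from wrapped pthread_mutex_lock() and pthread_mutex_unlock(), with per-thread held lists (this sketch is single threaded).

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCK_CLASSES 8
#define MAX_HELD 8

struct order_checker {
    /* before[a][b] is true if a was held while b was acquired. */
    bool before[MAX_LOCK_CLASSES][MAX_LOCK_CLASSES];
    int held[MAX_HELD];     /* lock classes currently held, oldest first */
    int nheld;
};

void checker_init(struct order_checker *c)
{
    memset(c, 0, sizeof(*c));
}

/* True if the recorded orderings contain a path from -> ... -> to. */
static bool reaches(const struct order_checker *c, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCK_CLASSES; next++)
        if (c->before[from][next] && !seen[next] && reaches(c, next, to, seen))
            return true;
    return false;
}

/* Note that lock class cls is being acquired.
 * Returns false if that contradicts an ordering seen earlier (or cls is invalid). */
bool checker_acquire(struct order_checker *c, int cls)
{
    bool ok = true;

    if (cls < 0 || cls >= MAX_LOCK_CLASSES)
        return false;

    for (int i = 0; i < c->nheld; i++) {
        bool seen[MAX_LOCK_CLASSES] = { false };
        /* We hold held[i] and want cls; if cls is already known to come
         * before held[i], two threads could deadlock on this pair. */
        if (reaches(c, cls, c->held[i], seen))
            ok = false;
    }
    if (ok)
        for (int i = 0; i < c->nheld; i++)
            c->before[c->held[i]][cls] = true;
    if (c->nheld < MAX_HELD)
        c->held[c->nheld++] = cls;
    return ok;
}

/* Note that lock class cls has been released. */
void checker_release(struct order_checker *c, int cls)
{
    for (int i = c->nheld - 1; i >= 0; i--) {
        if (c->held[i] == cls) {
            memmove(&c->held[i], &c->held[i + 1],
                    (size_t)(c->nheld - i - 1) * sizeof(c->held[0]));
            c->nheld--;
            return;
        }
    }
}
```

Taking A then B once and later taking B then A makes the second checker_acquire() of A return false, even though no deadlock happened in that run. That is the attraction: the ordering bug is reported without having to hit the race.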

    Anybody who wants to look deeper into it before I wake up again is welcome to (and tell me what they find)

    Beat on “state of the dolphin” (or: Why Software is never really ready until a .20 release)

    Beat Vontobel blogs about “fuþark: The silence of futhark and the state of the dolphin” which is basically about how he’s found that the 5.0.20 release of MySQL is when the 5.0 release is really starting to shine.

    This confirms my theory (that I’ve had for quite a while now… like years) that a software release is never really mature until it hits about .20 (that’s dot twenty, not dot two).

    When something reaches .10 (dot ten) it’s no longer going to be annoying for most uses, but .20 means that you’re going to be happy. Don’t ask me really why this is the case, but it is.

    Think about the 2.6 kernel (yes, Linux Kernel – honestly, you think i was talking about something else?). At about 2.6.10, it would no longer be a pain to use and get things going – everything was starting to be smooth. As we’re getting closer to .20, things are getting better too. Mind you, everything here does run 2.6 now (and so does my mum’s machine – which is always a good sign of something being ready). With 2.4 hitting .20 – you’d never even think about using 2.2, 2.4 was perfect (except when you wanted 2.6).

    GNOME (and everything attached to it) is getting to be a really good desktop – ever since about the 2.10 release I’ve been using just much more of the GNOMEy way of doing things because they’re actually getting useful and usable (don’t get me wrong, previous releases were good too – but a lot more things annoyed me). As the releases have progressed, I’m increasingly convinced that 2.20 will be the “we’re here” release. 2.14 is a lot better, but there’s still a bunch of stuff that has to be done before it’s totally kick-ass.

    There are no surprises in MySQL 4.0 (it’s past .20 – at .26 now). Everybody knows and trusts it. 4.1 is at 4.1.18 – which is about as good as a .20 and it’s a pretty happy release. But due to 4.0 being rather solid – a lot of people have just stuck there. We’re seeing a bunch move to 5.0 – but my theory is that this will be 5.0.20 or above. Hrrm… anybody see a pattern?

    MySQL 5.1 is at 5.1.10 (or so) and it’s stopped being annoying, and that great march towards a .20 is healthy and active.

    GCC 2.95 had a lot of respect for a very long time (now it’s just a bit old). Note that .95 is higher than .20 :)

    EMACS is at version 21, but ed is only at .2 (hrrm.. and which is used by more people as their editor i wonder).

    aptitude at 0.2.15 (getting to .20) – while apt is at 0.6.40 (above .20). RPM is only at 4.0.4 – so a bit to go there :)

    The version of postgresql is 7.5.9 over here… so getting to the .1 stage, but away from the .20. (now I’m going to watch comments fill up with postgresql guys going on about something, I just know it :) But there is 7.3.14 – a lot closer to .20!

    MythTV is at 0.19 – getting closer to the .20 release (it’s a lot better than even just a few releases ago).

    (versions here mostly taken from whatever ubuntu 5.04 has)

    Note that attempting to skip a whole bunch of versions and label your software 95, 98, 2003 or whatever doesn’t get you “.20” status. Neither does just skipping to “.20” automatically. It’s about hard work and removing annoying things (we tend to call them bugs).

    This is a really stupid metric of software maturity. It is, however, disturbingly accurate.

    really unstable laptop

    I’m currently getting hard crashes about five times a day.

    I thought it was the sound driver, as I got a crash during dist-upgrade (again) while on console and saw the backtrace. Basically looked like something bad happened when the sound was muted.

    So, running without sound muted – just turned down.

    Well, today, just crashed again. Since running X, no backtrace. ARRRGHHH.

    Also crashed when waking up too. ACPI stuff in the backtrace.

    Not a happy camper at the moment. I have work to do, not futzing around with trying to find out what the fuck is wrong with my laptop (probably software) when I should be running a stable system.

    I’ve already had to re-add all my liferea RSS feeds as liferea obviously isn’t doing the right thing (at least the version shipping with Ubuntu) regarding writing the feeds file to disk.

    So, I’m trying to prepare presentations for our DevConf on an incredibly buggy and almost unusable OpenOffice.org on an unstable laptop.

    I think I’m going to have wine again with lunch.