Posts Tagged ‘code’

A tale of a bug…

Thursday, July 22nd, 2010

So I sometimes get asked if we funnel back bug reports or patches back to MySQL from Drizzle. Also, MariaDB adds some interest here as they are a lot closer (and indeed compatible with) to MySQL. With Drizzle, we have deviated really quite heavily from the MySQL codebase. There are still some common areas, but they’re getting rarer (especially to just directly apply a patch).

Back in June 2009, while working on Drizzle at Sun, I found a bug that I knew would affect both. The patch would even directly apply (well… close, but I made one anyway).

So the typical process of me filing a MySQL bug these days is:

  • Stewart files bug
  • In the next window of Sveta being awake, it’s verified.

This happened within a really short time.

Unfortunately, what happens next isn’t nearly as awesome.

Namely, nothing. For a year.

So a year later, I filed it in launchpad for MariaDB.

So, MariaDB is gearing up for a release, it’s a relatively low priority bug (but it does have a working, correct and obvious patch), within 2 months, Monty applied it and improved the error checking around it.

So MariaDB bug 588599 is Fix Committed (June 2nd 2010 – July 20th 2010), MySQL Bug 45377 is still Verified (July 20th 2009 – ….).

(and yes, this tends to be a general pattern I find)

But Mark says he gets things through… so yay for him.2

Shocked and Stunned (that code exists and does work)

Saturday, January 9th, 2010
#define READ_ALL		1	/* openfrm: Read all parameters */
#define EXTRA_RECORD		8	/* Reservera plats f|r extra record */

and later on….

  if (prgflag & (READ_ALL+EXTRA_RECORD))
    records++;

Feel free to think about that for a second.

(I have an urge to add this to questions asked in a job interview…)

stringstream is completely useless (and why C++ should have a snprintf)

Tuesday, August 4th, 2009
  1. It’s easy to screw up thread safety.
    If you’re trying to format something for output (e.g. leading zeros, only 1 decimal place or whatever… you know, format specifiers in printf) you are setting a property on the stream, not on what you’re converting. So if you have a thread running that sets a format, adds something to the stream, and then unsets the format, you cannot have another thread able to come in and do something to that stream. Look out for thread unsafe cout code.
  2. You cannot use streams for any text that may need to be translated.
    gettext is what everybody uses. You cannot get a page into the manual before it tells you that translators may want to change the order of what you’re printing. This goes directly against stringstream.
  3. You need another reason? Number 2 rules it out for so much handling it’s not funny.

Progress in nofrm branch

Friday, January 30th, 2009

“Ban FRM Now!” branch in Launchpad

Now we’re reading part of the table information out of the proto file on disk instead of the frm.

Not everything (yet) but a bit. Good first steps. Had to fix bugs along the way as well (and find weirdness in FRM file format…).

Progress is being made.

magic number super fun happy time

Thursday, January 8th, 2009

umm…..

int Field_timestamp::store(double nr)
{
  int error= 0;
  if (nr < 0 || nr > 99991231235959.0)
  {
    set_datetime_warning(DRIZZLE_ERROR::WARN_LEVEL_WARN,
                         ER_WARN_DATA_OUT_OF_RANGE,
                         nr, DRIZZLE_TIMESTAMP_DATETIME);
    nr= 0;					// Avoid overflow on buff
    error= 1;
  }
  error|= Field_timestamp::store((int64_t) rint(nr), false);
  return error;
}

(likely the same in mysql as well… haven’t checked though). these date and time things scare me.

Drizzle progress… (testing can be good)

Tuesday, December 23rd, 2008

We’ve been working on fixing up the remaining test cases so that they run with Drizzle. We’ve found: bugs in Drizzle, bugs in MySQL (one that seems to have been there for at least 10 years), bugs in the tests, tests that no longer apply and occationally, something like this:


/* Please god, will someone rewrite this to be readable :( */
if (to->pack_length() == from->pack_length() &&
!(to->flags & UNSIGNED_FLAG && !(from->flags & UNSIGNED_FLAG)) &&
to->real_type() != DRIZZLE_TYPE_ENUM &&
(to->real_type() != DRIZZLE_TYPE_NEWDECIMAL || (to->field_length == from->field_length && (((Field_num*)to)->dec == ((Field_num*)from)->dec))) &&
from->charset() == to->charset() &&
to->table->s->db_low_byte_first == from->table->s->db_low_byte_first &&
(!(to->table->in_use->variables.sql_mode & (MODE_NO_ZERO_DATE | MODE_INVALID_DATES)) || (to->type() != DRIZZLE_TYPE_DATE && to->type() != DRIZZLE_TYPE_DATETIME)) &&
(from->real_type() != DRIZZLE_TYPE_VARCHAR || ((Field_varstring*)from)->length_bytes == ((Field_varstring*)to)->length_bytes))
{ // Identical fields
/* This may happen if one does 'UPDATE ... SET x=x' */
if (to->ptr != from->ptr)
memcpy(to->ptr,from->ptr,to->pack_length());
return 0;
}

and no, I haven’t really changed the formatting.

libmallocfail

Saturday, November 8th, 2008

Bazaar branches of libmallocfail

Simple LD_PRELOAD library that will take parameters via environment variables and cause malloc() to occationally fail.

Aim was to use this to test bits of MySQL/Drizzle although since their libtool based stuf, the binary in tree is a libtool shell script, and I haven’t found a way to LD_PRELOAD only for mysqld and not the shell script and the other processes spawned by it.

I have found a bug in libc though :)

MemberDB speed improvements

Friday, January 11th, 2008

So I finally installed the xdebug PHP extension and started doing some performance analysis of MemberDB using xdebug and kcachegrind. The upshot of which is a number of commits to the bzr tree that dramatically improve performance in several key areas. The answer? Caching.

I’m not even talking using memcached or caching things in database tables or anything like that – just about everything is still the same dynamically produced content as before, but I’m now caching some simple things avoiding many round-trips to the database while executing a script.

There were a few things that were taking a fair bit of execution time:

  1. The generation of the menu. In MemberDB, there’s a menu on the left. There’s also a powerful (read: non-trivial) permissions system allowing relatively fine grained granting of permissions. So, we need to check that the user has permission to go to the page before showing the page in the menu.
    Previously, for each item in the menu, we’d do a lookup to the database – checking if they have the permission or they are an admin. This ended up taking a bit of time – up to 30% of the time for the front page was taken up just generating the menu!
    So, now I cache the set of permissions for the user. One function to fetch it from the DB into a structure, another function to check the permissions of the user in that struct.
    While testing this, I actually used memcached to cache the menu to see how much of an improvement I could get… I’m about 69/70ths of the speed of using memcached with a purely PHP implementation caching the permissions info.
  2. Getting the information about a member is done in a variety of places. On some pages, you want information on the current logged in user (or just need to find their member ID). These are now cached for the duration of the script. Saved quite a few DB round trips
  3. When viewing an election (not the results, just the normal “view election” page that lists candidates), we need to get the membership information on a number of users (okay… so technically I should rewrite some of the queries to use joins in the DB… but this was easier). I now have a (limited) cache of membership info. So now, when a member has nominated multiple people, we only pull the member info out of the database once.
  4. Rewrite the “current_members” view. The old one was not as efficient as it could be. While the new one has slightly different semantics (can have duplicate rows, it turns out the use of DISTINCT was adding a bit of execution time, which for a bunch of queries is not needed) it’s significantly quicker.

I used the faithful Apache Bench (ab) to do benchmarks against the modified PHP code. I think the biggest improvement was the view election page which went from about 6seconds/page to 0.2seconds/page.

libeatmydata

Tuesday, November 27th, 2007

Following my successful linux.conf.au talk “Eat My Data: How Everybody Gets POSIX File I/O Wrong“, I started to feel the need to easily be able to have my data eaten.

Okay, not quite. However, when you’ve written your software properly, so it uses fsync() correctly, opening files with O_SYNC or whatever – tests take longer as you’re having to wait for things to hit the rust.

So….. LD_PRELOAD=libeatmydata.so to the rescue! With a POSIX compliant fsync() (that does nothing) and filtering on open(2), it can take your test run times down dramatically.

The only time you shouldn’t use it for your tests is when you end up crashing the machine to test durability (i.e. when the OS doesn’t have the opportunity to cleanly write out the data to disk).

See the libeatmydata project page: http://www.flamingspork.com/projects/libeatmydata/

and the bazaar repository: http://www.flamingspork.com/src/libeatmydata

(it’s seemed to have saved somewhere between 20 and 30% of the time for innodb/ndb tests in mysql-test-run).

SCM performance

Monday, June 4th, 2007

Linus is right when he talks about the performance of SCMs…. and that BitKeeper was about the first one to be worth using at all (really).

But as an interesting speed comparison… I’ve managed to pull the latest git (with git) and build it in less time than BitKeeper has taken to pull the latest NDB tree…. hrrm..

One of the reasons I’m so enjoying quilt for every day hacking is that it is blindingly fast.

adding a pluggable information schema table to a pluggable engine in mysql 5.1

Friday, April 27th, 2007

Also now up is the patch series in my “ndb-work” tree which small patch for adding INFORMATION_SCHEMA.NDB_NODE_STATUS. It’s nearly useful… I haven’t brought in the nice “id to string” functions in the management client that make pretty printing nice… so not quite end user friendly :)

But it’s a nice patch to learn how to add an INFORMATION_SCHEMA table in a pluggable engine and put some engine specific information in it.

(kudos to the falcon code… which i looked at on how to do it).

Doesn’t take long – this was completed in less than 2hrs while watching and paying attention to sessions…. so should take next to no time if you actually concentrate on it.

Of course, this totally abuses the purity of the information schema.

Experimental NDB Patches

Friday, April 27th, 2007

I’ve just put up the current “add node” patch… which is like, totally experimental and kills kittens… but could be interesting for people to have a look at as it progresses. Still lots of work before production ready – but people here at the MySQL Conf have said they’re interested in looking at the code for it.

You can grab a combined patch or the quilt series from:

http://saturn.flamingspork.com/~stewart/ndb-experimental/

Applies to 5.1… at least on a few weeks ago tree.

My all time favourite C++ Class

Thursday, November 2nd, 2006
class Guard {
public:
Guard(NdbMutex *mtx) : m_mtx(mtx) { NdbMutex_Lock(m_mtx); };
Guard(NdbLockable & l) : m_mtx(l.m_mutex) { NdbMutex_Lock(m_mtx);
};
~Guard() { NdbMutex_Unlock(m_mtx); };
private:
NdbMutex *m_mtx;
};

DaveM on Ingo’s SMP lock validator

Wednesday, May 31st, 2006

DaveM talks about Ingo’s new SMP lock validator for linux kernel

A note reminding me to go take a look and see what can be ripped out and placed into various bits of MySQL and NDB. Ideally, of course, it could be turned into a LD_PRELOAD for pthread mutexes.

Anybody who wants to look deeper into it before I wake up again is welcome to (and tell me what they find)