Video of my Percona Live Talk: Why would I run MySQL/MariaDB on POWER anyway?

Good news everyone! There’s video up for the talk I gave at Percona Live in April 2016: Why would I run MySQL/MariaDB on POWER anyway?

The talk is a general overview of POWER and why MySQL/MariaDB may be a good fit.

MariaDB & Trademarks, and advice for your project

I want to emphasize this for those who have not spent time near trademarks: trademarks are trouble and another one of those things where no matter what, the lawyers always win. If you are starting a company or an open source project, you are going to have to spend a whole bunch of time with lawyers on trademarks or you are going to get properly, properly screwed.

MySQL AB always held the trademark for MySQL. There’s this strange thing with trademarks and free software, where while you can easily say “use and modify this code however you want” and retain copyright on it (for, say, selling your own version of it), this does not translate too well to trademarks as there’s a whole “if you don’t defend it, you lose it” thing.

The law is, in effect, telling you that at some point you have to be an arsehole to not lose your trademark. (You can be various degrees of arsehole about it when you have to, and whenever you do, you should assume that people are acting in good faith and just have not spent the last 40,000 years of their life talking to trademark lawyers like you have.) Basically, you get to spend time telling people that they have to rename their product from “MySQL Headbut” to “Headbut for MySQL” and that this is, in fact, a really important difference.

You also, at some point, get to spend a lot of time talking about when the modifications made by a Linux distribution to package your software constitute sufficient changes that it shouldn’t be using your trademark (basically so that you’re never stuck if some arse comes along, forks it, makes it awful and keeps using your name, to the detriment of your project and business).

If you’re wondering why Firefox isn’t called Firefox in Debian, you can read the Mozilla trademark policy and probably some giant thread on debian-legal I won’t point to.

Of course, there’s the MySQL trademark policy, and when I was at Percona, I spent some non-trivial amount of time attempting to ensure we had a trademark policy that would work from a legal angle, a corporate angle, and a get-our-software-into-linux-distros-happily angle.

So, back in 2010, Monty started talking about a draft MariaDB trademark policy (see also, Ubuntu trademark policy, WordPress trademark policy). If you are aiming to create a development community around an open source project, this is something you need to get right. There is a big difference between contributing to a corporate open source product and an open source project – both for individuals and corporations. If you are going to spend some of your spare time contributing to something, the motivation goes down when somebody else is going to directly profit off it (corporate project) versus a community of contributors and companies who will all profit off it (open source project). The most successful hybrid of these two is likely Ubuntu, and I am struggling to think of another (maybe Fedora?).

Linux is an open source project, Red Hat Enterprise Linux is an open source product and, in case it wasn’t obvious when OpenSolaris was no longer Open, OpenSolaris was an open source product (and some open source projects have sprung up around the code base, which is great to see!). When a corporation controls the destiny of the name and the entire source code and project infrastructure – it’s a product of that corporation, it’s not a community around a project.

From the start, it seemed that one of the purposes of MariaDB was to create a developer community around a database server that was compatible with MySQL, and eventually, to replace it. MySQL AB was not very good at having an external developer community, it was very much an open source product and not an open source project (one of the downsides to hiring just about anyone who ever submitted a patch). Things struggled further at Sun and (I think) have actually gotten better for MySQL at Oracle – not perfect, I could pick holes in it all day if I wanted, but certainly better.

When we were doing Drizzle, we were really careful about making sure there was a development community. Ultimately, with Drizzle we made a different fatal error, and one that we knew had happened to another open source project and nearly killed it: all the key developers went to work for a single company. Looking back, this is easily my biggest professional regret and one day I’ll talk about it more.

Brian Aker observed (way back in 2010) that MariaDB was, essentially, just Monty Program. In 2013, I did my own analysis on the source tree of MariaDB 5.5.31 and MariaDB 10.0.3-ish to see if indeed there was a development community (tl;dr; there wasn’t, and I had the numbers to prove it). If you look back at the idea of the Open Database Alliance and the MariaDB Foundation, actually, I’m just going to quote Henrik here from his blog post about leaving MariaDB/Monty Program:

When I joined the company over a year ago I was immediately involved in drafting a project plan for the Open Database Alliance and its relation to MariaDB. We wanted to imitate the model of the Linux Foundation and Linux project, where the MariaDB project would be hosted by a non-profit organization where multiple vendors would collaborate and contribute. We wanted MariaDB to be a true community project, like most successful open source projects are – such as all other parts of the LAMP stack.

….

The reality today, confirmed to me during last week, is that:

Those in charge at Monty Program have decided to keep ownership of the MariaDB trademark, logo and mariadb.org domain, since this will make the company more valuable to investors and eventually to potential buyers.

Now, with Monty Program being sold to/merged into (I’m really not sure) SkySQL, it was SkySQL who had those things. So instead of having Monty Program being (at least in theory) one of the companies working on MariaDB and following the Hacker Business Model, you now have a single corporation with all the developers, all of the trademarks, that is, essentially a startup with VC looking to be valuable to potential buyers (whatever their motives).

Again, I’m going to just quote Henrik on the us-vs-them on community here:

Some may already have observed that the 5.2 release was not announced at all on mariadb.org, rather on the Monty Program blog. It is even intact with the “us vs them” attitude also MySQL AB had of its community, where the company is one entity and “outside community contributors” is another. This is repeated in other communication, such as the recent Recently in MariaDB newsletter.

This was, again, back in 2010.

More recently, Jeremy Cole, someone who has pumped a fair bit of personal and professional effort into MySQL and MariaDB over the past (many) years, asked what seemed to be a really simple question on the maria-discuss mailing list. Basically, “What’s going on with the MariaDB trademark? Isn’t this something that should be under the MariaDB foundation?”

The subsequent email thread was as confusing as ever and should be held up as a perfect example of what not to do. Some of us had by now, for years, smelt something fishy going on around the talk of a community project versus the reality. At the time (October 2013), Rasmus Johansson (VP of Engineering at SkySQL and Board Member of MariaDB foundation) said this:

The MariaDB Foundation and SkySQL are currently working on the trademark issue to come up with a solution on what rights to the trademark each entity should have. Expect to hear more about this in a fairly near future.

MariaDB has from its beginning been a very community friendly project and much of the success of MariaDB relies in that fact. SkySQL of course respects that.

(and at the same time, there were pages that were “Copyright MariaDB” which, as it was pointed out, was not an actual entity… so somebody just wasn’t paying attention). Also, just to make things even less clear about where SkySQL the corporation, Monty Program the corporation and the MariaDB Foundation all fit together, Mark Callaghan noticed this text up on mariadb.com:

The MariaDB Foundation also holds the trademark of the MariaDB server and owns mariadb.org. This ensures that the official MariaDB development tree<https://code.launchpad.net/maria> will always be open for the MariaDB developer community.

So…. there’s no actual clarity here. I can imagine attempting to get involved with MariaDB inside a corporation and spending literally weeks talking to a legal department – which thrills significantly less than standing in lines at security in an airport does.

So, if you started off as yay! MariaDB is going to be a developer community around an open source project that’s all about participation, you may have even gotten code into MariaDB at various times… and then started to notice a bit of a shift… there may have been some intent to make that happen, to correct what some saw as some of the failings of MySQL, but the reality has shown something different.

Most recently, SkySQL has renamed themselves to MariaDB. Good luck to anyone who isn’t directly involved with the legal processes around all this differentiating between MariaDB the project, MariaDB Foundation and MariaDB the company and who owns what. Urgh. This is, in no way, like the Linux Foundation and Linux.

Personally, I prefer to spend my personal time contributing to open source projects rather than products. I have spent the vast majority of my professional life closer to the corporate side of open source, some of which you could better describe as closer to the open source product end of the spectrum. I think it is completely and totally valid to produce an open source product. Making successful companies, products and a butt-ton of money from open source software is an absolutely awesome thing to do and I, personally, have benefited greatly from it.

MariaDB is a corporate open source product. It is no different to Oracle MySQL in that way. Oracle has been up front and honest about it the entire time MySQL has been part of Oracle, everybody knew where they stood (even if you sometimes didn’t like it). The whole MariaDB/Monty Program/SkySQL/MariaDB Foundation/Open Database Alliance/MariaDB Corporation thing has left me with a really bitter taste in my mouth – where the opportunity to create a foundation around a true community project with successful business based on it has been completely squandered and mismanaged.

I’d much rather deal with those who are honest and true about their intentions than those who aren’t.

My guess is that this factored heavily into Henrik’s decision to leave in 2010 and (more recently) Simon Phipps’s decision to leave in August of this year. These are two people whom I highly respect, never have enough time to hang out with, and would completely trust to do the right thing and be honest when running anything in relation to free and open source software.

Maybe WebScaleSQL will succeed here – it’s a community with a purpose and several corporate contributors. A branch rather than a fork may be the best way to do this (Percona is rather successful with their branch too).

Awesome MySQL 5.7 improvements

Recently, I’ve had reason to poke at MySQL performance on some pretty cool hardware. Comparing MySQL 5.6 to MySQL 5.7 is a pretty interesting thing to do when you have many CPU cores.

The improvements to creating read views in InnoDB are absolutely huge for small statements with large concurrency – MySQL 5.7 completely removes this as a bottleneck – as much as doubling maximum SQL queries per second, which is a pretty impressive improvement.

I haven’t poked at the similar improvements in Percona Server on this hardware setup – so I can only really guess as to the performance characteristics of it… If comparing to older MySQL versions, Percona Server 5.5 is likely to outperform MySQL 5.5 thanks to this optimization.

But I have to say… MySQL 5.7 is impressive in its concurrency improvements.

and now for something completely different…

As many of you know, I’ve been working in the MySQL world for quite a while now. In fact, it was nearly 10 years ago when I first started hacking on MySQL Cluster at MySQL AB.

Most recently, I was at Percona which was a wonderful journey where over my nearly three years there the company at least doubled in size, launched several new software products and greatly improved the quality and frequency of releases.

However the time has come for something completely different. The MySQL world is rather mature, the future of Percona software is bright and, well, I could do with poking into something rather different.

So a couple of weeks ago I started at IBM in the Linux Technology Centre working on KVM on POWER and related things. No doubt there’ll be interesting things to blog about as time goes on, but it’s about time I posted my change of employment :)

Converting MySQL trees to git

I have put up a set of scripts on github: https://github.com/stewartsmith/bzr-to-git-conversion-scripts. Why do I need these? Well… if only bzr fast-export|git fast-import worked flawlessly for large, complex and old trees. It doesn’t.

Basically, when you clone this repo you can run “./sync-BLAH.sh” and it’ll pull BZR trees for the project, convert to git and clean things up a bit. You will likely have to edit the sync-BLAH.sh scripts as I have them pointed at branches on my own machine (to speed up the process, not having to do fresh BZR branches of MySQL trees over the network is a feature – it’s never been fast.). You’ll also want to edit the git remotes to point where you want git trees to end up.

I’ve done it for:

What problems did I hit? Well… the first is performance, things are slow unless you tweak a bunch of knobs, and then it’s just rather slow rather than slow. So in the empty git repo I set core.compression=1, which makes zlib a whole lot faster.

I naturally give the correct incantation to bzr fast-export to munge tag names appropriately, set a git branch name (each BZR branch ends up as a git branch) and use a marks file (this speeds up incremental syncs).

For one of these branches I was importing, BZR had allowed the invalid committer of “billy-earney billy.earney@gmail.com\n <>” – yes, a newline in the committer. This messes up the fast-import format so I have to run the entire fast-export output through sed to clean it up.
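A minimal sketch of that clean-up idea, written in Python rather than sed. It assumes the broken committer appears in the fast-export stream as a "committer" line with no email part, followed by a line carrying the "<> timestamp" half; that shape is an assumption for illustration, and join_split_committers is a hypothetical helper, not something from the scripts in the repo. Like any line-based filter, it would also touch a matching line inside a data block.

```python
def join_split_committers(lines):
    """Rejoin 'committer' lines that were split by a newline inside the name.

    `lines` is an iterable of text lines from a fast-export stream, each
    ending in a newline. Returns a new list with split lines glued together.
    """
    fixed = []
    pending = None  # a committer line still waiting for its "<email> date" half
    for line in lines:
        if pending is not None:
            # Glue the continuation onto the held line with a single space.
            fixed.append(pending.rstrip("\n") + " " + line.lstrip())
            pending = None
        elif line.startswith("committer ") and "<" not in line:
            # A valid committer line always contains "<email>"; hold this one.
            pending = line
        else:
            fixed.append(line)
    if pending is not None:
        # Stream ended right after a broken line; keep it rather than lose it.
        fixed.append(pending)
    return fixed
```

The joined line still has an empty email ("<>"), so it remains a candidate for the user map step that follows.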

We then use bzr fast-import-filter to apply a user map – which is me looking at the appropriate committers and cleaning them up so that we get better attribution in the resulting git trees as well as cleaning up some errors in the bzr tree so that Git likes them (most notably, missing < or (not and) > around email addresses). The user map is fairly Percona specific, but there’s at least one or two for Oracle committers too.
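To illustrate the kind of normalisation the user map has to do for the missing "<" or ">" case, here is a small Python sketch. It is not the user map itself (which is a hand-maintained file); normalise_ident and its regular expression are hypothetical, and the sample names are made up.

```python
import re

# Optional "<", something that looks like an email, optional ">".
_IDENT = re.compile(r"^(?P<name>.*?)\s*<?(?P<email>[^\s<>]+@[^\s<>]+)>?\s*$")

def normalise_ident(ident):
    """Return 'Name <email>' with both angle brackets present.

    Identities that contain nothing resembling an email are returned as-is,
    since those need a human to look at them.
    """
    match = _IDENT.match(ident)
    if match is None:
        return ident
    name = match.group("name").strip()
    return "{} <{}>".format(name, match.group("email")).strip()
```

For example, normalise_ident("Jane Doe jane@example.com>") and normalise_ident("Jane Doe <jane@example.com") both come back as "Jane Doe <jane@example.com>".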

Next, I pass the output through pv(1) – to do two things: monitor the output to see that it’s still going, and to have a transfer buffer so that git fast-import doesn’t stall waiting for output – amazingly enough, this gave a decent speed boost to import speed.

Finally, when we’re done doing the import of all of the revisions for all of the bzr branches, if this is our first run, we set the HEAD ref to the last BZR branch name and then do a git repack. Through experimentation, I’ve found that “git repack -AdfF --depth=100 --window=500” is what gives me the smallest size possible.

My lca2014 talk video: Past, Present and Future of MySQL and variants

Last Wednesday morning I gave my talk at linux.conf.au 2014. You can now view and download the recording of it here:

http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/28-Past_Present_and_future_of_MySQL_and_variants_-_Stewart_Smith.mp4

(hopefully more free formats will come soon, the all volunteer AV team has been absolutely amazing getting things up this quickly).

Hong Kong (OpenStack Summit)

I’ll be in Hong Kong for the upcoming OpenStack Summit Nov 5-8. I’d be thrilled to talk database things with others present, especially around Trove DBaaS (DataBase as a Service) and high availability MySQL for OpenStack deployments.

I was last in Hong Kong in 2010 when I worked for Rackspace. The closest office to me was in Hong Kong so that’s where I did my HR onboarding training. I remember telling friends on the Sunday night before leaving for Hong Kong that I may be able to make dinner later in the week purely depending on if somebody got back to me on if I was going to Hong Kong that week. I was, and I went. I took some photos while there.

Walking from the hotel where we were staying to the Rackspace office could be done pretty much entirely through buildings without going outside. There were bits of art around too, which is just kind of awesome – I’m always in favour of random art.
Statues in walkways

The photo below was the view from my hotel room. The OpenStack summit is just by the airport rather than in the middle of town, so the views will be decidedly different to this, but still probably quite spectacular if you’re around the right place (I plan to take camera gear, so shout if you want to journey too)
Hotel Window (Hong Kong)

There are some pretty awesome markets around Hong Kong offering just about everything you’d want, including a lot just out on the street.
Java Road
Hong Kong Street Market

Nighttime was pretty awesome, having people from around the world journey out into the night was great.
Rackers walking Hong Kong at Night

I was there during the World Cup, and the streets were wonderfully decorated. I’m particularly proud of this photo as it was handheld, at night, after beer.
Hong Kong streetlife

Awesome coffee beans from Cartel Coffee Roasters

The other week Leah and I went to the Royal Melbourne Show (she won free tickets which makes it a lot easier to swallow than the $35/head otherwise) and I picked up some coffee beans while there (why not!). These beans are called “The Guji” and are from Cartel Coffee Roasters down in Geelong. I opened them the other day and as an increasing number of my Percona colleagues can attest to, I’ve been raving about them. These are some seriously good beans.

amazing coffee beans

The road to Percona Server 5.6

Over a year ago now, I announced the first Percona Server 5.6 alpha on the Percona MySQL Performance Blog (Announcing Percona Server 5.6 Alpha). That was way back on August 14th, 2012 and it was based on MySQL 5.6.5 released in April.

I’m really happy now to point to the release of the first GA release of Percona Server 5.6 along with some really interesting benchmarks. We’ve certainly come a long way from that first alpha and I’m really happy that we’ve also managed to continue to release Percona Server 5.5 and Percona Server 5.1 releases on time and of high quality.

Over the same time frame that we’ve been working on Percona Server 5.6 we’ve increased the size of the company, improved development practices and grown enough that we’ve reorganised how development of software is managed to make it scale better. One thing I’m really, really pleased about is a culture of quality we’ve managed to nurture.

Keeping a culture of quality alive is something that requires constant nurturing. All too often I’ve seen pressure to ship sooner rather than stabler (yes, I just invented that word), and yes, we initially planned the GA of PS 5.6 earlier than we ended up shipping it, but we instead took the time to round out features and stability to ship something much better.

Now comes the effort of continuing good releases, promoting it and writing a Webinar to give next week.

Pictures of Auckland (where OSDC 2013 is!)

It’s getting close to time to head to Auckland for OSDC, and a few days ago I blogged about how I’m speaking there. I’ll be speaking on MySQL In the Cloud, As A Service and all of the challenges that can entail as well as on The Agony and Ecstasy of Continuous Integration. Both of these talks draw heavily on the experience of Percona (my employer): helping customers with all sorts of MySQL deployments and producing our own high quality software.

I was in Auckland earlier this year, so thought I’d share some pictures of the wonderful city in which OSDC is being held.

Firstly, New Zealand has some pretty awesome wildlife. This is possibly not the best example of it ever as there are way more odd looking birds than this one:

Auckland

The waterfront is quite nice, and when we were there earlier in the year it was awfully nice weather for it:
Auckland

I’m pretty sure there isn’t going to be a triathlon in Auckland for OSDC, but I’m still hoping to get out for a run while there (anybody else up for one?). We left home at something like 3:30 in the morning and got some silly early flight (6am or before) and were totally walking around the city a little like zombies, realising that we simultaneously wanted to go for a run and sleep.

Auckland Triathlon

We were meeting friends from Seattle and managed to spot this coffee place down by the water. I didn’t try it myself, but I’ve certainly had good coffee at other places in New Zealand.

Seattle coffee in Auckland, New Zealand

Streets at night:

Auckland@dusk

And if I haven’t already convinced you that Auckland would be a great place to be, here’s a crappy cell-phone snapshot of a variety of New Zealand beers – a tiny, tiny fraction of beer you can get in New Zealand (the microbrewery scene is amazing)

A selection of NZ beer

Go register for OSDC 2013 right now: http://osdc.org.nz/tickets/

Speaking at OSDC 2013 in Auckland!

I’ll be speaking at the upcoming OSDC conference in Auckland, New Zealand! It’s on October 21st-23rd and you should go here right now and register. I’m giving two talks at OSDC this year:

  • MySQL in the cloud, As A Service (Monday 21st, 12:00pm)
    There is no one magic solution to having MySQL As A Service work well, it’s a lot of small moving parts and options that need to be set, monitored and configured. We may wish it was different, or look at other database technologies, but there is a lot of legacy code that talks to MySQL, with all its idiosyncrasies – and we need to be able to support this code. In this talk, we’ll cover many of the problem areas and what you can do to avoid them.
  • The Agony and Ecstasy of Continuous Integration (Wednesday 23rd, 2:30pm)
    This is a tale of the introduction of continuous integration testing into a well established development team. It covers both the highs and lows and discusses strategies to deal with both the positives and negatives and in turn improve your own software engineering practices.

In case you need to quickly justify to your boss why you should go to OSDC, the conference organisers have helpfully provided a page of hints on just that subject.

The end of Bazaar

I’ve used the Bazaar (bzr) version control system since roughly 2005. The focus on usability was fantastic and the team at Canonical managed to get the entire MySQL BitKeeper history into Bazaar – facilitating the switch from BitKeeper to Bazaar.

There were some things that weren’t so great. Early on when we were looking at Bazaar for MySQL it was certainly not the fastest thing when it came to dealing with a repository as large as MySQL. Doing an initial branch over the internet was painful and a much worse experience than BitKeeper was. The work-around that we all ended up using was downloading a tarball of a recent Bazaar repository and then “bzr pull” to get the latest. This was much quicker than letting bzr just do it. Performance for initial branch improved a lot since then, but even today it’s still not great – but at least it isn’t terrible like it once was.

The integration with Launchpad was brilliant. We never really used it for MySQL but for Drizzle the combination was crucial and helped us get releases out the door, track tasks and bugs and do code review. Parts of launchpad saw great development (stability and performance improved immensely) and others did not (has anything at all changed in blueprints in the past 5+ years?). Not running your own bugs db was always a win and I’m really sad to say that I still think Launchpad is the best bug tracker out there.

For both Drizzle and Percona, Bazaar was the right option as it was what MySQL was using, so people in the community already knew the tools. These days however… Git is the tool that there’s large familiarity with – even to the extent that Twitter maintains their MySQL branch in Git rather than in bzr.

Is Bazaar really no longer being developed? Here are graphs (from github actually) on the activity on Bazaar itself over the years:

Screenshot from 2013-10-02 10:32:19
Screenshot from 2013-10-02 10:33:41

You can easily see the drop off in commits and code changes. The last commit to trunk was 2 months ago and although there was the 2.6.0 release in August, in my opinion it wasn’t a very strong one (the first one I’ve had problems with in years).

So… git is the obvious successor and with such a strong community around GitHub, it kinda makes sense. I’m not saying that GitHub has caught up to Launchpad in terms of features or anything – it’s just that with Bazaar clearly no longer really being developed…. it may be the only option.

In fact, in my experiment of putting a mirror of Percona Server on GitHub, we already have a pull request mere days after I blogged about it. Migrating all of Percona development over to Git and Github may take some time, but it’s certainly time that we kicked the tyres on it and worked out how we’d do it without interrupting releases or development.

I’ve also thrown up a Drizzle tree and although it required some munging to get the conversion to happen, I’m kind of optimistic about it and I think that after a round of merging things, I’m tempted to very strongly advocate for us switching (which I don’t think there’ll be any opposition to).

When will Oracle move over their MySQL development? This I cannot say (as I don’t know and don’t make that call for them). There is a lot of renewed interest in code contribution by Oracle and moving to Git and GitHub may well be a very good way to encourage people.

The downside of git? Well… With BZR you could get away with not understanding pretty much every single bit of the internals. With git, I wish I was so lucky.

Are MariaDB tests adding anything extra over Oracle MySQL tests?

I grabbed all the tests introduced in MariaDB 5.5.32 (i.e. “bzr diff -rtag:mariadb-5.5.31..mariadb-5.5.32 mysql-test/” and some foo) and threw them in their own test file. I only kept tests for crashing bugs and ignored those that required plugins (there were two or three, but nothing major). So now I have a test file that should crash MariaDB 5.5.31 and probably before. But, the question is: does this crash Percona Server or MySQL?

While it is excellent to see the MariaDB guys including tests for their crashing bugs, are these MariaDB specific or do they affect other MySQL flavours?

I built a release build of top of trunk Percona Server and ran the test against it. I got no crashes. In a debug build, I got two. One was to do with REPAIR on an ARCHIVE table and the other was “SELECT UNIX_TIMESTAMP(STR_TO_DATE('2020','%Y'));”. I found the same thing for a debug build of top of tree MySQL.

All the other tests for crashing bugs, of which there were 14 – were MariaDB specific. So, out of 16 total, only 2 applied to Percona Server and MySQL.

Detecting if a MySQL server supports partitioning

This morning, this Percona XtraBackup bug came to my attention: https://bugs.launchpad.net/bugs/1170340 – basically, it’s now really quite tricky to determine if a MySQL server you’re connected to supports partitioning or not.

If you’re connected to anything less than MySQL 5.6, you can use have_partitioning variable. But since that’s gone in 5.6, you’re going to get a false negative if you’re connected to 5.6. You could use INFORMATION_SCHEMA.PLUGINS table, but that’s not there in 5.0, so you have some added workarounds to add there too.

A simple version check could be the solution… but what if you compiled the server without partitioning support?
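One way to picture the version-dependent workarounds is a small Python sketch that picks which query to run for a given server version. The function name partitioning_probe is hypothetical, and this is not what XtraBackup does; it relies on have_partitioning existing before 5.6, on INFORMATION_SCHEMA.PLUGINS from 5.1 onwards, and on the assumption that 5.0 and earlier have no partitioning at all. The plugin row being named 'partition' is taken from the MySQL 5.6 manual.

```python
def partitioning_probe(version):
    """Return the SQL to run to detect partitioning, or None if pointless.

    `version` is a tuple such as (5, 6, 10). The caller still has to run the
    query and inspect the result: 'YES' for the variable, 'ACTIVE' for the
    plugin status. An empty result means no partitioning support.
    """
    major_minor = tuple(version[:2])
    if major_minor < (5, 1):
        # Assumption: servers this old never had partitioning.
        return None
    if major_minor < (5, 6):
        # The have_partitioning variable is still present here.
        return "SHOW VARIABLES LIKE 'have_partitioning'"
    # 5.6 and later: the variable is gone, so ask the plugin table instead.
    return ("SELECT PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS "
            "WHERE PLUGIN_NAME = 'partition'")
```

Because both probes ask the running server rather than inferring from the version number, a server compiled without partitioning support simply returns no row (or a value other than YES/ACTIVE), which covers the case a plain version check misses.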

Impact of MySQL slow query log

So, what impact does enabling the slow query log have on MySQL?

I decided to run some numbers. I’m using my laptop, as we all know the currently most-deployed database servers have multiple cores, SSDs and many GB of RAM. For the curious: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

The benchmark is going to be:
mysqlslap -u root test -S var/tmp/mysqld.1.sock -q 'select 1;' --number-of-queries=1000000 --concurrency=64 --create-schema=test

Which is pretty much “run a whole bunch of nothing, excluding all the overhead of storage engines, optimizer… and focus on logging”.

My first run was going to be with the slow query log on. I’ll start the server with mysql-test-run.pl as it’s just easy:
eatmydata ./mysql-test-run.pl --start-and-exit --mysqld=--slow-query-log --mysqld=--long-query-time=0

The results? It took 18 seconds.

How long without the slow query log (starting with mysql-test-run.pl again, but this time without any of the extra mysqld options)? 13 seconds.

How does this compare to a Drizzle baseline? On a freshly built Drizzle trunk, using the same mysqlslap binary I used above, connecting via UNIX socket: 8 seconds.
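To put those wall-clock times in per-query terms, here is a small Python sketch of the arithmetic. It uses only the numbers above (1,000,000 queries; 18, 13 and 8 seconds); the helper names are made up for illustration, and single runs on a laptop are of course not a rigorous benchmark.

```python
QUERIES = 1_000_000  # --number-of-queries from the mysqlslap command line

# Wall-clock seconds for each run, as reported above.
runs = {
    "MySQL, slow query log on": 18,
    "MySQL, slow query log off": 13,
    "Drizzle trunk": 8,
}

def queries_per_second(seconds, queries=QUERIES):
    """Throughput for a run that took `seconds` to finish `queries` queries."""
    return queries / seconds

def per_query_overhead_us(with_log, without_log, queries=QUERIES):
    """Extra microseconds each query costs when the slow query log is on."""
    return (with_log - without_log) / queries * 1_000_000

for label, seconds in runs.items():
    print(f"{label}: {queries_per_second(seconds):,.0f} queries/s")

print(f"slow query log overhead: {per_query_overhead_us(18, 13):.1f} us/query")
```

That works out to roughly 55,556 versus 76,923 queries per second for MySQL with and without the log, about 5 microseconds of logging per query, and 125,000 queries per second for Drizzle.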

New Jenkins Bazaar plugin release

I’ve just uploaded version 1.20 of the Bazaar plugin for Jenkins. This release is based on feedback from users and our experiences at Percona.

  • Do a lightweight checkout instead of a heavyweight checkout (if “Checkout” is enabled)
  • Fix bug: lightweight checkout “update” would always fail as bzr update didn’t accept a repository argument. Switch to using bzr update followed by bzr switch. This should massively improve performance for those not doing a full branch.
  • Remove “Clean Branch” advanced option (replaced with “Clean Tree” option)
  • Add a “Clean Tree” advanced option. This will run “bzr clean-tree --quiet --ignored --unknown --detritus”, preserving the .bzr directory but doing the equivalent of wiping the workspace (starting with a fresh slate). This should massively improve performance for projects that do not have a clean build.
  • Clarify that Loggerhead is the repository browser used by Launchpad, and have a complete example of how to configure it.

Jenkins Bazaar plugin 1.19

I recently released a new version of the Bazaar plugin for Jenkins. This release was inspired by a problem we noticed at Percona. It is:

  • run “bzr revert” after a pull, as if you have a directory that is removed and re-added while having unknown files in said directory (e.g. build artifacts), you would end up in a very bad place (this is a BZR bug, so we work-around it with a “bzr revert”).

The update has already appeared in the Jenkins update centre, so you should already be able to upgrade to it.

New Jenkins Bazaar plugin release! 1.18

From the desk of your new Bazaar plugin for Jenkins maintainer, I give you Version 1.18.

This release has two good bug fixes:

  • UI fix for checkout option (JENKINS-12261)
  • Auto-recover from corrupt BZR branches (e.g. bzr branch/checkout killed at inopportune moment) by cleaning the workspace and trying again (this is now default behaviour, best used with the Jenkins SCM retry count feature being > 1)
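
The auto-recover behaviour boils down to "if the checkout fails, wipe the workspace and try again". A minimal Python sketch of that pattern follows; it is not the plugin's code (the plugin is Java), and checkout_with_recovery plus its two callback parameters are hypothetical names chosen for illustration.

```python
def checkout_with_recovery(checkout, clean_workspace, retries=1):
    """Run `checkout`; on failure, clean the workspace and try again.

    `checkout` and `clean_workspace` are callables supplied by the caller.
    After `retries` extra attempts the last error is re-raised, so a
    persistent failure is still reported rather than hidden.
    """
    for attempt in range(retries + 1):
        try:
            return checkout()
        except Exception:
            if attempt == retries:
                raise
            # The branch may be half-written (e.g. bzr was killed mid-way),
            # so start again from an empty workspace.
            clean_workspace()
```

The retries parameter plays the same role as the Jenkins SCM retry count mentioned above: with zero retries the first failure is final.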

We’ve been running the same code as this release at Percona for about 2 months now (the second bugfix was one I wanted to test first before submitting upstream). This is the big fix that fixed all our problems with using bazaar with Jenkins in a large deployment.

The other news? I’m now maintainer, and this is my first release.

The page on the Jenkins wiki is here:

and updates should come through the standard Jenkins channels as all the auto-foo happens.

Hacking the Jenkins BZR plugin

For Drizzle and for all of the projects we work on at Percona we use the Bazaar revision control system (largely because it’s what we were using at MySQL and it’s what MySQL still uses). We also use Jenkins.

We have a lot of jobs in our Jenkins. A lot. We build upstream MySQL 5.1, 5.5 and 5.6, Percona Server 5.1, Percona Server 5.5, XtraBackup 1.6, 2.0 and 2.1. For each of these we also have the normal trunk builds as well as parameterised ones that allow a developer to test out a tree before they ask for it to be merged. We also have each of these products across seven operating systems and for each of those both x86 32bit and 64bit. If we weren’t already in the hundreds of jobs, we certainly are once you multiply out between release and debug and XtraBackup being across so many MySQL and Percona Server versions.
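
To get a feel for how quickly that multiplies out, here is a back-of-the-envelope sketch in Python. It assumes every product is built in every combination, which is certainly not exactly true of the real setup, so treat the result as a rough ceiling on the dimensions listed above rather than an actual job count. The operating systems are not named here, so they are just numbered.

```python
from itertools import product

# Products and versions named in the paragraph above.
products = [
    "MySQL 5.1", "MySQL 5.5", "MySQL 5.6",
    "Percona Server 5.1", "Percona Server 5.5",
    "XtraBackup 1.6", "XtraBackup 2.0", "XtraBackup 2.1",
]
job_kinds = ["trunk", "parameterised"]
operating_systems = range(7)          # seven operating systems, unnamed here
architectures = ["32bit", "64bit"]
build_types = ["release", "debug"]

# Assumption: the full cross product exists. The real matrix is surely sparser
# in places and denser in others (XtraBackup against multiple server versions).
configurations = list(
    product(products, job_kinds, operating_systems, architectures, build_types)
)
print(len(configurations))  # 8 * 2 * 7 * 2 * 2 = 448
```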

I honestly would not be surprised if we had the most jobs of any user of the Bazaar plugin to Jenkins, and we’re probably amongst the top few of all Jenkins installations.

So, in August last year we discovered a file descriptor leak in the Bazaar plugin. Basically, garbage collection doesn’t get kicked off when you run out of file descriptors. This prevented us from even starting Jenkins back up until I found and fixed the bug. Good times.

We later hit a bug that was triggered in the parallel loading of jobs during startup. We could get stuck in an infinite loop during Jenkins starting that would just eat CPU and get nowhere. Luckily Jenkins provides a workaround: specify “-Djenkins.model.Jenkins.parallelLoad=false” as an argument and it just does it single threaded. For us, this solves that problem.
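
As a small illustration of where that argument goes, here is a Python sketch that builds the launch command. It assumes Jenkins is started directly from a `jenkins.war` file (your install may use a service wrapper or a servlet container instead), and `jenkins_command` is a made-up helper name. The one detail that matters is that JVM system properties have to come before `-jar`, otherwise they are passed to Jenkins as program arguments rather than to the JVM.

```python
def jenkins_command(war_path, parallel_load=False, java="java"):
    """Build the argv for starting Jenkins from a war file.

    When parallel_load is False, the workaround property from the
    paragraph above is added so jobs are loaded single threaded.
    """
    argv = [java]
    if not parallel_load:
        # JVM options must precede -jar to be treated as system properties.
        argv.append("-Djenkins.model.Jenkins.parallelLoad=false")
    argv += ["-jar", war_path]
    return argv


print(" ".join(jenkins_command("jenkins.war")))
```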

We were also hitting another problem. If you kill bzr at just the wrong time, you can leave the repository in not an entirely happy state. An initial branch can be killed at a time where it’ll think it’s a repository rather than a checkout and there’s a bunch of other weirdness (including file system corruption if you happen to use bad VM software).

The way we were solving this was to sometimes go and “clean workspace” on the jobs that needed it (annoying with matrix builds). We’d switched to just doing “clean tree” for a bunch of builds. The problem with doing a clean tree was that “bzr branch” to check out the source code could take a very long time – especially for Percona Server which is a branch of MySQL and hence has hundreds of megabytes of history.

We couldn’t use bzr shared repositories as we kept hitting concurrency bugs when more than one Jenkins job was trying to do a bzr operation at the same time (common when matrix builds kick off builds for release and debug, for example).

So.. I fixed that in the Jenkins Bazaar plugin too (which should be in an upcoming release) and we’ve been running it on our Jenkins instance for the past ~2 months.

Basically, if we fail to check out the Bazaar tree, we wipe it clean and try again (Jenkins has a “retry count” for source checkouts). This is a really awesome form of self healing. Even if the Bazaar team fixed all the bugs, we’d still have to go and get that new version of bzr on all our build machines, including ancient systems such as CentOS 5. Not as much fun as bashing your head into a vice.
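
The idea is simple enough to sketch in a few lines of Python. This is not the plugin's code (that is Java, and the retrying there is driven by Jenkins' own retry count); `checkout_with_self_heal` and its `checkout` callable are hypothetical names used only to show the wipe-and-retry shape.

```python
import shutil
from pathlib import Path


def checkout_with_self_heal(checkout, workspace, retries=2):
    """Call checkout(workspace); on any failure, wipe the workspace and retry.

    `checkout` is whatever does the real work (e.g. something that shells
    out to bzr). `retries` is the number of extra attempts after the first.
    """
    workspace = Path(workspace)
    last_error = None
    for _attempt in range(retries + 1):
        try:
            return checkout(workspace)
        except Exception as error:
            last_error = error
            # A half-finished branch is not worth diagnosing: start fresh.
            shutil.rmtree(workspace, ignore_errors=True)
    raise last_error
```

The trade-off is that a genuinely broken checkout (bad URL, server down) costs a few wasted attempts before it fails, which is a small price compared with someone having to go and clean workspaces by hand.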

After all of that, I seem to now be the maintainer of the Bazaar plugin for Jenkins as Monty pointed out I was using it a lot more than him and kept finding and fixing bugs.

Soooo… say hello to the new Jenkins Bazaar plugin maintainer, me.

Yes, I maintain Java code now. Be afraid. Be very afraid.

Sessions at the Percona Live MySQL Conference that interest me

For the past many years, there’s been a conference in April, at the Santa Clara Convention Centre, where the topic has been MySQL and the surrounding ecosystem. The first year I went, I gave a talk on the new features in MySQL Cluster 5.1 to an overflowing room of attendees. For me, it’s an event that’s mixed with speaking about something I’ve been working on and talking to other attendees about everything from how a particular part of the server works to where we can escape to for nearby good vegan food.

So, I thought I’d share some of the sessions that I’m really looking forward to. My selection is probably atypical, but may be interesting to others. I’m not going to list the keynotes, although they are often of a lot of value. I’m also going to attempt to avoid listing a few really awesome well known speakers simply because there are other really interesting sessions that also need exposure!

  • Starring Sakila: Building Data Warehouses and BI solutions using MySQL and Pentaho
    I need to base decisions off data, not simply a gut feeling (I’m not Stephen Colbert after all). I ran into a bunch of stumbling blocks when trying to work with Pentaho a couple of weeks ago, and I’m really hoping that this session shines some light on how to use it to better and more easily make arguments based on evidence to others in the company.
  • Testing MySQL Databases: The State Of The Art
    I’ve worked with Patrick for several years now, and he’s currently a valuable member of my team at Percona. For those who are interested in the state of the art of open source database testing, this is the session to be in.
  • Getting InnoDB Compression Ready for Facebook Scale
    This session is on at the same time as I’m speaking, so I probably won’t be able to attend (people keep coming to my sessions so I usually can’t sneak out). I’m really interested in how they’ve modified the compression code to help with their (large) workload.
  • Backing Up Facebook
    I hear that Facebook has a couple of database servers, a few dozen users and a few floppy disks full of data. This should be a fun story :)
  • Introducing XtraBackup Manager
    Being responsible for XtraBackup development at Percona, the XtraBackup topics really interest me. Lachlan has been working on a simple backup manager for XtraBackup to help create something that is a more complete backup solution than a tool which simply creates a backup.
  • Extending Xtrabackup – A Point-In-Time System
    Another good case of using XtraBackup as part of a comprehensive backup strategy. I have to be honest, I’m looking for ways in which we can improve XtraBackup to better fit the needs of people. It may be that there are a few small things we can do to make it easier for people to deploy and use.
  • Getting Started with Drizzle 7.1
    We’re about to do the 7.1 release of Drizzle! If you’re interested in having a SQL database that is designed to be used in large scale web applications and cloud environments, come along to this talk.
  • MySQL Idiosyncrasies That Bite
    I have to admit, I’m interested in Ronald’s talk here to basically ensure we didn’t miss fixing anything in Drizzle. I do promise not to at any point yell out “Fixed in Drizzle” though.

Go here to register: http://www.percona.com/live/mysql-conference-2012/ (early bird pricing and discounted hotel rooms end March 12th, so you want to register sooner rather than later).