Archive for the 'mysql' Category

SetFileValidData Function (Windows) - Now with added FAIL

Monday, September 8th, 2008

SetFileValidData Function (Windows)

There seem to be two options on Win32 for preallocating disk space to files.

Basically, I want an equivalent to posix_fallocate or the ever wonderful xfsctl XFS_IOC_RESVSP64 call.

The idea being to (quickly) create a large file on disk that is stored efficiently (i.e. isn’t fragmented).

From SQL, you’d do something like “CREATE LOGFILE GROUP lg1 ADD UNDOFILE 'uf1' INITIAL_SIZE 1G;” and expect a 1GB file on disk. One way of getting this is calling write() (or WriteFile() on Win32) repeatedly until you’ve written a 1GB file full of zeros. This means you’re generating approximately 1GB of IO.

Except it’s worse than that: every time you extend the file, you’re going to be changing the metadata (file and free space information). If you’re lucky, you won’t be using a file system that writes a new transaction to the journal for each time you do this.

If your file system allocator doesn’t like you today (even more likely when you’ve got more than one process doing IO), you may end up with rather fragmented files as well - especially if you’re doing synchronous IO. So you want some method of saying “this file will be size X, please allocate disk space to it in the most efficient way for a file size of X” as it’s not possible to infer this from everyday IO calls (I guess the Win32 CopyFile and CopyFileEx calls could though).

It probably doesn’t do it, but having a CopyFile call would be neat for copy on write file systems and saving space… although I wonder how many Win32 apps would cope with ENOSPC on a write to an existing part of a file.

On IRIX we used the magic xfsctl() with the XFS_IOC_RESVSP64 argument. On Linux (with XFS), we use the same. On ext2/ext3 the only way to get the same has been to (with the file system unmounted), parse the file system and implement it yourself. Although (and this just in) the brand new fallocate() call should help with this. The posix_fallocate() call in GNU libc has just been a wrapper around the simple method of writing 0 to a file from start to end (albeit rather efficiently).
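To make the two approaches concrete, here is a minimal Python sketch (not what MySQL Cluster actually does). It assumes a Unix-like system where Python exposes os.posix_fallocate(); the preallocate() helper and its chunk size are made up for illustration. It asks the OS to reserve the space first and only falls back to the "write zeros from start to end" method if that isn't available.

```python
import os

def preallocate(path, size, chunk=1 << 20):
    """Reserve `size` bytes for `path` and return the method that was used.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):
            try:
                # Ask the OS/file system to allocate the whole range at once.
                os.posix_fallocate(fd, 0, size)
                return "posix_fallocate"
            except OSError:
                pass  # not supported here: fall through to zero-filling
        # Fallback: generate `size` bytes of IO by writing zeros.
        zeros = bytes(chunk)
        os.lseek(fd, 0, os.SEEK_SET)
        remaining = size
        while remaining > 0:
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        return "zero-fill"
    finally:
        os.close(fd)
```

Either way you end up with a file of the requested size; the difference is how much IO and how many metadata updates it took to get there.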

XFS implements something called “unwritten extents”. An unwritten extent says “this range of blocks is allocated to this file. If reading from this range, return a zero page. If writing, split the unwritten extent into 3 parts: before, the newly written extent (which isn’t unwritten: i.e. now valid data), and the after extent.” Simple, rather efficient and gets really good allocation as XFS gets to search the free space btrees based on size.
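The splitting described above can be sketched as a toy model in Python. This is not XFS code: the Extent class and write_range() function are invented for illustration, and the real thing (btrees, merging of neighbouring extents, journaling) is far more involved.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    start: int     # first block of the extent
    length: int    # number of blocks
    written: bool  # False = unwritten: reads return zeros

def write_range(extents, start, length):
    """Mark blocks [start, start + length) as written.

    Any unwritten extent overlapping the range is split into up to three
    parts: before (unwritten), the written part, and after (unwritten).
    """
    end = start + length
    result = []
    for ext in extents:
        ext_end = ext.start + ext.length
        if ext.written or ext_end <= start or ext.start >= end:
            result.append(ext)  # already valid data, or no overlap
            continue
        lo, hi = max(ext.start, start), min(ext_end, end)
        if ext.start < lo:
            result.append(Extent(ext.start, lo - ext.start, False))
        result.append(Extent(lo, hi - lo, True))
        if hi < ext_end:
            result.append(Extent(hi, ext_end - hi, False))
    return result

# A 100 block preallocated file, then a write to blocks 40-49:
extents = write_range([Extent(0, 100, False)], 40, 10)
# -> three extents: blocks 0-39 unwritten, 40-49 written, 50-99 unwritten
```

Note that the allocation itself never changes here: all 100 blocks belong to the file from the start, only the written/unwritten bookkeeping moves.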

So what to do on Win32 (apart from drink heavily to try and make it all go away)?

There’s SetFileValidData, but that needs special permissions and may expose previously deleted data from other users. i.e. massive security hole. FAIL

There’s SetEndOfFile which, quoting the MS docs: “If the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined.” Not exactly reassuring… but introduced in W2k, so rather safe to use today. Doesn’t save you from having to fill the file with zeros as part of initialisation though.

There’s SetFileInformationByHandle, which looks like it may do exactly what I want… if you read between the lines of the documentation. But it’s only supported starting with Vista. Which you all use of course, so that’s not a problem.

Building MySQL on Windows - MySQL Forge Wiki

Monday, September 8th, 2008

This one covers running mysqld in the VisualStudio debugger, which can be useful.

I have no special instructions or wisdom for running ndb_mgmd.exe or ndbd.exe in the debugger (at least when running them from mysql-test-run.pl). I’ve attached the debugger to already running ndb processes (started by mysql-test-run.pl), but haven’t made any changes to mtr to give it the “go and enter this” style of instructions it has for mysqld.

MySQL Conference & Expo 2009 - CFP open

Friday, September 5th, 2008

Is it that time already? MySQL Conference & Expo 2009 has opened the CFP.

Submit (well) early and often. It’s always an exciting (and exhausting) conf. Good technical, relevant content is what makes it good. Getting to talk to people who do amazing things, people who use your software, people looking to use it, people who want to chat about how you can learn from each other.

Any suggestions for what you’d like to hear from me (Cluster, Drizzle et al) are welcome - either via private mail or comments here.

when the problem is likely a bug in the linker…

Thursday, August 28th, 2008

Windows FAIL.

It has been suggested the current thing I’m trying to fix is actually a bug in the Microsoft linker…. and I’m quite willing to believe that.

I wonder if I can expense rehab if this Windows port leads to a drinking problem….

Building MySQL Cluster on Windows (for Windows)

Wednesday, August 27th, 2008

You will need:

  • CMake (at least 2.4.7)
  • Bazaar (the newer the better - 1.6 was just released - at least use that)
  • GNU Bison
  • Visual Studio (Express works, but I’m talking about 2005 here)
  • … and all this installed on a Microsoft Windows machine.
  • … and to hate yourself, you are going to be using Windows after all.

Then, get and build it:

  1. Get the source:
    bzr branch lp:~mysql/mysql-server/mysql-5.1-telco-6.4-win
  2. Run CMake. The CMake GUI can now be used to select compile options! You’ll have to set the path “where is the source code” to where you put the source code in step 1.
  3. Hit “Configure” in CMake
  4. Select the target (i.e. the version of Visual Studio you’re going to use)
  5. Select the build options. HINT: WITH_NDBCLUSTER_STORAGE_ENGINE may be a useful one to enable
  6. Hit Configure again
  7. Hit Ok.
  8. CMake now generates the Visual Studio project. Use this time to drink some good scotch.
  9. Open Mysql.sln (which should launch Visual Studio)
  10. Go Build -> Build Solution (or hit F7)

Now you can go and have much whisky as this will take a few minutes. You should now have a set of built binaries for MySQL Cluster on Windows. Scary.

ndb_mgm.exe builds (and works) in mysql-5.1-telco-6.4-win

Friday, August 22nd, 2008

“MySQL Cluster 6.4 Windows tree” branch in Launchpad

(which really should have the -fail suffix… but anyway)

In what will (soon) be mirrored to launchpad, all but 17 targets (yeah, working on that… but it’s out of 130 or something) build.

Not only that, I’ve used the management client (ndb_mgm.exe) to monitor the cluster running my Bugzilla instance (which is now a rather old 6.3 build).

Getting closer to NDB on Windows.

Be afraid. Be very, very afraid.

“MySQL Cluster 6.4 Windows tree” branch in Launchpad

Thursday, August 21st, 2008

That’s right folks, I’m pushing up patches for MySQL Cluster on Windows. This tree is incomplete, and no promises on when enough will be pushed for it to even compile on Windows.

Tree is updated when launchpad pulls from our internal tree.

Firefox on OpenSolaris fixed (and installed bzr)

Wednesday, August 6th, 2008

Thanks to Glynn for pointing me to the right thread on opensolaris.org (in a comment on my Good adventures with OpenSolaris post). The package verification thingy (pkg verify -v -f SUNWfirefox) did actually throw an error (indicating some sort of problem). So that’s pretty neat. The fact that it got into trouble in the first place isn’t good, but corruption detection is the next best thing.

I still occasionally hit the bug in VirtualBox where if you have 127.0.0.1 in your resolv.conf on your host (e.g. running a local caching nameserver), VirtualBox passes this through to the guest, so the guest tries to use the guest 127.0.0.1 as a nameserver - this usually doesn’t work so well.
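If you want to check whether a host is likely to trip over this, something like the following Python sketch would do. The loopback_nameservers() function is made up for illustration, and it assumes the usual resolv.conf layout of one “nameserver ADDRESS” entry per line.

```python
import ipaddress

def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries that point at a loopback address."""
    found = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            try:
                if ipaddress.ip_address(parts[1]).is_loopback:
                    found.append(parts[1])
            except ValueError:
                pass  # not a plain IP address: ignore it
    return found

sample = "# local caching nameserver\nnameserver 127.0.0.1\nnameserver 192.0.2.53\n"
print(loopback_nameservers(sample))  # ['127.0.0.1']
```

On a real host you would feed it the contents of /etc/resolv.conf; a non-empty result means a guest that inherits the file verbatim will end up asking itself for DNS.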

The good news is, Firefox now works in my OpenSolaris VM.

The bad news is that even though I’ve gone and set my keyboard layout as DVORAK (with the Input Method Switcher applet), what should be ctrl-l (for location bar) in Firefox actually brings up the Print dialog (on DVORAK, L is where P is on QWERTY).

But, I’ve managed to download bazaar now, and the install was simple (just follow INSTALL in the bzr tarball). At some point I’ll badger someone to make an OpenSolaris package for it so you could do “pkg install bzr”, but you can’t do that yet.

The next challenge will be to branch repositories from the host onto a temp drive, build and test.

Good adventures with OpenSolaris

Tuesday, August 5th, 2008

First of all, thanks to everyone who commented on my previous OpenSolaris entry (which wasn’t really positive at all).

I recently tried again - this time starting with an ISO of build 93. I’d recommend completely ignoring the 2008.05 release and going straight for the build 93 image.

Installed easily in VirtualBox, adding the VirtualBox extensions was easy. Select “Devices -> Install Guest Additions” in the VirtualBox menu, then when logged into the OpenSolaris install, do the following:

su

pkgadd -d /media/VBOXADDITIONS_1.6.0_30421/VBoxSolarisAdditions.pkg

(you then say yes, i really do want to install it. rather obvious. I had to do this step again after the “pkg image-update” below though). Just logging out and then back in again gets you all the awesomeness you’d expect from running other guests (such as that system released by a large corporation in Redmond).

The “pkg image-update” went as expected, and I’m now running build 94.

I installed SunStudio Express (compilers) pretty easily - “pkg install sunstudio”. Unfortunately, this is all in /opt/SunStudioExpress and not in $PATH, which would have been much more useful. I guess there’s still a bit to go before usability nirvana. Also, no .desktop entries, so have to explicitly run /opt/SunStudioExpress/bin/sunstudio to get the NetBeans gui. Presumably if i add /opt/SunStudioExpress/bin to PATH, building random software packages will be nicer.

So, I then want bzr so i can pull source repositories. Monty Taylor informs me that the magic packages you want are: SUNWgcc, gcc-dv and SUNWtoo. Then you can build bzr as downloaded from the website. Installed these easily.

However, now trying to get the bzr source:
$ firefox
ld.so.1: firefox-bin: fatal: /usr/lib/firefox/libxul.so: corrupt or truncated file

and then symbol kPStaticModules: referenced symbol not found.

So maybe I shouldn’t have upgraded to build 94…..

It’s certainly in much better shape than the May release, but be warned: it’s still a work-in-progress and some things may sporadically not work (e.g. Firefox, right now).

Hopefully, some time soon I’ll get a MySQL build (well… really I want MySQL Cluster, and later drizzle) going and will really be able to hammer these things with dtrace.

OSCON

Monday, July 21st, 2008

Arrived okay - long travel, but in one piece. Staying at the doubletree.

Adventures with OpenSolaris

Wednesday, July 16th, 2008

So… some colleagues have been experimenting with DTrace a bit, and I’ve been (for a while now) wanting to experiment with it.

The challenge now, instead of in the past, is that I’m setting up a Solaris based system - not getting one premade.

I chose OpenSolaris as I’d previously tried Solaris 10 and just sunk too much time trying to get updates and a development environment installed (another colleague could get the opposite to me going: he got devtools but no updates. at least mine was up to date and secure… but without a compiler).

So… OpenSolaris. It isn’t 100% open, there’s binary only drivers and such… but compared to previous Solaris, a whole lot better. Now, if only it was GPL licensed so we could have cross-pollination with Linux.

I grabbed the 2008.05 ISO as soon as (in fact, slightly before) it was released and installed it in VirtualBox.

The installation was shiny - one of the best OS installs I’ve seen in a while. It set up nice things (zfs, X) and (an improvement on the previous release) even managed to get all the hardware going (not sound though).

However, on first reboot, nasty surprise. DNS isn’t enabled by default.

I found out why DNS isn’t enabled by default - and (as usual) this comes down to hysterical raisins. Back in what we laughingly call the past, during install Solaris would ask you what services you wanted to use for name resolution (which I guess made sense when people used yp/NIS more often than DNS). The default didn’t include DNS.

In the graphical installer, it just chose the default without asking… which is no DNS. So my mother would be able to install OpenSolaris, but once done, she’d have to know to type in 150.101.98.214 instead of www.google.com.au into Firefox. However, I swallowed my pride, edited /etc/nsswitch.conf and went along my business (I wonder what percentage of users would actually go from “hrrm, internet not working” to editing /etc/nsswitch.conf without intense googling).

The UI did look nice though. Nice looking GDM, GNOME desktop looked nice. You could tell that whoever did the theme had spent too much time near MacOS X, but I’ll forgive them for that. The default shell is remotely sane and even though the bash completions aren’t as funky as on Ubuntu, I managed (unlike sitting at cmd.exe, where somebody is likely to die each time my keystrokes end up there).

I even had a look at the graphical package management tool - which looked quite nice. I even tried to do an update via it… which ended in what seemed to be a locked package manager and general amounts of fail. To see if it had just stopped or was chewing up my CPU or memory, I opened a terminal and ran ‘top’.

I then found out that top isn’t installed by default. It’s 57kb on my Ubuntu 8.04 laptop so disk space couldn’t be the reason why it’s not installed. It’s certainly not a “it’s a minimal install” argument, there’s lots of other things there by default.

Next step, let’s get updates (some time had elapsed between first install and now).

Seeing as I hadn’t met with much success using the graphical utility (it was at version 0.0000001 or something, so I don’t lay blame there), I find out that ‘pkg image-update’ is what you want to run. So I do.

It chugs for a while and says there’s 1GB of updates. That’s okay, I (where I=Sun) pay for what here on the arse end of the Internet is considered a decent link to my home office. About 20-30 minutes later, having downloaded about 600MB, it goes “url timedout error” and aborts. Oh well I think, that’s easy - I’ll run it again and it’ll just resume downloading (remember the revolution when that started working, you know, in 1997).

I then discovered that pkg doesn’t resume downloads. It creates a snapshot using ZFS and puts the updates in it. If anything goes wrong, it just deletes the snapshot. This is a huge benefit over (say) dpkg, which if you press the reset button at the right time will leave your system very, very fucked (magic incantations can revive it, but it’s not fun - and the dpkg developers don’t think it’s a problem - come to my “Eat My Data” talk at OSCON to find out the full story). So OpenSolaris pkg wins on the “don’t ruin my working OS install already” front, but fails on resuming downloads.

I try again. Same story.

It’s now wasted a bit over 1GB of downloads… which equates to a couple of dollars.

I wait a few days, a week, and try again. Same story. I even try with a few hints found online that should fix things (well.. they did let another 100MB on average download before dying with the same story).

I then decided to just try and do the minimal - I wanted a development environment so I could build a MySQL Server with NDB and then play with DTrace to help nut out a performance problem or two.

So i tell pkg to install SunStudio Express. I’m even using instructions off sun.com, so it has to work.

It’s only ~500MB now (IIRC). Fails with exactly the same error as before (url timedout). Gah!

So, this brings us to today. I head into the Sun office.

I figure “this just has to work from a Sun office… ” and I was right!

It got through the (now) 1500MB download of updates!

It even applied them!

Success!

Win!

Well, no, - FAIL.

It now refused to boot with the updates. Or rather, it just rebooted soon after having started booting. No panic, no error screen, no “will reboot in 120 seconds” or anything useful. Instead, you just saw a flicker of the error message before it rebooted.

So… with some very careful pause/unpause of the VM (thanks VirtualBox… I also have a feature request now - pause before reboot :) I got this:

Apparently the successful update, not so much.

Hrrm… perhaps select the known good one from the GRUB menu? It did actually boot! But this wasn’t just the old kernel, it was the whole older system. I guess that’s a possible upside of ZFS snapshots…. but oh my, that could be sooooo subtle and lead to data loss that it’s really quite dangerous.

I was still no closer to getting an up to date opensolaris system with enough developer tools to build a MySQL Server and use dtrace.

And this was enough. It’s now gone and I get my 10GB of disk back.

Maybe I’ll try again later… but I’m finding the google-perftools to be rather exciting and they’re really satisfying shiny thing urges at the moment.

WL4271 Encrypted Online Backup: Preview 3

Thursday, July 10th, 2008

“WL4271 Encrypted Online Backup: Preview 3” branch in Launchpad

Now with Windows support. Many thanks to Chuck Bell for helping get the code going on Windows.

We can however, all sit around dumbfounded as to how Windows has so little of a POSIX like layer and yet doesn’t define ENOTSUP.

As a refresher, this tree implements:

  • Encryption for MySQL Online backup
  • Algorithms and keysizes supported:
    • 3DES
    • AES (128, 192 and 256bit)
  • World peace

(world peace not included)

UPDATE: If you’re wondering why the branch isn’t there, it’s still pushing to launchpad. Yes, that’s over 7 hours to push a branch. ick. Can’t be too much longer, surely. I cannot wait until lp uses shared repos.

Security question fail.

Tuesday, July 1st, 2008

Spot the problem:

You work for company X.

  • Phone rings: “Hi, my name is Alice, I work for company X”
  • “Hi Alice, this is Bob, in order to verify that you do actually work for X, what is your employee number and phone extension, I’ll call you back when verified”.
  • “Okay Bob, it’s Alice, employee number 1234 and I’m on 555-5555”
  • You look up the employee database and sure enough, Alice is there with number 1234.

Were you talking to Alice?

Will you be talking to Alice if you dial 555-5555?

MySQL 5.1 Cluster DBA Certification Study Guide available again!

Monday, June 30th, 2008

Vervante Books Etc — MySQL 5.1 Cluster DBA Certification Study Guide

Only $40. Written by absolutely brilliant people (and me). Needing to learn MySQL Cluster so you can go and use it? Test it out? Work out if it’s for you? Get this book!

NDB$INFO scanning from ndb_mgm

Wednesday, June 18th, 2008

In code just tested:

ndb_mgm> ndbinfo MEMUSAGE
RESOURCE_NAME    NODE_ID    PAGE_SIZE_KB    PAGES_USED    PAGES_TOTAL    BLOCK
IndexMemory,1,8192,16,160,DBACC
DataMemory,1,32768,20,640,DBTUP
IndexMemory,2,8192,16,160,DBACC
DataMemory,2,32768,20,640,DBTUP

Win!

This is the first time that we’ve been able to get this kind of info out of the cluster without using the magic “all dump 1000” (or “all report MemUsage”), both of which end up using events, which go to the log file, aren’t exactly reliable etc.

This performs a scan on the NDBINFO tables (in ndbd) from ndb_mgmd and returns the result to the management client. You can then use this in scripts from the command line. e.g. to find out how many pages of datamemory are used on each node:

$ ./storage/ndb/src/mgmclient/ndb_mgm -c localhost:9311 -e 'ndbinfo MEMUSAGE'|tail -n +3|grep 'DataMemory'|cut -d ',' -f4
20
20
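If the pipeline gets any hairier than that, the same thing is easy enough to do in a script. Here is a rough Python sketch that parses the output shown above; the pages_used() function is made up for illustration, and it assumes the data rows stay comma-separated with six fields, as in this (not yet reviewed) version of the code.

```python
import csv

def pages_used(output, resource="DataMemory"):
    """Return {node_id: pages_used} for one resource from `ndbinfo MEMUSAGE` text."""
    usage = {}
    for row in csv.reader(output.splitlines()):
        # Data rows have six comma-separated fields; the header and any
        # banner lines do not, so they are skipped.
        if len(row) == 6 and row[0] == resource:
            usage[int(row[1])] = int(row[3])
    return usage

sample = """\
RESOURCE_NAME    NODE_ID    PAGE_SIZE_KB    PAGES_USED    PAGES_TOTAL    BLOCK
IndexMemory,1,8192,16,160,DBACC
DataMemory,1,32768,20,640,DBTUP
IndexMemory,2,8192,16,160,DBACC
DataMemory,2,32768,20,640,DBTUP
"""
print(pages_used(sample))  # {1: 20, 2: 20}
```

In practice you would capture the ndb_mgm output (e.g. via subprocess) and pass it in instead of the sample string.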

now, just to clean it up a bit, fix the one bug (yes, you guessed it: in metadata caching) and get a review….

but, a milestone!

NDB$INFO

Tuesday, May 27th, 2008

There’s been talk over the years of better monitoring for NDB (MySQL Cluster). This has been dubiously named NDB$INFO, after some special magical naming convention for tables holding information on the insides of NDB. Otherwise known as Worklog 3363 (viewable on MySQL Forge).

The basic idea is to get a bunch of things that are already known inside NDB available through a rather standard interface (SQL is preferred).

My top examples are “How much DataMemory is used?” and “Do I need to increase MaxNoOf(Tables|Attributes|ConcurrentTransactions)?”. You can get some of this information now either through the management client (ndb_mgm -e “all report MemoryUsage”) or the MGM API using events and some other foo.

This is a rather limited interface though. It would be great if you could point all your monitoring stuff to a MySQL Server, throwing queries at it and finding out the state of your cluster.

So this year I’ve been working on implementing NDB$INFO. The big requirements (for me at least) are:

  1. Everything can be queried easily from SQL
  2. It’s easy to add a new NDB$INFO table (for a NDB developer)
  3. You can use NDB$INFO tables to diagnose problems (such as nodes not connecting)

Among the 492 things I’m currently doing is fixing up a basic patchset for NDB$INFO and working on getting it into the tree. It’s all going to be basic scan interfaces in the current version, so things may be slow if there’s lots of rows, but they’ll get there.

What would you like to see exposed?

Encrypted Online Backup (design, thoughts, ask-the-lazyweb)

Tuesday, May 27th, 2008

So after an ever so temporary but loud moment of insanity[1] having a decision made which I very strongly disagreed with (wanting to release online encrypted backup as closed source), we’re back in the world of freedom and the MySQL Server is (and will be) free and open source software (dual licensed, so you can buy a commercial license of the same thing).

[1] Addition (wanting to remove my use of the word): Marten (rightly) points out that although appreciating the new blog posts, he doesn’t appreciate having his decisions called insanity. He’s right. It’s the wrong way to put it. So, without wanting to censor or change history (instead preferring to illustrate my own stupidity and amazing ability to completely say the wrong thing every 6 months or so), I offer this clarification (that I have tried to express in about 3 drafts of blog posts, none of which have seen the light of day as I was never really happy with them): the decision was made with all the right intentions (grow the company, end up producing more free software, making sales to enterprises easier, clearer differentiation etc) but it was one that I (and many others) rather strongly disagreed with. In the end, the decision was made to have these parts as free software and I truly believe that this was made after more arguments were presented by myself (and others) about why having these parts as closed was a bad idea. It is quite the thing to make the decision to make modules for your free software product closed; it is about 15 steps higher to go back on it. I’ll share a phrase I used a few times when being a right nit-picker about things during employment contract negotiation this year (for MySQL Australia and then Sun): “Do I trust Marten? Absolutely. It’s the next guy. Remember, SCO was once Caldera and producing a linux distro and generally considered good.” So, that was more than I intended to write on the subject… but hopefully clarifies that I just thought the decision itself was bad, and am lucky enough to work at a place that encourages discussion when you don’t like things.

So, now I’m involved with writing up the worklog for encryption for the MySQL server native online backup. I also wrote most of the original worklog for compression of online backup (I implemented compressed backup and LCP for MySQL Cluster) as well as some proof-of-concept code (written in <5 minutes at 3am while jetlagged).

There are two main approaches to encryption: symmetric and asymmetric (public key). I think we should support both (but we’ll see what others think).

For symmetric (password based for those not up with the street lingo of crypto) we’re thinking of the following algorithms: 3DES, AES, Blowfish. Are there any others that people care about?

DES is obviously out as it’s not considered secure, and really, we should be helping users to get things right.

For public key: RSA and DSA are the obvious choices.

As for libraries implementing all of these? well….. I’m thinking about libgcrypt - it looks fairly nice and a bit similar to the kernel crypto api (which seems quite nice). Anybody got any other suggestions? Things you’d like to see? thoughts?

EDIT: Server not Service. We sell services, the server is free and open source. I fail.

eHorizons

Tuesday, May 27th, 2008

I flew back into Sydney on Sunday morning to give a tutorial at Sun’s Extended Horizons summit. It was a half day tutorial on MySQL Cluster - so a shortened version of the one I’ve given at the MySQL User Conference for the past few years. I had about 15 attendees, all of whom had done their homework (it probably helped that they were pestered via phone :)

The tutorial went really well. It really helps when everybody has done the homework and already has Linux and MySQL Cluster installed. Everybody got up and running (we used mysql-test-run to start a cluster, not writing the config file from scratch, which made things happen a lot faster). Also got some good feedback - yay! We may even have some people look to deploy it after attending, always a plus.

I also gave a “Scaling MySQL” talk that was well attended. I didn’t talk at all about query optimisation, mysqld configuration tuning or stuff like that - instead focusing on making the app saner, caching etc. memcached, of course, got a good mention :) It seemed to go down well, some good questions, and a rather full room.

So a rather productive two days for spreading the freedom love.

However, the conference dinner was complete FAIL on account of the venue. I don’t know which vegetarians/vegans call beef and fish vegetarian, but I’ve never met one (hint: they don’t exist). This is *after* the explanation on being vegan. Then… there was some discussion about pasta with a tomato/vegetable sauce, never came. So as others were finishing meals, again inquire - eventually, something was brought over. Undercooked rice and undercooked steamed vegetables. I don’t know who eats that for dinner (hint: nobody). Of course, after the pasta discussion, I then selected a wine that would go with it. After more of the stuff-ups, I pointed out that there was no way I was going to pay for the wine when shit like that was served (yes, in those words… perhaps I’ve been watching too much Gordon Ramsay).

It was the first time ever that I’ve left a restaurant during a function, gone down the street, gotten take away and brought it back. Novotel Brighton Beach (in Sydney) - you suck.

(there’s also a beautiful view across the bay of the runways of Sydney airport… which is fine if you can sleep through planes landing and taking off, like I can, but I know others can’t).

Will never stay at the Novotel Brighton Beach voluntarily, ever. On the plus side, the guy at the desk when checking out was very apologetic…

Speaking at Sun Extended Horizons Summit 2008

Saturday, May 24th, 2008

Sun Microsystems - Australia - APAC Extended Horizons Summit 2008

I’m giving a Cluster session tomorrow (Sunday) and talking about Scaling MySQL on Monday. Hope to see you there!

My Name Is…

Wednesday, May 14th, 2008

Stewart.

With a t at the end. Not a d.

Get it wrong once, possible mistake.

When I correct you, and you do it *again* and *again*, I want to slap you in the face with a keyboard.

It’s especially bad when you’re replying to my email, as my name is spelt correctly at least THREE TIMES right in front of you (twice in “To: Stewart Smith <stewart@…>” and again in “On X, Stewart Smith wrote:”).

(and if you’re wondering why I’ve tagged this post with “sun” and “mysql” it’s because some of you people are the worst offenders)

kthxbye