weekly builds

Saturn’s autoweb

I’ve hacked my scripts that generate doxygen docs to also build MySQL 4.1, 5.0 and 5.1 for AMD64 (the box that it’s running on) with Cluster. This is to help my idea of running Gallery at home with NDB disk data tables in very recent MySQL builds.

How’s it going so far? Well… I’ve found some bugs and some seemingly strange behaviour here and there. However, bug reports will come, and I’m currently running a bit of an older build.

I’ll make the URL of the Gallery public at some point too

More training needed with gradings coming…

Made it to the Honbu today for the first time in a while. I really need to train more – have a Judo competition on Monday night (would be Saturday, but I have the LA face2face, so I’m going to the alternate).

The good thing about training on Thursday evenings is that there tends to be a lot of people on the mat – often more than at any other time. A lot of black belts show up too… some I don’t see too often (but that could be due to me generally not training as much this year).

Gradings are at the end of November… so I’d better be prepared. That, and I need to build up some fitness again.

Didn’t stay for the Sword class though, was quite knackered after the 1.5hrs of jiujitsu (well… more Judo style today). The class was good too – Kancho was on the mat and doing things for a while.

I feel I fixed some problems with my 5th and 6th hip throws today (or at least learnt where I tend to go wrong). My 5th leg still isn’t the best… but practice, practice. I think it’s more a consistency issue – sometimes I can just hammer in for the hip throws and execute them beautifully (at least for my grade…. nothing like a 3rd, 4th, 6th or 10th Dan or so throwing you around for a bit to see how far you can still go).

I totally vote we open a Melbourne MySQL office right near there (which is actually further from home for me.. which would be rather annoying, but means I could train every day :)

Recent happenings and releases…

So a bunch of stuff has happened (or is happening) that I’ve been wanting to blog about for a bit. Some stuff had to wait; for the rest, it’s just been me being slack.

Anyway, anyone who hangs closely around the MySQL circles probably now knows about MySQL Enterprise. There’s been a fair bit of talk about this internally for a little while now. When it was being talked about a bit wider within the company some of the initial communication was (in my mind) rather unclear. So I took the “what’s the worst way somebody could interpret this” viewpoint and replied with my thoughts. The idea behind this was to simulate what some of the loud-mouthed trolls of the non-shifted question mark e on a qwerty keyboard mapped to dvorak kind may do.
After a few phone calls (some at strange hours) my worst fears were not realised – we were still not being insane.

So I hope I’ve been of some use in making sure that communication has been clear and any possible fears put to rest.
There is also an increased willingness to make things saner for getting non-MySQL AB authored code into the main trees (err… now labelled Community).

We’re also getting geared up for another 5.1 release – the Cluster team has recently chased down some failures: from out-of-disk on a build machine (why is it us who had to find that out?) to an “actual bug”.

Kudos goes out to Jonas who has recently found a few bugs that have been Can’t Repeat since about the year 2000 – ones that were real hard to hit, but naturally, somebody has.

I also added some new things to the Cluster Management Server (ndb_mgmd) in 5.1 that should help with debugging in the future. I basically just exposed the MgmApiSession stuff a bit, giving each session a unique id (a 64 bit int) so that you can check whether a session has gone away (or list all sessions). This gives us a test case for bug 13987, which is pretty neat.

I also have geared up a change to the handler API to fix bug 19914 – and being a good boy, I’ve mailed out to the public internals list so that people are ready for the building of outside of tree storage engines to break (on 4.1 and up!). The good news is, however, that this is a real fix and that any errors on COUNT(*) will be reported back to the user (a customer was affected by this).
Also, I updated how engines fill out the INFORMATION_SCHEMA.FILES table to make it a bit nicer (Brian wants to add support for it to some of the other engines). He also pointed out a really obvious bug of mine in a recent push to that code (that probably showed up in a compiler warning come to think of it…). Paul is looking at it for PBXT too (or at least thinks it’s cool :).
I also asked around the Cluster team about whether making the team trees (VERSION-ndb) public (up on bkbits.net etc) would be a good idea. Nobody seems to have any objections, so I will (as soon as I get a minute) pursue that. Basically it’ll let people get access to the latest NDB bug fixes in source-tree form (certainly not recommended for production, but could be useful in testing environments).
I’ve also been thinking about talks for the MySQL UC next year as Cluster tends to be a popular topic (had a rather full room this year).

There’s probably more to talk about too, but I’m getting sleepy.

Rusty on LCA talks and other stuff…

As email is *sooo* non-“Web 2.0”, I reply in blog form….
Rusty’s Bleeding Edge Page talks about a “Writing an x86 hypervisor: all the cool kids are doing it!” session that sounds really cool (better not be on at the same time as my talk… :)

I don’t (currently) intend to be one of the cool kids though.

He also mentions a session entitled “First-timer’s Introduction to LCA”. A couple of possible suggestions (or thoughts, and stuff I’ve seen):

  • be careful if you intend to bitch endlessly about a piece of software – it’s quite likely you’re talking to the person who wrote it (or a chunk of it)
  • sometimes it can be really good to just listen and ask a few good questions to understand. there are a lot of really smart people about
  • you will (at some point) ask a really dumb question (that you’ll only realise is dumb a few months later). Don’t panic – we all do it.
  • Don’t be scared – nobody bites too hard.
  • when staying in the halls, odds are the coffee isn’t that good – be prepared to bring your own or go out every morning.
  • do not be afraid to go up and start talking to people – it’s a great way to meet interesting characters and cool hackers.
  • wash
  • use deodorant
  • encourage others to do the above 2
  • read the summary of a session, not just the title. sometimes you can be misled by the title (for example, not everybody thinks of the same thing when “hacking BLAH” is the title of a session)
  • especially if talking, bring backups, backup (without erasing old backups) and backup. Also, be sure restore works.
  • While a lot of people do enjoy downing a few (or more than a few) Ales, it’s not compulsory. There are people attending LCA who don’t drink (and who may/may not join others at the pub even though they don’t drink alcohol). It’s also okay to not drink too much – in fact, it’s often recommended.
  • Don’t be afraid to ask people who they are, what they do etc. Even if you then immediately recognise the name, it’s good to put a face to the name.
  • You will never see everything you want to.
  • do join the IRC channels – great way of meeting people and organising groups to go do things (like get food, go to pub etc).
  • do talk to people around the dorms – great way of meeting people
  • expect to want a day of rest afterwards
  • there are some “in” jokes – but don’t be afraid to ask what they’re about, strange traditions are part of the LCA experience

I wonder what should/could be written about going all fanboy/fangirl over favourite hackers? And about taking photos (or asking to have them taken)?

The last thing Rusty talks about is the “Hacking in groups” tutorial. I really liked his and Robert Love’s tutorial in Canberra (Kernel Hacking – where you wrote a PCI driver for the excellent Love Rusty 3000. A device with real specifications, coffee cup stain and all). I’ve had a bit of a mixed feeling about it from Rusty since then, but I reckon it was seriously one of the best tutorials I have ever attended. I also took the hands-on approach as great inspiration for various MySQL Cluster Tutorials I’ve given since (and people have commented on how the hands-on part is great).

I guess the thing about the kernel hacking tute was that not everybody in the room was at the same skill level (which is something you totally run the risk of with hands-on). Also, if you hadn’t done the prep material, you were probably going to be in trouble.

But anyway, the idea of having 20 talented coders with 5 people in the tute for each of them and working on some project could be interesting – although rather ambitious. I worry that people without a good enough skillset would rock up and not get much out of it. Although those with adequate skill would do well.

Picking a project that could be doable in a handful of hours (or a day) is tricky – as it’d probably be an extension to some existing project, which requires learning it first. Or, starting something from scratch can be equally hard (to end up anywhere useful).

Some ideas for projects could include:

  • linux file system driver (perhaps read only) for a simple file system (mkfs provided)
  • MySQL table handler for some simple format (indexes get trickier… but maybe simple bitmapped index… or just an in memory table handler)
  • fsck for some file format/file system format

These have the benefit of being able to run existing good test suites against the software and see how well people did. They’d probably also help people land jobs :)

Another interesting one would be implementing a library for journaling writes to a file. i.e. instead of write to temp, sync, rename – do journaling.  This would let people easily write apps that did safe updates to large files. You could then use this to implement other things (like a really simple crash-safe storage engine, FUSE file system or something).
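Here’s a very rough sketch of what I mean, in Python for brevity. The record layout, the “.journal” suffix and the names journaled_write and recover are all made up for illustration, and it only handles one overwrite at a time. A real library would want checksums, several records per transaction and an fsync of the directory as well.

```python
import os
import struct

# Hypothetical record header: target offset (u64) and payload length (u32).
HEADER = struct.Struct("<QI")


def _journal_path(path):
    return path + ".journal"


def _apply(path, offset, data):
    # Overwrite bytes in place and force them to disk.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def journaled_write(path, offset, data):
    """Overwrite bytes in an existing file, logging the change first."""
    jpath = _journal_path(path)
    # 1. Record the intended change in the journal and force it to disk.
    with open(jpath, "wb") as j:
        j.write(HEADER.pack(offset, len(data)) + data)
        j.flush()
        os.fsync(j.fileno())
    # 2. Apply the change to the real file.
    _apply(path, offset, data)
    # 3. The change is on disk, so the journal is no longer needed.
    os.remove(jpath)


def recover(path):
    """Replay a leftover journal after a crash. True if a journal was found."""
    jpath = _journal_path(path)
    if not os.path.exists(jpath):
        return False
    with open(jpath, "rb") as j:
        raw = j.read()
    if len(raw) >= HEADER.size:
        offset, length = HEADER.unpack_from(raw)
        data = raw[HEADER.size:]
        # Only replay complete records. A short record means we crashed
        # while writing the journal, so the real file was never touched.
        if len(data) == length:
            _apply(path, offset, data)
    os.remove(jpath)
    return True
```

An app would call recover() once when it opens the file and journaled_write() for each update. Replaying the same record twice is harmless, as it just writes the same bytes again.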

I’m just not sure how many “cool tricks” could really happen in that time (instead of just getting the job done). 20 coders talking about their neat tricks would probably make a good book though…

Sound Volume

I like listening to music while I work. I also like notification sounds – such as gaim chiming when messages are received (so I look at them) and such things.

I use an iMic USB audio dongle to output sound to my headphones (partly because the connector on my laptop is a bit dodgy now) and I’ve detailed in the past how support for hotplugging of audio devices leaves a lot to be desired (it’s worse than it used to be sadly – I used to just be able to run esd against sound device and all was hunky dory).

What currently gets me is that music can be an adequate volume and then WHAM this loud gaim notification comes through.

Setting gaim to be softer and music to be louder isn’t immediately obvious and is easy to get wrong. It’d be great if the Volume Control applet could tweak it all from one place (and there was a way to change what the drop down volume applet controlled).

Doctor != Hacker

Thoughts on mandatory registration of IT professionals.

Having the argument for this and comparing to “we have it for doctors” doesn’t fly. If you start playing doctor on random people, you can kill them.

Writing code and whacking it up on the net can in no way directly cause harm to someone the same way as DIY heart surgery could.

Anybody who goes and grabs random code out in the wild and runs a system on it on which human life depends gets everything they deserve. They’re the bad guys here – not those writing and sharing code.

So how do you make sure this person constructing a system on which life depends is competent? The same way you do for everybody you hire – check their resume, talk to them, have appropriate checks and balances in place.

Just because somebody has a sheet of paper means nothing about their actual ability. Remember those crappy teachers from your school years? They all had teaching degrees. Remember how the university student tutor you had was a lot better than the teacher? Hrrm… that teaching degree obviously means a lot when it comes to ability then.
I certainly wouldn’t hire at least 80% of my past fellow undergrad students – even though they have the same sheet of paper as me.

Please, everybody go read The Daily WTF and see how much even experts with certifications can get it so, so, so wrong.

Saturn comes back around…

For certain evil purposes last week, I assembled the old Saturn with a hard disk I found when cleaning a little while ago (I have that kind of tech stuff – you clean up and find 40GB disks – I’m pretty sure I have an 8.4 bumming around somewhere too).

Saturn comes back around

I ended up being able to do the evil I needed to, but I could tell that the room was a bit warmer due to the extra box being alive. I was also lazy and couldn’t be bothered going downstairs for the D200, so this was shot with my old and trusty Coolpix 4500.

I used the box to be able to get remote access to a customer’s test setup to do some diagnosis on a bug (that’s notoriously hard to reproduce). I think I have a fair idea of what it is now though (timing related – not fun).

Remember kids, threads are evil.

Also, an interesting thing to note is that the select(2) system call has a limit not on how many fds you pass it, but on the numeric value of each fd (on my Ubuntu box here, passing an fd of, say, 2000 is probably going to lead to trouble). This has nothing to do with the previously mentioned bug, but it’s an interesting point.
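The reason is that an fd_set is a fixed-size bitmap indexed by descriptor number, with FD_SETSIZE bits (1024 with glibc on Linux). A quick way to see it is from Python, whose select module refuses out-of-range descriptors rather than passing them on. The helper name selectable is made up for this sketch, and the behaviour described is for Linux.

```python
import select


def selectable(fd):
    """Return True if select() will accept this descriptor number."""
    try:
        # Zero timeout: we only care whether the fd is accepted at all.
        select.select([fd], [], [], 0)
    except ValueError:
        # Raised by CPython when fd >= FD_SETSIZE (or fd is negative).
        return False
    except OSError:
        # In range for select(), but not an open descriptor (EBADF).
        return True
    return True
```

So selectable(2000) is False here even if fd 2000 happens to be a perfectly good open socket. poll(2) takes an array of descriptors instead of a bitmap, so it doesn’t have this problem.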

and the morning annoyance award goes to….

VMware. Honestly, why does it ask me about serial numbers every time I go and upgrade a kernel or the version of the free (as in beer) VM?

They also get an “annoyance award” for not listing Victoria as a state that could be in Australia on their web site. They do list other Australian states though (e.g. Western Australia and the Australian Capital Territory) yet not one of the most populous.
Or it should really go to Solaris. What a pain in the arse to get to the point of being able to compile $random_free_software_project. Look at Ubuntu/Debian: install system, apt-get build-dep $project, grab source, build. No fucking around with PATH or some strange application to do security updates (which I don’t know how on earth I figured out – I know that somebody else I work with hasn’t been able to easily find it). Why oh why is it so hard? Can’t there be an easy way? Please, somebody enlighten me!

WRT54GL client mode OpenWRT fun!

The wireless USB dongle I had running on my MythTV box had drivers that weren’t always reliable. I have recently totally decided that if I haven’t had time to debug them and fix the problems by now, I won’t in the near future.

Today a courier arrived with two of the Linksys WRT54GL for me. yay! My aim is to put OpenWRT on them and use them in client mode (one for me, one for mum) to get around unstable wireless drivers.

I just set mine up and it works! MythTV box now much more reliably on the network!

Although, I did hit one snag – the MAC address on the sticker on my unit was NOT the actual MAC address of the router. Really annoying when setting up MAC filtering. Grr….

(I really should set up better wireless security here)

Twinhan USB DTV dongle not working :(

So after doing some research (read: using search engines with linux + product name), I came to the conclusion that a Twinhan USB2.0 DVB dongle would be the dongle for me. Yes – it’s small, compact and does digital tv without requiring a non-existent free PCI slot in my Shuttle MythTV box.

I’d had great success with my last bit of new hardware (a really cheap Logitech QuickCam Express or something) – plug it in and it “just works”. Oh Linux how you are better than Microsoft Windows for hardware usability!

But this was not to be. It uses a vp7045 chipset, which has drivers both in Ubuntu 6.06 “Dapper” and in the latest v4l-dvb hg tree.

But for the life of me I couldn’t get it to tune into any TV stations (for those of you who like using hardware and not just having expensive boxes around, you will appreciate how tuning into a TV station is rather important functionality for a TV card). So I started having a look around the interweb for possible answers.

The best I could come up with was “are you sure you have all the cables plugged in” – yes, I was.

So seeing as this is the first digital TV dongle in this house, I wondered if the signal just wasn’t getting here. I got a friend to bring around a spare digital set top box. It worked fine. Brilliantly in fact – it even worked with the shitty small antenna that came with the dongle. So it wasn’t a reception problem.

I then came across this post to the linux-dvb list titled “New VP7045 with TDA10046 instead of MT352 (was: VP7045 tuner doesn’t work)”. Which really does hint at the problem!

I could be one of the lucky ones with a new revision that uses the TDA10046 instead of the MT352! (after getting some debug info from the card out of the driver – it was reporting itself as v1.02, so quite possible).

Maybe time to hack the dvb driver for it? Things seem pretty modular, so it couldn’t be too hard, right?

Well, the vp7045-fe.c file is the front end (well, what it assumes is the front end) for the vp7045.c dongle. So all I really need to do is to get it to use the tda10046 frontend (under frontends/tda1004x.c) instead of the vp7045-fe.c fe code.

Well, it seems as though the tda10046 is an i2c device while the vp7045-fe isn’t. Hrrm… I’ve never really done much with i2c, so this’ll be fun!

I’ve currently managed to hack the driver so that we do some things to do with the tda chip – although I haven’t got as far as detecting the i2c adapter – which means we’re never going to get a front end! (in fact, when you plug in the device with my modified driver you get a “no frontend detected” message from the kernel).

I’ve tried poking on the #linuxtv channel on freenode to no avail – so it seems like I’m on my own for a bit.

A good way to spend midnight until 3am though :)

I’ll probably end up doing the same tonight. Why? Because it’s just so much fun.

Oh, and if anybody has any pointers – it would be appreciated.

I am, of course, assuming the hardware itself isn’t faulty. I have no MS Windows system around to test on.

dosbox

I showed kit dosbox. She’s now playing alleycat (sorry, ALLEYCAT.EXE) on it and we’ve all forgotten that we were actually hungry.

Of course, I did have to play a bit of Hugo’s House of Horrors – sorry, HHH.EXE.

Oh old DOS games, how awesome you are.

Storing Passwords (securely) in MySQL

Frank talks about Storing Passwords in MySQL. He does, however, miss something that’s really, really important. I’m talking about the salting of passwords.

If I want to find out what the MD5s 5d41402abc4b2a76b9719d911017c592 or 015f28b9df1bdd36427dd976fb73b29d mean, the first thing I’m going to try is a dictionary attack (especially if I’ve seen a table with only user and password columns). Guess what? A list of words and their MD5 sums can be used to very quickly find what these hashes represent.

I’ll probably have this dictionary in a MySQL database with an index as well. Try it yourself – you’ll probably find a dictionary with the words “hello” and “fire” in it to help. In fact, do this:

mysql> create table words (word varchar(100));
Query OK, 0 rows affected (0.13 sec)

mysql> load data local infile '/usr/share/dict/words' into table words;
Query OK, 98326 rows affected (0.85 sec)
Records: 98326  Deleted: 0  Skipped: 0  Warnings: 0

mysql> alter table words add column md5hash char(32);
Query OK, 98326 rows affected (0.39 sec)
Records: 98326  Duplicates: 0  Warnings: 0

mysql> update words set md5hash=md5(word);
Query OK, 98326 rows affected (3.19 sec)
Rows matched: 98326  Changed: 98326  Warnings: 0

mysql> alter table words add index md5_idx (md5hash);
Query OK, 98326 rows affected (2.86 sec)
Records: 98326  Duplicates: 0  Warnings: 0

mysql> select * from words where md5hash='5d41402abc4b2a76b9719d911017c592';
+-------+----------------------------------+
| word  | md5hash                          |
+-------+----------------------------------+
| hello | 5d41402abc4b2a76b9719d911017c592 |
+-------+----------------------------------+
1 row in set (0.11 sec)

mysql> select * from words where md5hash='015f28b9df1bdd36427dd976fb73b29d';
+------+----------------------------------+
| word | md5hash                          |
+------+----------------------------------+
| fire | 015f28b9df1bdd36427dd976fb73b29d |
+------+----------------------------------+
1 row in set (0.00 sec)
$EXCLAMATION I hear you go.

Yes, this is not a good way to “secure” passwords. Oddly enough, people have known about this for a long time and there’s a real easy solution. It’s called salting.

Salting is prepending a random string to the password before hashing it, both when you store it and when you check it. The salt is stored alongside the hash, and each user gets a different one.
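On the application side, that might look something like the sketch below. It’s in Python, the function names are made up for illustration, and the 16 character salt length is an arbitrary choice. MD5 is used only to match the examples in this post.

```python
import hashlib
import secrets


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def make_record(password):
    """Return the (salt, hash) pair to store for a new password."""
    salt = secrets.token_hex(8)  # 16 random hex characters
    return salt, md5_hex(salt + password)


def check_password(password, salt, stored_hash):
    """Recompute the salted hash and compare it with the stored one."""
    return secrets.compare_digest(md5_hex(salt + password), stored_hash)
```

The salt isn’t secret. Its job is just to make the stored hash differ from the plain md5 of the password, and to differ from user to user.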

So, let’s look at how our new password table may look:

mysql> select * from passwords;
+------+--------+----------------------------------+
| user | salt   | md5pass                          |
+------+--------+----------------------------------+
| u1   | ntuk24 | ce6ac665c753714cb3df2aa525943a12 |
| u2   | drc,3  | 7f573abbb9e086ccc4a85d8b66731ac8 |
+------+--------+----------------------------------+
2 rows in set (0.00 sec)
As you can see, the MD5s are different than before. If we look these up in our dictionary, we won’t find a match.

mysql> select * from words where md5hash='ce6ac665c753714cb3df2aa525943a12';
Empty set (0.01 sec)

Instead, we’d have to get the salt, do an md5 of the salt and each dictionary word, and see if the md5 matches. Guess what, no index for that! And with all the possible values for salt, we’ve substantially increased the problem space to construct a dictionary (I won’t go into the maths here).

mysql> create view v as select word, md5(CONCAT('ntuk24',word)) as salted from words;
Query OK, 0 rows affected (0.05 sec)

mysql> select * from v where salted='ce6ac665c753714cb3df2aa525943a12';
+-------+----------------------------------+
| word  | salted                           |
+-------+----------------------------------+
| hello | ce6ac665c753714cb3df2aa525943a12 |
+-------+----------------------------------+
1 row in set (2.04 sec)

mysql> create or replace view v as select word, md5(CONCAT('drc,3',word)) as salted from words;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from v where salted='7f573abbb9e086ccc4a85d8b66731ac8';
+------+----------------------------------+
| word | salted                           |
+------+----------------------------------+
| fire | 7f573abbb9e086ccc4a85d8b66731ac8 |
+------+----------------------------------+
1 row in set (2.12 sec)

So we’ve gone from essentially instantaneous retrieval to now taking about 2 seconds. Even if I assume that one of your users is going to be stupid enough to have a dictionary password, it’s going to take me 2 seconds to check each user – as the salt is different for each user! So it could take me hours just to find that user. Think about how many users are in your user table – with 1000 users, it’s over 1/2hr. For larger systems, it’s going to be hours.
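To put rough numbers on that, here’s the arithmetic as a few lines of Python. The 2 seconds per user is the timing measured above; the function name and the 100,000 user figure are just for illustration.

```python
def scan_minutes(users, seconds_per_user=2.0):
    """Minutes needed to try the whole dictionary against every user."""
    return users * seconds_per_user / 60


print(round(scan_minutes(1000), 1))         # 33.3 minutes
print(round(scan_minutes(100000) / 60, 1))  # 55.6 hours
```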