Cold Brew Coffee Experiment #3

So, I did what I said I’d do – 1/3rd cup ground coffee + 2 cups water, overnight (about 18-20hrs) in the fridge. Tasted really nice.

I think I’ve found the combination that works when I’m just making one.

Filtering out the grounds is certainly annoying… I guess this is what the Toddy is meant to solve…. because what I most certainly need is yet another way to make coffee.

Cold Brew Coffee Experiment #2

Two things varied from my previous attempt:

  1. I kept it in the fridge instead of on the bench (it’s been really hot in Melbourne and the fridge is closer to “normal” temperature than outside)
  2. doubled the amount of water

The difference? It’s certainly cooler, and the flavour is weaker. That being said, I think you get more of the subtleties of the flavour rather than being a bit thwacked with it. I found my last experiment nice, but perhaps a bit sharp. This batch tastes very smooth and refreshing.

This batch also seemed to be less gritty, with fewer grounds ending up in the filter. I wonder if that’s due to the fridge, variance in the grind, or the water quantity.

I think for my next go I’m going to try the same amount of coffee, in the fridge, but with 2 cups of water.

Cold Brew Coffee Experiment #1

1/3rd cup ground coffee (ground approximately as I’d grind it for a plunger) to 1.5 cups water. Sit in an airtight container on the bench overnight (about 18hrs actually) and then filter through coffee filter paper into a mug.

It certainly is different, but still distinctly coffee. I’m not sure if I want to add more water to it or not… quite possibly – and this mixture would probably be a total killer for iced coffee.

Basically, I’ve heard a million and one “one true ways” to make cold brew coffee, and this was my first go at it. Now just to try different variations until I find the one I like the most.

Puppet snippet for setting up a machine to build Drizzle

You could use this in a Vagrant setup if you like (I’ve done so for testing).

Step 1) Set the following in your Vagrantfile:

Vagrant::Config.run do |config|
  config.vm.box = "lucid32"
  config.vm.box_url = "http://files.vagrantup.com/lucid32.box"
  config.vm.provision :puppet
end

Step 2) Get the puppet-apt helper.

I used https://github.com/evolvingweb/puppet-apt and put it in a manifests/ directory like so:

$ mkdir manifests
$ cd manifests
$ git clone git://github.com/evolvingweb/puppet-apt.git

Step 3) Write your puppet manifest:

import "puppet-apt/manifests/init.pp"
import "puppet-apt/manifests/ppa.pp"
class drizzlebuild {
        apt::ppa { "ppa:drizzle-developers/ppa": }
        package { "drizzle-dev":
                  ensure => latest,
        }
}
include drizzlebuild
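
For reference, the layout ends up looking something like this (if I remember right, Vagrant’s puppet provisioner looks in manifests/ by default and uses default.pp as the manifest name – both are configurable – so that’s where the drizzlebuild manifest above goes):

.
|-- Vagrantfile
`-- manifests/
    |-- default.pp      # the drizzlebuild manifest from Step 3
    `-- puppet-apt/     # cloned in Step 2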

Step 4) “vagrant up” and you’re done! Feel free to build Drizzle inside this VM.
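
Building Drizzle inside the VM is then something like the following (a rough sketch only – drizzle-dev should have pulled in most of the build dependencies, and I’m assuming bzr is there for grabbing the source):

$ vagrant ssh
$ sudo apt-get install -y bzr      # if it isn't already installed
$ bzr branch lp:drizzle
$ cd drizzle
$ ./config/autorun.sh              # bootstrap the autotools bits
$ ./configure
$ make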

I’m sure there may be some more proper way to do it all, but that was a pretty neat first intro to Puppet and friends for me :)

Puppet + Vagrant + Jenkins = automated bliss

I’m currently teaching myself how to do Puppet. Why? Well, at Percona we support a bunch of platforms for our software. This means we have to maintain a bunch of Jenkins slaves to build the software on. We want to be able to add new machines easily, and (up until now) we’ve maintained a magic “apt-get install” command line in the Jenkins EC2 configuration. This isn’t an ideal situation and there’s been talk of getting Puppet to do the heavy lifting for a while.

So I sat down to do it.

Step 1: take the “apt-get install” line and convert it into puppet speak.

This was pretty easy. I started off with Vagrant starting an Ubuntu Lucid 32 VM (just like in the Vagrant getting started guide) and enabled the “provision using Puppet” bit.

Step 2: find out you need to run “apt-get update”

There have been updates since the base VM I’m using was made, so I needed to make any package installation depend on running “apt-get update” – both to ensure I was installing the latest version and so that the repositories would have the files I was looking for.

This was pretty easy (once I knew how):

exec {"apt-update":
       command => "/usr/bin/apt-get update",
}
Exec["apt-update"] -> Package <| |>

This simply does two things: it defines an exec resource that runs “apt-get update”, and it declares that every package install depends on the “apt-update” exec having run first.

I’ve also needed things such as:

case $operatingsystem {
  debian, ubuntu: { $libaiodev = "libaio-dev" }
  centos, redhat: { $libaiodev = "libaio-devel" }
  default: { fail("Unrecognised OS for libaio-dev") }
}
package { "libaio-dev":
  name   => $libaiodev,
  ensure => latest,
}

The idea being that when I go and test all this stuff running on CentOS, it should mostly “just work” there too.

The next step? Setting up and running the Jenkins slave.
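
That bit isn’t written yet, but roughly speaking the manifest needs to automate something like the following (a sketch only – jenkins.example.com and the node name are made up here):

$ sudo apt-get install -y openjdk-6-jre-headless
$ wget http://jenkins.example.com:8080/jnlpJars/slave.jar
$ java -jar slave.jar -jnlpUrl http://jenkins.example.com:8080/computer/lucid32-builder/slave-agent.jnlp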

Without notmuch, I would simply delete your email

I have been using notmuch (http://notmuchmail.org/) as my email client for quite a while now. It’s fast. I don’t mean that everything happens instantly (some actions do take a bit longer than ideally they would), but with the quantity of mail I (and others) throw at it? Beats everything else I’ve ever tried.

I keep seeing people complain about not being able to keep up with various email loads and I am convinced it is because their mail client sucks.

I hear people go on about mutt…. well, I stopped using mutt when it would take two minutes to open some of my mail folders.

I hear some people talk about Evolution…. well, I stopped using Evolution when I realized that when it was rebuilding its index, I couldn’t use my mail client for at least twenty minutes.

Gmail…. well, maybe. Except that I don’t want all my mail sitting on Google servers, I want to be able to work disconnected, and the amount of time it would take to upload my existing mail makes it a non-starter (especially from the arse end of the internet – Australia). I also do not want email on my phone.

My current problem with notmuch? It just uses Maildir…. and this isn’t the most efficient format for mail that never changes; some kind of compressed archive format would be great. Indeed, I started looking into this ages ago, but just haven’t had the spare cycles to complete it (and getting SSDs everywhere has not helped with the motivation).

Coming back from vacation, my mailbox had about 4,700 messages sitting in it. I’ve been able to get through just about all of them without blindly deleting mail. This is largely due to notmuch’s great UI for quickly looking at threads, marking them as read and moving straight on to the next message. I can tag mail for action, I can very quickly search for email on an urgent topic (and find it) and generally get on with the business of getting things done rather than using an email program.
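
To give an idea of what that looks like underneath (a few illustrative notmuch command line examples – the tags and addresses are made up):

$ notmuch new                                # index whatever mail has arrived
$ notmuch search tag:unread subject:urgent   # terms are ANDed together
$ notmuch tag +action -unread -- from:boss@example.com subject:urgent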

HOWTO fix: bzr join error of “Trees have the same root”

From https://answers.launchpad.net/bzr/+question/71563

You can do it from within Python like this:

>>> import bzrlib.workingtree
>>> bzrlib.workingtree.WorkingTree.open("subdir2").set_root_id("tree_root_subdir2")

Hopefully I can find this easily in the future (I’ve had to use it before).

Friendly exploits

If you happen to be friends with me on Facebook you will have seen a bunch of rather strange updates from me last night. This all started with a tweet (that was also sent to Facebook) by a friend who joked about doing something with the <MARQUEE> tag (see http://www.angelfire.com/super/badwebs/ for an example of it and similar things). I saw the joke, as I was reading it through Gwibber or the Facebook website. However…. Leah saw text scrolling over the screen… just like the <MARQUEE> tag actually did.

She was looking at it on her iPad using an app called Friendly.

So I immediately posted a status update: “<script lang="javascript">alert("pwned");</script>”. This is a nice standard little test to see if you’ve managed to inject code into a web site. If this pops up a dialog box, you’ve made it.

It didn’t work. It didn’t display anything… as if it was just not running the script tag. Disappointing. I soooo wanted it to break here.

I did manage to do all sorts of other things in the Live Feed view though. I could use just about any other HTML tag… including forms. I couldn’t get an HTTP request to my server out of an HTML form in the Live Feed view… but we did once manage to crash Friendly (badly enough that it had to be force quit on the iPad).

I posted a photo of me holding up the iPad to my laptop web cam to show off the basics:

And then one of what happened when I tried an HTML form (this wasn’t reproducible though… so kind of disappointing):

What we did notice, however, was that HTML tags were parsed in comments on images too…. which made me wonder… It’s pretty easy to make an HTML form button that will do something… so I posted the same image again with a button that would say “Next” but would take you to a web page on one of my servers instead. It worked! I got an HTTP request! Neat! I could then present an HTML page that looked legit and do the standard things that one does to steal from you.

But I wondered if scripts would work…. so I posted:

Photos are proving more exploitable.... <script lang="javascript">alert("pwned");</script>

and then clicked on the image on the iPad……

Gotcha!

I could from here do anything I wanted.

Next… I should probably report this to the developers…. or steal from my friends and make them post things to Facebook implying improper relationships and general things that would get you fired.

I went with the former… but the latter would have been fairly easy as the Facebook page for the app nicely tells me which of my friends use it. I could even target my attack!

So I sent a warning message to friends (the 18 of them who use the Friendly app), sent a “contact the developer” message to the developers, sent out a warning on Twitter and went to bed.

Got an email overnight back from the developer: “We just pushed a server update that solves this issue.”

Now… in my tcpdump while trying some of the earlier things I was just seeing HTTPS requests to Facebook API servers from the iPad, but I don’t think I looked too closely at images. I have no idea if they’ve actually fixed the holes and I don’t have an iPad to test it on. If you do, go try it.
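
(If you want to poke at it yourself, the traffic watching needs nothing fancier than something along these lines – the interface name and the iPad’s address will obviously vary:)

$ sudo tcpdump -n -i wlan0 host 192.168.1.42 and port 443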

HailDB: A NoSQL API Direct to InnoDB

At the MySQL Conference and Expo last week I gave a session on HailDB. I’ve got the slides up on slideshare so you can either flick through them online or download them. I think the session went well, and there certainly is some interest in HailDB out there (which is great!).

InnoDB and memcached

I had a quick look at the source tree for the memcached interface to InnoDB that’s up as a tarball on labs.mysql.com (I haven’t compiled it, just read the source – that’s what I do. I challenge any C/C++ compiler to keep up with my brain!). A few quick thoughts:

  • Where’s the Bazaar tree on launchpad? I hate pulling tarballs, following the dev tree is much more interesting from a tech perspective (especially for early development releases). I note that the NDB memcached stuff is up on launchpad now, so yay there. I would love it if the InnoDB team in general was much more open with development, especially with having source trees up on launchpad.
  • It embeds a copy of the memcached server engines branch into the MySQL tree. This is probably the correct way to go. There is no real sense in re-implementing the protocol and network stack (this is about half what memcached is anyway).
  • The copy of the memcached engine branch seems to be a few months old.
  • The current documentation appears to be the source code.
  • The innodb_memcached plugin embeds a memcached server using an API to InnoDB inside the MySQL server process (basically so it can access the same instance of InnoDB as a running MySQL server).
  • There’s a bit of something that kind-of looks similar to the Embedded InnoDB (now HailDB) API being used to link InnoDB and memcached together. I can understand why they didn’t go through the MySQL handler interface… getting that correct would be bracing, to say the least. Going via InnoDB APIs is much more likely to mean fewer bugs.
  • If this accepted JSON and spat it back out… how fast would MongoDB die? weeks? months?
  • The above dot point would be a lot more likely if adding a column to an InnoDB table didn’t involve epic amounts of IO.
  • I’ve been wanting a good memcached protocol inside Drizzle; we have, of course, focused on the stability of what we do have first. That being said…. upgrade my flight home so I can open a laptop and it’d probably be done well before I land….. (assuming I don’t get to it among the 15 other awesome things I want to hack on this week)

They took my Kodachrome away

It’s done. It’s gone. You can still find rolls on eBay and the like, but you’re not going to get them developed as colour slides. Dwayne’s Photo stopped accepting Kodachrome on December 30th 2010. I have a set on flickr of my Kodachrome shots. The scans do not do them justice. They look way better projected. I will also talk about the demise of Kodachrome without mentioning the Paul Simon song. oh, wait. fuck.

I’ve now gotten all my Kodachrome back. The last package arrived while I was away at linux.conf.au 2011 in Brisbane.

It is certainly the end of an era, and the last of my shots will go up on flickr.

OpenOffice.org is the most frustrating piece of software I use

No, really.

I have recently been constructing a 100 page document going over a whole bunch of the details for the Monorail we’re building at Burning Man this year.

Apart from randomly freezing, and then suddenly not displaying images until I had restarted it, it’s also really slow.

The last straw came when I was leafing through the document before getting it printed. I had inserted a bunch of pages before this last section, and now there was an empty page in the last section of the document – the part that I hadn’t touched for days. If I tried to remove the blank page, all the images on nearby pages moved so that they were on top of each other.

I ended up just printing it. There is a blank page that I can’t get rid of.

It is a piece of software that worries me. Is this really meant to be an alternative? It has NEVER worked well for me. Basic tasks, sure, but I continually find myself pining for Word 5.1a on the Mac (on System 7, that is), or Nisus Writer, or even ClarisWorks.

If opening Microsoft Word documents fairly accurately is your only good feature, how do you expect to survive in the free (software) world?

So, while my Twitter stream may suggest desires for punching the developers in the face or their early demise through painful methods….. I really just wish that sometime in the past 10 years you had made it not shit me to tears.

Certainly another failure of Sun Microsystems and I don’t expect Oracle to do any better at all (especially considering recent actions).

ENUM now works properly (in Drizzle)

Over at the Drizzle blog, the recent 2010-06-07 tarball was announced. This tarball release has my fixes for the ENUM type, so that it now works as it should. I was quite amazed that such a small block of code could have so many bugs! One of the most interesting was the documented limit we inherited from MySQL (see the MySQL Docs on ENUM) of a maximum of 65,535 elements for an ENUM column.

This all started out from a quite innocent comment of Jay’s in a code review for adding support for the ENUM data type to the embedded_innodb engine. It was all pretty innocent… saying that I should use a constant instead of the magic 0x10000 number as a limit on an assert for sanity of values getting passed to the engine. Seeing as there wasn’t a constant already in the code for that (surprise number 1), I said I’d fix it properly in a separate patch (creating a bug for it so it wouldn’t get lost) and the code went in.

So, now, a few weeks after that, I got around to dealing with that bug (because hey, this was going to be an easy fix that’ll give me a nice sense of accomplishment). A quick look in the Field_enum code raised my suspicions of bugs… I initially wondered if we’d get any error message if a StorageEngine returned a table definition that had too many ENUM elements (for example, 70,000). So, I added a table to the tableprototester plugin (a simple dummy engine that is loaded for testing the parsing of specially constructed table messages) that had 70,000 elements for a single ENUM column. It didn’t throw an error. Darn. It did, however, have an incredibly large result for SHOW CREATE TABLE.

Often with bugs like this I may try to see if the problem is something inherited from MySQL. I’ll often file a bug with MySQL as well if that’s the case. If I can, I’ll sometimes attach the associated patch from Drizzle that fixes the bug, sometimes with a patch directly for and tested on MySQL (if it’s not going to take me too long). Whether these patches are ever applied is a whole other thing – and sometimes you get things like “each engine is meant to have auto_increment behave differently!” – which doesn’t inspire confidence.

But anyway, the MySQL limit is somewhere between 10850 and 10900. This is not at all what’s documented. I’ve filed the appropriate bug (Bug #54194) with a reproducible test case and the bit of problematic code. It turns out that this is (yet another) limit of the FRM file. The limit is “about 64k FRM”. The bit of code in MySQL that was doing the checking for the ENUM limit was this:


/* Hack to avoid bugs with small static rows in MySQL */
  reclength=max(file->min_record_length(table_options),reclength);
  if (info_length+(ulong) create_fields.elements*FCOMP+288+
      n_length+int_length+com_length > 65535L || int_count > 255)
  {
    my_message(ER_TOO_MANY_FIELDS, ER(ER_TOO_MANY_FIELDS), MYF(0));
    DBUG_RETURN(1);
  }

So it’s no surprise to anyone how this specific limit (the number of elements in an ENUM) got missed when I converted Drizzle from using an FRM over to a protobuf-based structure.

So, a bunch of other cleanup and a whole lot of extra testing later, I can pretty confidently state that the ENUM type in Drizzle does work exactly how you think it would.

Either way, if you’re getting anywhere near 10,000 choices for an ENUM column you have no doubt already lost.