In what I laughingly call “spare time” I started hacking on ha_file.cc, otherwise known as the FILE storage engine. My idea is relatively simple: I want to be able to store and access my photos from MySQL. I also want the storage to be relatively efficient and have the raw image files on disk, not tied up too much in any different format (my file system is pretty good at storing multi-megabyte files, thank you very much) – it also doesn’t require any fancy things to re-use space when I delete things. I should also be able to (efficiently) serve the images directly out of a web server (satisfying the efficiency itch). You could also use something like DMF to migrate old rows off to tape.
So, I started some hacking and some designing and have a working design and a nearly basically working read/write implementation. I’ll share the code when it does, in fact, actually work (by “work” I mean it reads and writes basic rows).
I’ve decided to go for the approach of storing columns in extended attributes. Why columns? ’cause then you can access them either from the command line or programmatically through another interface. It also adds an extra layer of evil. With XFS and sufficiently large inodes, these should all fit in the inode anyway. ext3 also has some nice optimisations that should help with performance too.
For blob data, I plan to just store that in the file. In my photos table example, you could then just run an image browser (e.g. gthumb) on the data directory for the table and see your images. It also means that recovery programs (see my jpeg_recover.c) will work as well.
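To make the layout from the last two paragraphs concrete, here is a rough sketch in Python (not the engine's C++, and not the real ha_file.cc code). It assumes Linux, a file system with user extended attributes enabled, and a made-up `user.col.` prefix for column names; `write_row` and `read_row` are hypothetical helpers for illustration only.

```python
import os

# Hypothetical naming scheme for column attributes; not taken from ha_file.cc.
XATTR_PREFIX = "user.col."


def write_row(path, blob, columns):
    """Store the blob as the file body and every other column as an xattr."""
    with open(path, "wb") as f:
        f.write(blob)
    for name, value in columns.items():
        # xattr values are raw bytes, so encode each column value as text.
        os.setxattr(path, XATTR_PREFIX + name, str(value).encode("utf-8"))


def read_row(path):
    """Return (blob, columns) for a row file written by write_row."""
    with open(path, "rb") as f:
        blob = f.read()
    columns = {}
    for attr in os.listxattr(path):
        if attr.startswith(XATTR_PREFIX):
            value = os.getxattr(path, attr).decode("utf-8")
            columns[attr[len(XATTR_PREFIX):]] = value
    return blob, columns
```

With a layout like this, `getfattr -d` on a row file would show the non-blob columns from the command line, while the file body stays a plain image that any viewer can open.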
Knowing the primary key of the row (which I plan to use as the file name for the row) then allows us to generate URLs that could be directly served by a lightweight HTTP server, avoiding all that database code when you’re just serving up an image to a client.
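A minimal sketch of that mapping, assuming the web server exposes the data directory as one sub-path per table and that the row's file name is the primary key verbatim. The base URL, the table name and the `row_url` helper are all invented for the example.

```python
from urllib.parse import quote


def row_url(base_url, table, primary_key):
    """Build the URL of a row file from its table name and primary key.

    Assumes a hypothetical <base_url>/<table>/<primary key> layout that
    mirrors the table's data directory on disk.
    """
    return "{}/{}/{}".format(
        base_url.rstrip("/"),
        quote(table, safe=""),             # percent-encode anything URL-unsafe
        quote(str(primary_key), safe=""),
    )
```

For example, `row_url("http://photos.example.com/data/", "photos", 42)` gives `http://photos.example.com/data/photos/42`, which the HTTP server can answer straight from disk.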
Symbolic links can be used to have indexes.
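One way this could look, purely as an assumption on my part: a directory per index, a sub-directory per indexed value, and inside it one symlink per matching row, named after the primary key and pointing back at the row file. The `idx_` directory name and both helpers are hypothetical.

```python
import os


def index_row(table_dir, column, value, primary_key):
    """Add a symlink <table_dir>/idx_<column>/<value>/<pk> -> ../../<pk>."""
    bucket = os.path.join(table_dir, "idx_" + column, value)
    os.makedirs(bucket, exist_ok=True)
    # Relative target, so the table directory can be moved as a whole.
    os.symlink(os.path.join("..", "..", primary_key),
               os.path.join(bucket, primary_key))


def lookup(table_dir, column, value):
    """Return the primary keys of existing rows whose column equals value."""
    bucket = os.path.join(table_dir, "idx_" + column, value)
    if not os.path.isdir(bucket):
        return []
    # os.path.exists follows symlinks, so entries whose row file has gone
    # away (dangling links) are skipped at lookup time.
    return sorted(name for name in os.listdir(bucket)
                  if os.path.exists(os.path.join(bucket, name)))
```

Skipping dangling links in `lookup` is one cheap runtime check that keeps a stale index entry from ever returning a row that no longer exists.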
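We can write new rows to a temp directory, sync them, then move them into place. Since a rename within one file system is atomic, a crash leaves either the complete row or no row at all in the table directory: zero time crash recovery. Index consistency can be handled at runtime with a small extra check.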
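The write path could be sketched like this, assuming the temp directory lives on the same file system as the table directory (otherwise the rename is not atomic). The `.tmp` name and the helper functions are made up for the example.

```python
import os

TMP_DIR = ".tmp"  # hypothetical scratch directory inside the table directory


def write_row_atomically(table_dir, primary_key, blob):
    """Write a row to the temp directory, sync it, then rename it into place."""
    tmp_dir = os.path.join(table_dir, TMP_DIR)
    os.makedirs(tmp_dir, exist_ok=True)
    tmp_path = os.path.join(tmp_dir, primary_key)
    final_path = os.path.join(table_dir, primary_key)

    with open(tmp_path, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())          # row data is on disk before it is visible

    os.rename(tmp_path, final_path)   # atomic on POSIX within one file system

    # Sync the directory too, so the rename itself survives a crash.
    dir_fd = os.open(table_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path


def discard_unfinished_rows(table_dir):
    """Remove half-written rows left in the temp directory by a crash."""
    tmp_dir = os.path.join(table_dir, TMP_DIR)
    if os.path.isdir(tmp_dir):
        for name in os.listdir(tmp_dir):
            os.remove(os.path.join(tmp_dir, name))
```

Readers only ever look at the table directory, so anything still sitting in the temp directory after a crash is simply garbage that can be cleaned up lazily; nothing has to be replayed before the table is usable again.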
At some point I should write down how I plan to do isolation levels too, but that’s for another day.
I at least hope that the resulting code may be a useful example for people wanting to implement a storage engine.
A simple implementation should be fairly fast too (with a slightly tuned file system).