Tightly packing onodes

The current problem is that an onode, however much we can pack forks into a block, still takes up a minimum of one disk block. A disk block typically being 4kb, and a tendancy to want to be bigger (think large media files), and also being the unit of atomicity with disk writes.

So, how do we allow multiple onodes per block?
We could take the inode table way of doing things, and just have “an onode is X bytes, and X/block size = Y many onodes per block”, but that does have a lot to be desired – we may want variable sized forks to be stored along side onodes (e.g. what would typically go in an inode).

One way is to split each block into N “sub blocks” or “chunks” (or “insert-cool-name-here”). Basically have a block bitmap with N bits per block, and a chunk size of block size / N. This would allow us to have N onodes per block. Simple to implement (we wouldn’t even have to change the onode_index, as we could do a simple linear search for the onode, even storing them in onode_num order, enabling a binary search). But, if we had a 256kb block size, this means 32k per onode, no matter what. Annoying if the volume has both large media files (where the 256k block size helps), and small files (a unix like operating system or even a Maildir). Volumes are now big, and users like having one big file system on them – so we must be more flexible.

Do we want to (can we?) provide onodes which span blocks? My feeling is no to the latter – as in within the onode struct itself. Having an onode which has forks in other blocks seems like a quite reasonable (and indeed, needed) thing to do. So, we could have lots of onodes tightly packed into a disk block, with the forks being in other blocks. Typically though, you probably want at least one fork packed with the onode, as there aren’t many operations on the onode itself.

So, if onodes (along with some forks) can be a variable size, and we want to pack these into (quite possibly) large blocks, how are we going to do it?

I reckon we can pack them all into one block, with padding where needed (to have it so that no onode crosses atomicity borders) and rely on packing things in a block in a cache friendly manner.