variety in allocation block sizes?

Some studies have shown that for multimedia applications, a larger block size (e.g. 256KB blocks) improves throughput. For large media files the waste is insignificant: internal fragmentation averages half a block per file, so 128KB wasted against files of several megabytes to many hundreds of MB, or indeed GB. But for smaller files (typically configuration files or small system binaries) a smaller block size saves far more space: a 100-byte file in a 4KB block wastes under 4KB, rather than the better part of 256KB.

ReiserFS goes for something slightly different, allowing several files to share a single block. I’m unsure whether the extra effort involved in implementing this is worth it, or whether the slower access times that Reiser reports for shared blocks are an acceptable trade-off.

Maybe different allocation groups could have different block sizes? Maybe there could be some kind of block-size migration system? Or would the overhead not be worth it? Could it be one of those “maintenance” tasks that you run every month/year? How often does the average usage of a disk change enough that we’d need something like this?

block allocation

B+Trees sorted by size and location (a la XFS) provide:
– the ability to allocate large/small objects efficiently (the size B+Tree)
– the ability to allocate blocks near existing objects (e.g. for object expansion) by using the location B+Tree (rough sketch below)
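Something like this, roughly. It’s a toy sketch with sorted arrays standing in for the two B+Trees; the names (free_extent, alloc_by_size, alloc_near) are mine, not XFS’s:

#include <stdint.h>
#include <stdlib.h>

struct free_extent {
    uint64_t start;    /* first free block */
    uint64_t len;      /* number of free blocks */
};

struct free_space {
    struct free_extent *by_size;  /* sorted ascending by len (the "size" tree) */
    struct free_extent *by_loc;   /* sorted ascending by start (the "location" tree) */
    size_t nr;
};

/* Best fit: the smallest extent with len >= want. */
static struct free_extent *alloc_by_size(struct free_space *fs, uint64_t want)
{
    size_t lo = 0, hi = fs->nr;
    while (lo < hi) {                   /* lower bound on length */
        size_t mid = lo + (hi - lo) / 2;
        if (fs->by_size[mid].len < want)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < fs->nr ? &fs->by_size[lo] : NULL;
}

/* Nearest fit: the extent starting closest to hint, e.g. the block just
   past the end of the file being expanded. */
static struct free_extent *alloc_near(struct free_space *fs, uint64_t hint)
{
    size_t lo = 0, hi = fs->nr;
    while (lo < hi) {                   /* lower bound on start */
        size_t mid = lo + (hi - lo) / 2;
        if (fs->by_loc[mid].start < hint)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (fs->nr == 0)
        return NULL;
    if (lo == 0)
        return &fs->by_loc[0];
    if (lo == fs->nr)
        return &fs->by_loc[fs->nr - 1];
    /* pick the closer of the two neighbouring extents */
    return (hint - fs->by_loc[lo - 1].start <= fs->by_loc[lo].start - hint)
        ? &fs->by_loc[lo - 1] : &fs->by_loc[lo];
}

The real thing also has to update both indexes on every allocate/free and merge adjacent extents, but the sketch shows why having both sort orders around is handy.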

B+Trees are good, therefore use them.

Split up into allocation groups (a la XFS and BFS). This allows more parallel operation, as different threads can work on (and update) different allocation groups independently.
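Roughly what I have in mind (a sketch assuming pthreads; struct free_space is from the sketch above, and the names are again invented):

#include <stdint.h>
#include <pthread.h>

struct free_space;               /* the size/location indexes sketched above */

struct alloc_group {
    pthread_mutex_t lock;        /* guards only this AG's free-space trees */
    struct free_space *freesp;
};

struct volume {
    struct alloc_group *ag;
    unsigned nr_ags;
};

/* Spread new files across AGs; expansion allocations then stay within the
   file's own AG, so threads in different AGs never touch the same lock. */
static struct alloc_group *ag_for_new_file(struct volume *v, uint64_t ino)
{
    return &v->ag[ino % v->nr_ags];
}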

Idea for fine-grained B+Tree locking:
– each node gets an ID (which should be pretty unique… but uniqueness is not essential to correct operation).
– when a process needs to update a node, it places a lock on the node’s ID.
– when a process accesses a node, it checks before reading whether a lock has been placed on that ID. If so, we spin, waiting for the lock to be released.
– when a process has finished updating a node, it releases the lock on its ID.
– the process waiting to read will then stop spinning and be able to read the new copy.

If IDs overlap, there is no correctness problem; we’ll just occasionally spin while a different node (with a colliding ID) is being updated.

However, if we are updating a whole subtree of nodes, some careful locking will have to be employed, from the BOTTOM UP! That is, update from the bottom up, locking each individual node as we go. If we only lock the root of the subtree being updated, another process could already have passed it on its way down and screw us up royally.
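A minimal sketch of the locking scheme using C11 atomics (the slot count and all names are my own invention; a real version would want backoff rather than a bare spin):

#include <stdatomic.h>
#include <stdint.h>

#define LOCK_SLOTS 256          /* node IDs hash onto this many spinlocks */

static atomic_uint node_locks[LOCK_SLOTS];  /* 0 = unlocked, 1 = locked */

/* Colliding IDs share a slot; as noted above, that only costs spurious
   spinning, never correctness. */
static void node_lock(uint64_t id)
{
    atomic_uint *l = &node_locks[id % LOCK_SLOTS];
    while (atomic_exchange_explicit(l, 1, memory_order_acquire))
        ;                       /* spin until the updater releases */
}

static void node_unlock(uint64_t id)
{
    atomic_store_explicit(&node_locks[id % LOCK_SLOTS], 0,
                          memory_order_release);
}

/* For a subtree update: take locks from the leaves upward, so a reader
   descending from the root always hits a lock before it can see a
   half-updated node. */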

should be doing assignments

Yes, I should be doing all these uni assignments I’ve got. But seriously, fuck the coursework. I much prefer the project. Solving a tricky problem with some research potential. Now *that’s* fun!

What’s more, there’s not only the opportunity to work on an exciting OS (Linux) but also for something that’s even newer and weirder (Walnut). Although there’s no way in hell I’m touching Walnut code for a while yet. Not until it’s rewritten remotely sanely.

more notes from the board

Multivolume disks?
– UID object migration in a distributed environment
– have a distributed volume (e.g. lab scenario: all lab machines have the same ‘volume’ mounted, kept in sync with a primary machine)

Snapshots for all versions?
– each version would require >= 1 extra block, since even a tiny change copies the whole block;
i.e. 10 modifications of 10 bytes each to 1 block would require >= 10 extra blocks.
hmm… there has to be a better way in this scenario… (a sketch of the problem is below)
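To make the cost concrete, this is roughly what the naive block-level copy-on-write I’m worried about looks like (names are illustrative; this is the behaviour being complained about, not a proposal):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLK 4096

struct blk_version {
    struct blk_version *prev;   /* the older copy, kept live by the snapshot */
    char data[BLK];
};

/* Any write, however small, copies the whole block. */
static struct blk_version *cow_write(struct blk_version *cur, size_t off,
                                     const void *buf, size_t len)
{
    struct blk_version *nv;

    if (off + len > BLK)
        return NULL;
    nv = malloc(sizeof(*nv));
    if (!nv)
        return NULL;
    memcpy(nv->data, cur->data, BLK);   /* full 4KB copy for a 10-byte write */
    memcpy(nv->data + off, buf, len);
    nv->prev = cur;                     /* old version retained */
    return nv;
}

Ten small writes, ten whole new blocks – hence wanting something finer-grained here.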

B+Tree of unique IDs (separate from Walnut object IDs).

poptop, pptp and the VPN issue

VPNs are just plain annoying

IPsec isn’t remotely easy enough to configure yet (i.e. it’s not point-and-click easy, or at least point-and-type easy).

PPTP is butchery and icky and all wrong and not really *that* secure, but MS OSs have support for it, and there’s a supposedly working implementation that’s free. Unfortunately, I can *never* get the bloody encryption to work.

Freaky kernel hacks, modules that should exist but don’t (and don’t build)… grr….

All I want is a working VPN solution that runs on a free OS.

industry cocktails

Well, I was at the industry cocktails night tonight – in fact, I’m still sitting in the main dining room at uni, but I’ve got wireless and they’re having a committee meeting :)

Two people actually came up to me later and gave me their cards telling me to talk to them. One guy has done stuff with XFS and the other is a Microsoft guy.

Could be interesting…..

Back to PHP hacking…. :)

meeting today

We had another ctte meeting today, probably verging on the briefest one so far. Not to say this is a bad thing – I love brief meetings; it usually means there aren’t problems.

My main task is this membership database (which the CSSE student club also wants). I’ve finally set up my devel environment on my machine again (this is what happens when you swap hard drives during semester), so I can continue to make nice improvements to the GUI.

I’d welcome any help with it – so if you know PHP and SQL, give us a yell :)

Pia will be posting a summary to the linux-aus lists soon.

reading all about how to store stuff for reading

Otherwise titled: A day of reading about Filesystems.

I’m managing to continually find stuff I haven’t already read. I think this is a good thing. I’m clearly now going to have to do some more research into how RDBMSs do things.

The idea of snapshots sounds interesting, and may prove an interesting way to avoid having to do journalling. I’ve been thinking that journalling could be a real issue if we allow large transactions, so maybe a snapshot-like idea could be good. Although this could limit the number of transactions we can work on concurrently, if it’s implemented the WAFL way.

So looking at how RDBMSs do concurrent transactions would be useful. I guess we do have the advantage of having PostgreSQL and MySQL code to peer at, but I do hope there are some nice papers out there to read :)

I’ve collected a large array of useful papers to read. I’m going to try and wade through them sometime soon, but will have to do more coursework as well.

I’m thinking tomorrow might contain a bit of coursework – need to do more POVRAY/OpenGL stuff as well as study more of the Pattern Recognition stuff. Oh, and look at the Info Security assignment & exercise for this week.

Free Open Diary Import

Well, I’ve imported all my old OpenDiary entries. Unfortunately, there’s no easy way to export all the notes from OD, so they haven’t made their way across.

Below is the Perl script I used to convert OD’s export format into the MovableType import format:

#!/usr/bin/perl
use strict;
use warnings;

# Parse the OD export: an entry starts with a "Title - MM/DD/YYYY" line,
# and everything up to the next such line is its body.
open(ODFILE, '<', $ARGV[0]) or die "can't open $ARGV[0]: $!";
my @entries = ();
while (my $line = <ODFILE>)
{
    if ($line =~ /^(.*?)\s+\- (\d\d?\/\d\d?\/\d\d\d\d)/)
    {
        push @entries, [$1, $2, ''];    # [title, date, body]
    }
    else
    {
        next unless @entries;           # skip any preamble before the first title
        $line =~ s/\x0d//g;             # strip DOS carriage returns
        $entries[-1][2] .= $line;
    }
}
close ODFILE;

# Emit MovableType import format: "-----" separates the sections of an
# entry, "--------" separates entries.
foreach my $entry (@entries)
{
    my ($title, $date, $body) = @$entry;
    my @date = split /\//, $date;
    print "TITLE: $title\n";
    print "DATE: $date[0]/$date[1]/$date[2] 00:00:00\n";
    print "-----\n";
    print "BODY:\n";
    print $body;
    print "-----\n--------\n";
}

NetBSD Alpha

Got it booting! Got it NFS-mounting / and have a console up.

Some think I won’t be able to get X going under NetBSD. I’ll give it a try (there are some almost-docs out there). If it doesn’t work, then maybe I’ll just have to settle for headless stations, or try another OS.

It looks like I can get a hobbyist license for OpenVMS from HP. This could be cool too. Should be able to get X11 going on it and connect to my Debian box.

64MB of RAM should be enough for some tasks :)

On the down side, it looks like one of the screens doesn’t work. Hopefully it’s just a fuse or something. Hopefully it doesn’t explode while I’m fiddling with it, too…

Could get Digital UNIX/OSF/Tru64 and run that too… basically I just want an X terminal with this lovely big screen :)

A Linux port could be (sort of) possible… apparently there’s some TurboChannel support in the kernel for MIPS, just not for Alpha. If what I’ve been told is correct, and NetBSD just feeds I/O through the firmware (for console output), then I should be able to get Linux to do the same trick, so I don’t have to use a serial dongle to get a serial terminal going.

There is a start on Debian/NetBSD, which could be interesting, but currently it’s only one guy hacking. I’ve offered to test, at least :)

POVRAY is so broken

How come I always end up writing the Portfile for *really* broken build systems?

SPIM is pretty broken; POVRAY was worse. Really, really bad. I did a VERY nasty Makefile.in hack (I added the libraries to link against to the “*LINK=” lines).

Oh well, it’s installed now. You have to specify really weird command line arguments – whoever came up with them clearly had some strange ideas. The number of warnings coming up during the build phase was just scary.

Chatting to the Debian package maintainer for it, he said it was really broken too, and just hoped that I wasn’t trying to take it to a 64-bit architecture; apparently it makes some nasty assumptions about the length of longs.

It’s running on PPC32 at least. I haven’t tried the X stuff, just outputting to PNG, which works fine. The amount of crap it spews out while rendering a scene is annoying though.

I’ve managed to get through some of the tutorial on povray.org too, got some shapes going and pictures coming out.

project overview doc

I’m trying to write up what has been discussed regarding my hons project.

Got a meeting penciled in for Monday morning.

Should be good.

But I have to get all these ideas that have been in my head down on paper, and all the reasoning behind them.

Finally got LaTeX on my laptop. Not from DarwinPorts though :( Got it from some i-Installer program that a guy did. It works, I’m happy.

DEC3000/300LX

I got 3 of these babies today (well, really yesterday). Apparently they’re 125MHz Alphas with a minimum of 16MB RAM.

Apparently I can run Tru64, OpenVMS or NetBSD on them. Some people have tried to get Linux going, but I haven’t found any success stories as of yet.

I’m tempted to put NetBSD on and play with it a bit… another UNIX box can’t hurt, even if I just use it as an X terminal to a Linux box (they’ve got lovely big Trinitron screens).

I was hoping to use them as Linux test boxes, but it looks like that’s going to be more trouble than it’s worth… looks like it’s back to the P150…

transactions

I’m thinking that I can easily do multiple FS ops in one transaction with some careful structuring of the journal.

The only problem with this is the old problem of journal size. Unlike a normal journalled FS, with user transactions we may be dealing with a lot more data across many transactions, and a static log size may be a bad thing (it would severely limit the size of the transactions that are possible).

I’m thinking that having an expandable log (perhaps a linked-list kind of approach) could be quite useful. In memory, we’ll probably need another data structure to keep things remotely efficient (especially if we want to know where to write additional log entries).

This could also be nice performance-wise, as we only have to lock one of these structures when doing operations on the log.
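Something like this, maybe (a sketch only – all the names are invented, and the on-disk side is hand-waved):

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SEG_BYTES 4096          /* payload per log segment */

struct log_segment {
    uint64_t next_blk;          /* on disk: block number of the next segment */
    struct log_segment *next;   /* in memory: the same link, resolved */
    size_t used;                /* bytes already written into payload[] */
    char payload[SEG_BYTES];
};

struct journal {
    pthread_mutex_t lock;       /* the one lock needed for log appends */
    struct log_segment *head, *tail;
};

/* Append a record, chaining on a fresh segment when the tail fills up;
   the in-memory tail pointer means never walking the chain. */
static int journal_append(struct journal *j, const void *rec, size_t len)
{
    if (len > SEG_BYTES)
        return -1;              /* records bigger than a segment not handled here */
    pthread_mutex_lock(&j->lock);
    if (!j->tail || j->tail->used + len > SEG_BYTES) {
        struct log_segment *seg = calloc(1, sizeof(*seg));
        if (!seg) {
            pthread_mutex_unlock(&j->lock);
            return -1;
        }
        /* a real FS would also allocate seg's disk block and set next_blk */
        if (j->tail)
            j->tail->next = seg;
        else
            j->head = seg;
        j->tail = seg;
    }
    memcpy(j->tail->payload + j->tail->used, rec, len);
    j->tail->used += len;
    pthread_mutex_unlock(&j->lock);
    return 0;
}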

I’ve also been thinking about weird things like delayed index updates. That is, you update a file (or its attributes) and the index pointing to it on disk doesn’t get updated until it’s convenient to do so.