new .emacs snippet

For the non-Lisp hackers: this sets some C mode options depending on the path to the source file.


;; run this for mysql source: indent with spaces, never tabs
(defun mysql-c-mode-common-hook ()
  (setq indent-tabs-mode nil))

;; linux kernel style
;; (uses cc-mode's built-in "linux" style; swap in your own
;; linux-c-mode function here if you have one defined)
(defun linux-c-mode-common-hook ()
  (c-set-style "linux"))

;; default settings, used when no other predicate matches
(defun my-c-mode-common-hook ()
  (turn-on-font-lock)
  (setq comment-column 48))

;; list of (PREDICATE . HOOK) pairs to check, in order
(defvar my-style-selective-mode-hook nil)

(add-to-list 'my-style-selective-mode-hook
             '((string-match "MySQL" (or (buffer-file-name) ""))
               . mysql-c-mode-common-hook))

(add-to-list 'my-style-selective-mode-hook
             '((string-match "linux" (or (buffer-file-name) ""))
               . linux-c-mode-common-hook))

;; default hook, appended so that it is checked last
(add-to-list 'my-style-selective-mode-hook
             '(t . my-c-mode-common-hook) t)

;; find which hook to run depending on predicate
(defun my-style-selective-mode-hook-function ()
  "Evaluate each PREDICATE in `my-style-selective-mode-hook' in turn.
When a PREDICATE evaluates to non-nil, the HOOK in that pair is
called and the rest of the pairs are ignored."
  (let ((h my-style-selective-mode-hook))
    ;; stop at the first pair whose predicate holds (or at the end of the list)
    (while (and h (not (eval (caar h))))
      (setq h (cdr h)))
    (when h
      (funcall (cdar h)))))

;; Add the selective hook to the c-mode-common-hook
(add-hook 'c-mode-common-hook 'my-style-selective-mode-hook-function)

disk space allocation (part 4: allocating an extent)

For XFS, in normal operation, an extent is only allocated when data has to be written to disk. This is called delayed allocation. If we extend a file by 50MB, that space is deducted from the total free space on the filesystem, but no decision on where to place the data is made until we start writing it out, either because of memory pressure or because the kernel starts writing the dirty pages out on its own (the periodic flush, every 5 seconds or so on Linux).
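A toy model of that idea, just to make the two steps concrete. This is a sketch, not how XFS implements it: the class name, the "next unused block" placement policy and the block counts are all made up for illustration.

```python
class DelayedAllocFS:
    """Toy model: reserve space at write time, pick blocks only at flush time."""

    def __init__(self, total_blocks):
        self.free_blocks = total_blocks  # the free-space figure users see
        self.next_block = 0              # hypothetical policy: hand out the next unused block
        self.pending = {}                # file name -> blocks reserved but not yet placed
        self.extents = {}                # file name -> list of (start_block, length)

    def write(self, name, nblocks):
        """Buffered write: deduct the space now, decide on placement later."""
        if nblocks > self.free_blocks:
            raise OSError("no space left on device")
        self.free_blocks -= nblocks
        self.pending[name] = self.pending.get(name, 0) + nblocks

    def flush(self, name):
        """Write-out time: only now is an extent actually chosen."""
        nblocks = self.pending.pop(name, 0)
        if nblocks:
            self.extents.setdefault(name, []).append((self.next_block, nblocks))
            self.next_block += nblocks


fs = DelayedAllocFS(total_blocks=1000)
fs.write("a", 10)
fs.write("b", 10)
fs.write("a", 10)      # interleaved with the write to "b"
print(fs.free_blocks)  # space is already gone, but nothing has been placed yet
fs.flush("a")
fs.flush("b")
print(fs.extents)      # each file ends up with a single extent
```

Because placement is put off until the flush, the two separate writes to "a" land in one extent even though a write to "b" happened in between.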

When an extent needs to be allocated, XFS looks for free space in one of the two B+trees it keeps for that purpose. One is sorted by starting block number (so you can search for “an extent near here”) and one by size (so you can search for “an extent of x size”).
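To show why you would want both orderings, here is a small sketch that uses two sorted Python lists as stand-ins for the two B+trees. The class and method names are invented for the example, and real XFS does a good deal more than this.

```python
import bisect


class FreeSpace:
    """Two views of the same free extents: by start block and by size."""

    def __init__(self, extents):
        # extents: non-empty list of (start_block, length) tuples
        self.by_start = sorted(extents)
        self.by_size = sorted((length, start) for start, length in extents)

    def near(self, block):
        """Free extent whose start is closest to `block` ("an extent near here")."""
        i = bisect.bisect_left(self.by_start, (block, 0))
        candidates = self.by_start[max(i - 1, 0):i + 1]
        return min(candidates, key=lambda e: abs(e[0] - block))

    def at_least(self, length):
        """Smallest free extent of at least `length` blocks, or None."""
        i = bisect.bisect_left(self.by_size, (length, 0))
        if i == len(self.by_size):
            return None
        size, start = self.by_size[i]
        return (start, size)


free = FreeSpace([(100, 8), (500, 64), (2000, 16)])
print(free.near(520))      # search by location
print(free.at_least(10))   # search by size
```

Each question is a single binary search in the index that is sorted the right way, which is the point of keeping the same data twice.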

The ideal situation is that you get as large an extent as possible, as close to the tail end of the file as possible (i.e. just making the current extent bigger).

The worst-case scenario is having to allocate extents to multiple files at once, with all of them being written out synchronously (O_SYNC or memory pressure), as this will cause lots of small extents to be created.

disk space allocation (part 3: storing extents on disk)

Here I’m going to talk about how file systems store what part of the disk a part of the file occupies. If your database files are very fragmented, performance will suffer. How much depends on a number of things, however.

XFS can store some extents directly in the inode (see xfs_dinode.h). If I’m reading things correctly, this can be 2 extents per fork (data fork and attribute fork). If more than this number of extents are needed, a btree is used instead.

HFS/HFS+ can store up to 8 extents directly in the catalog file entry (see Apple TechNote 1150, which was updated in March 2004 with information on the journal format). If the file has more than 8 extents, a lookup then needs to be done in the extents overflow file. Interestingly enough, in Mac OS X 10.4 and above (I think it was 10.4; it may have been 10.3 as well), if a file is smaller than 20MB and has more than 8 extents, the OS will automatically try to defragment it when it is opened. Arguably you should just fix your allocation strategy, but hey, maybe this does actually help.

File systems such as ext2, ext3 and reiserfs just store a list of block numbers. In the case of ext2 and ext3, the further into a file you are, the more steps are required to find the disk block number associated with that block in the file.
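To put numbers on "the further into a file you are", here is a rough sketch of the classic ext2/ext3 layout: 12 direct block pointers in the inode, followed by one single, one double and one triple indirect block. The function name is made up, and the 4KB block size and 4-byte pointers are assumed defaults.

```python
def ext2_lookup_depth(file_block, block_size=4096, direct=12, ptr_size=4):
    """How many indirect blocks must be read to find where `file_block` lives."""
    ptrs = block_size // ptr_size  # block pointers that fit in one indirect block
    if file_block < direct:
        return 0                   # pointer sits in the inode itself
    file_block -= direct
    for depth in (1, 2, 3):        # single, double, triple indirect
        if file_block < ptrs ** depth:
            return depth
        file_block -= ptrs ** depth
    raise ValueError("beyond the maximum file size for this layout")


for blk in (0, 12, 12 + 1024, 12 + 1024 + 1024 ** 2):
    print(blk, ext2_lookup_depth(blk))
```

With these defaults the first 48KB of a file needs no extra reads, the next 4MB needs one, and anything beyond that needs two or three.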

So what does an extent actually look like? Well, for XFS, the following excerpt from xfs_bmap_btree.h is interesting:

#define ISUNWRITTEN(x) ((x)->br_state == XFS_EXT_UNWRITTEN)

typedef struct xfs_bmbt_irec
{
    xfs_fileoff_t br_startoff;   /* starting file offset */
    xfs_fsblock_t br_startblock; /* starting block number */
    xfs_filblks_t br_blockcount; /* number of blocks */
    xfs_exntst_t  br_state;      /* extent state */
} xfs_bmbt_irec_t;

It’s also rather self-explanatory. Holes (for sparse files) in XFS don’t have extents, and an extent doesn’t have to have been written to disk. This allows you to preallocate space in chunks without having written anything to it. Reading from an unwritten extent gets you zeros (otherwise it would be a security hole!).
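A sketch of how records like that get used when reading a file. The field names mirror the struct above, but the functions, the 4KB block size and the dict standing in for the disk are all invented for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Extent:
    startoff: int            # starting file offset, in blocks (br_startoff)
    startblock: int          # starting disk block (br_startblock)
    blockcount: int          # number of blocks (br_blockcount)
    unwritten: bool = False  # stands in for br_state == XFS_EXT_UNWRITTEN


def lookup(extents, file_block):
    """Return (disk_block, extent) for file_block, or None if it is in a hole.

    `extents` must be sorted by startoff and must not overlap.
    """
    i = bisect_right([e.startoff for e in extents], file_block) - 1
    if i < 0:
        return None
    e = extents[i]
    if file_block >= e.startoff + e.blockcount:
        return None          # past the end of the nearest extent: a hole
    return e.startblock + (file_block - e.startoff), e


def read_block(extents, file_block, disk, block_size=4096):
    """Holes and unwritten extents both read back as zeros."""
    hit = lookup(extents, file_block)
    if hit is None or hit[1].unwritten:
        return bytes(block_size)
    return disk[hit[0]]


extents = [Extent(0, 1000, 4), Extent(8, 5000, 4, unwritten=True)]
disk = {1002: b"x" * 4096}                            # hypothetical disk contents
print(lookup(extents, 2)[0])                          # mapped block
print(lookup(extents, 5))                             # hole between the extents
print(read_block(extents, 9, disk) == bytes(4096))    # preallocated, never written
```

Whatever stale data sits in disk blocks 5000 to 5003 is never handed back to the reader, which is the security point made above.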

Sweden!

I’m in the Stockholm office at the moment. On the network, grabbing mail and all that foo.

Spent yesterday in London with Leandra, finally having arrived after the plane was delayed for four hours in Melbourne. Urgh.

There’s snow! It’s cool.

Will have to check how photos look at some point.

And oh, good to be back in Europe. Good jam.

disk space allocation (part 2: examining your database files)

memberdb/log.MYD:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..943]:        5898248..5899191  3 (36536..37479)     944
   1: [944..1023]:     6071640..6071719  3 (209928..210007)    80
   2: [1024..1127]:    6093664..6093767  3 (231952..232055)   104
   3: [1128..1279]:    6074800..6074951  3 (213088..213239)   152
   4: [1280..1407]:    6074672..6074799  3 (212960..213087)   128
   5: [1408..1423]:    6074264..6074279  3 (212552..212567)    16
memberdb/log.MYI:
 EXT: FILE-OFFSET      BLOCK-RANGE        AG AG-OFFSET        TOTAL
   0: [0..7]:          10165832..10165839  5 (396312..396319)     8

The interesting thing about this is that the log table grows very slowly. This table stores a bunch of debugging output for my memberdb application. It should possibly be a partitioned ARCHIVE table (and probably will be in the future).

The thing about a file growing slowly over time is that it’s more likely to have more than 1 extent (I’ll examine why in the near future).

My InnoDB data and log files only have 1 extent each. I think I’ve run xfs_fsr on my file system though.

disk space allocation (part 1: seeing what’s happened)

(A little while ago I was writing a really long entry on everything possible. I realised that this would be a long read for people and that fewer people would look at it, so I’ve split it up.)

This sprang out of doing work on the NDB disk data tree. Anything where efficient use of the filesystem is concerned tickles my fancy, so I went to have a look at what was going on.

Filesystems store what part of the disk belongs to what file in one of two ways. The first is to keep a list of every disk block (typically 4KB) that’s being used by the file. A 400KB file will have 100 block numbers. The second way is to store a range (extent). That is, a 400KB file could use 100 blocks starting at disk block number 1000.
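The 400KB example works out like this in code. It is only an illustration of the two representations; the function name is made up and no real file system is being queried.

```python
def blocks_to_extents(blocks):
    """Collapse a list of disk block numbers into (start, length) extents."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)  # contiguous: grow the last extent
        else:
            extents.append((b, 1))             # gap: start a new extent
    return extents


BLOCK_SIZE = 4 * 1024
nblocks = (400 * 1024) // BLOCK_SIZE               # a 400KB file in 4KB blocks
block_list = list(range(1000, 1000 + nblocks))     # the "list of every block" form

print(len(block_list))                 # 100 block numbers to store
print(blocks_to_extents(block_list))   # versus one (start, length) pair
```

The block list grows with the file, while the extent list only grows when the file becomes fragmented.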

XFS has a tool called xfs_bmap. It gives you a list of the extents allocated to a file.

So, let’s have a look at what it tells us about some recordings on my MythTV box.

myth@orpheus:~$ ls -lah myth-recordings/10_20050912183000_20050912190000.nuv
 -rw-r--r--  1 myth myth 452M 2005-09-12 19:00 myth-recordings/10_20050912183000_20050912190000.nuv
myth@orpheus:~$ xfs_bmap -v myth-recordings/10_20050912183000_20050912190000.nuv
myth-recordings/10_20050912183000_20050912190000.nuv:
 EXT: FILE-OFFSET       BLOCK-RANGE          AG AG-OFFSET             TOTAL
   0: [0..639]:         228712176..228712815  7 (21106232..21106871)    640
   1: [640..1663]:      83674040..83675063    2 (24358056..24359079)   1024
   2: [1664..923519]:   83675368..84597223    2 (24359384..25281239) 921856
   3: [923520..924031]: 84631272..84631783    2 (25315288..25315799)    512

Just to make things fun, this is all in 512-byte blocks. But anyway, the really interesting thing is the number of extents. Ideally, every file would have one extent, as this means that we avoid disk seeks, *the* most expensive disk operation.
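If you want to check the arithmetic, here is a small sketch that parses the listing above and turns the 512-byte block counts back into a file size. The regular expression is written against this exact output; other versions of xfs_bmap or other flags may print extra columns (or hole lines), which it would simply skip.

```python
import re

BMAP_OUTPUT = """\
 EXT: FILE-OFFSET       BLOCK-RANGE          AG AG-OFFSET             TOTAL
   0: [0..639]:         228712176..228712815  7 (21106232..21106871)    640
   1: [640..1663]:      83674040..83675063    2 (24358056..24359079)   1024
   2: [1664..923519]:   83675368..84597223    2 (24359384..25281239) 921856
   3: [923520..924031]: 84631272..84631783    2 (25315288..25315799)    512
"""

# ext: [file range]: disk range, AG, (range within AG), total blocks
LINE = re.compile(
    r"^\s*(\d+): \[(\d+)\.\.(\d+)\]:\s+(\d+)\.\.(\d+)\s+(\d+)"
    r"\s+\((\d+)\.\.(\d+)\)\s+(\d+)\s*$"
)


def parse_bmap(text):
    """Return one dict per extent line; the header line does not match."""
    extents = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            fields = [int(g) for g in m.groups()]
            extents.append({
                "file_start": fields[1],
                "disk_start": fields[3],
                "ag": fields[5],
                "blocks": fields[8],   # in 512-byte units
            })
    return extents


extents = parse_bmap(BMAP_OUTPUT)
total_bytes = sum(e["blocks"] for e in extents) * 512
print(len(extents), "extents")
print(round(total_bytes / 2 ** 20, 2), "MiB")
```

That comes to a little over 451 MiB, which agrees with the 452M that ls reports once it rounds up.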

XFS also provides the xfs_fsr tool (the filesystem reorganizer), which can defragment files (even on a mounted file system). On IRIX this used to run out of cron, which was fun when a bunch of machines hit a CXFS volume all at the same time.

Foot High In Tray

When the In tray gets to be over a foot high, you know it’s time to actually go through it.

It’s small again now, and it took less than an hour (while doing other things).

I should really move to interrupt mode instead of batch mode.

Although most of the things that go into my in tray really only need filing; I’ve already taken the action.

FlamingMob. An attempt at moblogging.

I wonder if this will show up:)

[Attached image: Picture(1).jpg]