{"id":514,"date":"2005-11-29T02:10:12","date_gmt":"2005-11-28T16:10:12","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/?p=514"},"modified":"2005-11-29T02:10:12","modified_gmt":"2005-11-28T16:10:12","slug":"disk-space-allocation-part-3-storing-extents-on-disk","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2005\/11\/29\/disk-space-allocation-part-3-storing-extents-on-disk\/","title":{"rendered":"disk space allocation (part 3: storing extents on disk)"},"content":{"rendered":"<p>Here I&#8217;m going to talk about how file systems store what part of the disk a part of the file occupies. If your database files are very fragmented, performance will suffer. How much depends on a number of things, however.<\/p>\n<p>XFS can store some extents directly in the inode (see xfs_dinode.h). If I&#8217;m reading things correctly, this can be 2 extents per fork (data fork and attribute fork). If more than this number of extents are needed, a btree is used instead.<\/p>\n<p>HFS\/HFS+ can store up to 8 extents directly in the catalog file entry (see <a href=\"http:\/\/developer.apple.com\/technotes\/tn\/tn1150.html#ForkDataStructure\">Apple TechNote 1150<\/a> &#8211; which was updated in March 2004 with information on the journal format). If the file has more than 8 extents, a lookup then needs to be done into the extents overflow file. Interestingly enough, in MacOS X 10.4 and above (I think it was 10.4&#8230; may have been 10.3 as well) if a file is less than 20MB and has more than 8 extents, on an open, the OS will automatically try to defragment that file. Arguably you should just fix your allocation strategy, but hey &#8211; maybe this does actually help.<\/p>\n<p>File systems such as ext2, ext3 and reiserfs just store a list of block numbers. 
In the case of ext2 and ext3, the further into a file you are, the more steps are required to find the disk block number associated with that block in the file.<\/p>\n<p>So what does an extent actually look like? Well, for XFS, the following excerpt from xfs_bmap_btree.h is interesting:<br \/>\n<code><br \/>\n#define ISUNWRITTEN(x)\t((x)->br_state == XFS_EXT_UNWRITTEN)<\/p>\n<p>typedef struct xfs_bmbt_irec<br \/>\n{<br \/>\n\txfs_fileoff_t\tbr_startoff;\t\/* starting file offset *\/<br \/>\n\txfs_fsblock_t\tbr_startblock;\t\/* starting block number *\/<br \/>\n\txfs_filblks_t\tbr_blockcount;\t\/* number of blocks *\/<br \/>\n\txfs_exntst_t\tbr_state;\t\/* extent state *\/<br \/>\n} xfs_bmbt_irec_t;<br \/>\n<\/code><\/p>\n<p>It&#8217;s also rather self-explanatory. Holes (for sparse files) in XFS don&#8217;t have extents, and an extent doesn&#8217;t have to have been written to disk. This allows you to preallocate space in chunks without having written anything to it. Reading from an unwritten extent gets you zeros (otherwise it would be a security hole!).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here I&#8217;m going to talk about how file systems store what part of the disk a part of the file occupies. If your database files are very fragmented, performance will suffer. How much depends on a number of things, however. 
&hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2005\/11\/29\/disk-space-allocation-part-3-storing-extents-on-disk\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[8,14],"tags":[],"class_list":["post-514","post","type-post","status-publish","format-standard","hentry","category-linux-kernel","category-mysql"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-8i","jetpack-related-posts":[{"id":759,"url":"https:\/\/www.flamingspork.com\/blog\/2006\/11\/13\/disk-allocation-xfs-ndb-disk-data-and-more\/","url_meta":{"origin":514,"position":0},"title":"Disk allocation, XFS, NDB Disk Data and more&#8230;","author":"Stewart Smith","date":"2006-11-13","format":false,"excerpt":"I've talked about disk space allocation previously, mainly revolving around XFS (namely because it's what I use, a sensible choice for large file systems and large files and has a nice suite of tools for digging into what's going on).Most people write software that just calls write(2) (or libc things\u2026","rel":"","context":"In 
&quot;linux-kernel&quot;","block_context":{"text":"linux-kernel","link":"https:\/\/www.flamingspork.com\/blog\/category\/linux-kernel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":515,"url":"https:\/\/www.flamingspork.com\/blog\/2005\/11\/29\/disk-space-allocation-part-4-allocating-an-extent\/","url_meta":{"origin":514,"position":1},"title":"disk space allocation (part 4: allocating an extent)","author":"Stewart Smith","date":"2005-11-29","format":false,"excerpt":"For XFS, in normal operation, an extent is only allocated when data has to be written to disk. This is called delayed allocation. If we are extending a file by 50MB - that space is deducted from the total free space on the filesystem, but no decision on where to\u2026","rel":"","context":"In &quot;linux-kernel&quot;","block_context":{"text":"linux-kernel","link":"https:\/\/www.flamingspork.com\/blog\/category\/linux-kernel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":511,"url":"https:\/\/www.flamingspork.com\/blog\/2005\/11\/23\/disk-space-allocation-part-1-seeing-whats-happenned\/","url_meta":{"origin":514,"position":2},"title":"disk space allocation (part 1: seeing what&#8217;s happenned)","author":"Stewart Smith","date":"2005-11-23","format":false,"excerpt":"(a little while ago I was writing a really long entry on everything possible. I realised that this would be a long read for people and that less people would look at it, so I've split it up). 
This sprung out of doing work on the NDB disk data tree.\u2026","rel":"","context":"In &quot;linux-kernel&quot;","block_context":{"text":"linux-kernel","link":"https:\/\/www.flamingspork.com\/blog\/category\/linux-kernel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":578,"url":"https:\/\/www.flamingspork.com\/blog\/2006\/02\/15\/information_schemafiles-querying-disk-usage-from-sql\/","url_meta":{"origin":514,"position":3},"title":"INFORMATION_SCHEMA.FILES (querying disk usage from SQL)","author":"Stewart Smith","date":"2006-02-15","format":false,"excerpt":"In MySQL 5.1.6 there's a new INFORMATION_SCHEMA table. Currently, it only has information on files for NDB but we're hoping to change that in a future release (read: I think it would be neat). This table is a table generated by the MySQL server listing all the different files that\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":861,"url":"https:\/\/www.flamingspork.com\/blog\/2007\/07\/12\/reading-maildirs-fast\/","url_meta":{"origin":514,"position":4},"title":"reading maildirs&#8230;. fast&#8230;","author":"Stewart Smith","date":"2007-07-12","format":false,"excerpt":"So, for a side project i'm hacking on, i'm wanting to read in Maildirs really fast (and then pump them into something else... for current purposes I'm just putting everything in one file.. getting the read speed up is of current importance). 
I've done a bit of experimenting and my\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"https:\/\/www.flamingspork.com\/blog\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":478,"url":"https:\/\/www.flamingspork.com\/blog\/2005\/10\/03\/a-funky-thing-done-last-week\/","url_meta":{"origin":514,"position":5},"title":"a funky thing done last week&#8230;","author":"Stewart Smith","date":"2005-10-03","format":false,"excerpt":"still have to talk to people about standards for this sort of thing and all that. But as a first checkin - funkyness++! mysql> select * from INFORMATION_SCHEMA.DATAFILES; select * from INFORMATION_SCHEMA.TABLESPACES; Empty set (0.03 sec) Empty set (0.00 sec) mysql> CREATE TABLESPACE ts1 ADD DATAFILE 'datafile.dat' USE LOGFILE GROUP\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/514","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=514"}],"version-history":[{"count":0,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/514\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=514"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=514"},{"taxonomy":"post_
tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=514"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}