{"id":1931,"date":"2010-05-26T23:46:32","date_gmt":"2010-05-26T13:46:32","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/?p=1931"},"modified":"2010-05-26T23:46:32","modified_gmt":"2010-05-26T13:46:32","slug":"blobs-in-the-drizzlemysql-storage-engine-api","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2010\/05\/26\/blobs-in-the-drizzlemysql-storage-engine-api\/","title":{"rendered":"BLOBS in the Drizzle\/MySQL Storage Engine API"},"content":{"rendered":"<p>Another (AFAIK) undocumented part of the Storage Engine API:<\/p>\n<p>We all know what a normal row looks like in Drizzle\/MySQL row format (a NULL bitmap and then column data):<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1933\" data-permalink=\"https:\/\/www.flamingspork.com\/blog\/2010\/05\/26\/blobs-in-the-drizzlemysql-storage-engine-api\/rect2816-2\/\" data-orig-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png?fit=534%2C120&amp;ssl=1\" data-orig-size=\"534,120\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;}\" data-image-title=\"row with null bitmap\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png?fit=300%2C67&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png?fit=534%2C120&amp;ssl=1\" class=\"aligncenter size-medium wp-image-1933\" title=\"row with null bitmap\" src=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161-300x67.png?resize=300%2C67\" alt=\"\" width=\"300\" height=\"67\" srcset=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png?resize=300%2C67&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect28161.png?w=534&amp;ssl=1 534w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Nothing that special. It&#8217;s a fixed sized buffer, Field objects reference into it, you read out of it and write the values into your engine. However, when you get to BLOBs, we can&#8217;t use a fixed sized buffer as BLOBs may be quite large. So, the format with BLOBS is the bit in the row is a length of the blob (1, 2, 3 or 4 bytes &#8211; in Drizzle it&#8217;s only 3 or 4 bytes now and soon only 4 bytes once we fix a bug that isn&#8217;t interesting to discuss here). The Second part of the in-row part is a pointer to a location in memory where the BLOB is stored. So a row that has a BLOB in it looks something like this:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1932\" data-permalink=\"https:\/\/www.flamingspork.com\/blog\/2010\/05\/26\/blobs-in-the-drizzlemysql-storage-engine-api\/rect2816\/\" data-orig-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png?fit=609%2C439&amp;ssl=1\" data-orig-size=\"609,439\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;}\" data-image-title=\"Row with BLOB\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png?fit=300%2C216&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png?fit=584%2C421&amp;ssl=1\" class=\"aligncenter size-medium wp-image-1932\" title=\"Row with BLOB\" src=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816-300x216.png?resize=300%2C216\" alt=\"\" width=\"300\" height=\"216\" srcset=\"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png?resize=300%2C216&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/05\/rect2816.png?w=609&amp;ssl=1 609w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>The size of the pointer is (of course) platform dependent. On 32bit machines it&#8217;s 4 bytes and on 64bit machines it&#8217;s 8 bytes.<\/p>\n<p>Now, if I were any other source of documentation, I&#8217;d stop right here.<\/p>\n<p>But I&#8217;m not. I&#8217;m a programmer writing a Storage Engine who now has the crucial question of memory management.<\/p>\n<p>When your engine is given the row from the upper layer (such as doInsertRecord()\/write_row()) you don&#8217;t have to worry, for the duration of the call, the memory will be there (don&#8217;t count on it being there after though, so if you&#8217;re not going to immediately splat it somewhere, make your own copy).<\/p>\n<p>For reading, you are expected to provide a pointer to a location in memory that is valid until the next call to your Cursor. For example, rnd_next() call reads a BLOB field and your engine provides a pointer. At the subsequent rnd_next() call, it can free that pointer (or at doStopTableScan()\/rnd_end()).<\/p>\n<p>HOWEVER, this is true <strong>except<\/strong> for \u00ef\u00bb\u00bfindex_read_idx_map(), which in the default implementation in the Cursor (handler) base class ends up doing a doStartIndexScan(), index_read(), doEndIndexScan(). This means that if a BLOB was read, the engine could have (quite rightly) freed that memory already. In this case, you must keep the memory around until either a reset() or extra(HA_EXTRA_FLUSH) call.<\/p>\n<p>This exception is tested (by accident) by a whole single query in type_blob.test &#8211; a monster of a query that&#8217;s about a seven way join with a group by and an order by. It would be quite possible to write a fairly functional engine and <strong>completely<\/strong> miss this.<\/p>\n<p>Good luck.<\/p>\n<p>This blog post (but not the whole blog) is published under the\u00c2\u00a0<a href=\"http:\/\/creativecommons.org\/licenses\/by-sa\/3.0\/\">Creative Commons Attribution-Share Alike License<\/a>. Attribution is by linking back to this post and mentioning my name (Stewart Smith).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Another (AFAIK) undocumented part of the Storage Engine API: We all know what a normal row looks like in Drizzle\/MySQL row format (a NULL bitmap and then column data): Nothing that special. It&#8217;s a fixed sized buffer, Field objects reference &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2010\/05\/26\/blobs-in-the-drizzlemysql-storage-engine-api\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[75,14],"tags":[309,70,628,80],"class_list":["post-1931","post","type-post","status-publish","format-standard","hentry","category-drizzle-work-et-al","category-mysql","tag-blob","tag-drizzle","tag-mysql","tag-storage-engine-api"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-v9","jetpack-related-posts":[{"id":1876,"url":"https:\/\/www.flamingspork.com\/blog\/2010\/04\/02\/the-drizzle-and-mysql-key-tuple-format\/","url_meta":{"origin":1931,"position":0},"title":"The Drizzle (and MySQL) Key tuple format","author":"Stewart Smith","date":"2010-04-02","format":false,"excerpt":"Here's something that's not really documented anywhere (unless you count ha_innodb.cc as a source of server documentation). You may have some idea about the MySQL\/Drizzle row buffer format. This is passed around the storage engine interface: in for write_row and update_row and out for the various scan and index read\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2303,"url":"https:\/\/www.flamingspork.com\/blog\/2011\/03\/16\/fixed-in-drizzle-no-more-gotchas\/","url_meta":{"origin":1931,"position":1},"title":"Fixed in Drizzle: No more &#8220;GOTCHA&#8217;s&#8221;","author":"Stewart Smith","date":"2011-03-16","format":false,"excerpt":"\u00a0 At the upcoming MySQL Conference and Expo, I'm going to give a Thursday afternoon (2pm) session entitled Fixed in Drizzle: No more \"GOTCHA's\". I plan to have a lot of fun with this session.. If you go back to the very start of when I started submitting code to\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2168,"url":"https:\/\/www.flamingspork.com\/blog\/2010\/10\/25\/storage-engine-api-state-graph\/","url_meta":{"origin":1931,"position":2},"title":"Storage Engine API state graph","author":"Stewart Smith","date":"2010-10-25","format":false,"excerpt":"Drizzle still has a number of quirks inherited from the MySQL Storage Engine API (e.g. BLOBs, row buffer, CREATE SELECT and lack of DDL transaction boundaries, key tuple format). One of the things we fixed a long time ago was to have proper methods for StorageEngines to be called for:\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"State transitions for a transaction. Transaction can be empty OR have one or more statements","src":"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/10\/edge30-300x163.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1409,"url":"https:\/\/www.flamingspork.com\/blog\/2009\/02\/20\/improving-the-storage-engine-api\/","url_meta":{"origin":1931,"position":3},"title":"Improving the Storage Engine &#8220;API&#8221;","author":"Stewart Smith","date":"2009-02-20","format":false,"excerpt":"I increasingly enclose the API part of \"Storage Engine API\" in quotes as it does score a rather large number on the API Design Rusty levels (Coined by Rusty Russell). I give it a 15 (out of 18. lower is better) in this case \"The obvious use is wrong\". The\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1879,"url":"https:\/\/www.flamingspork.com\/blog\/2010\/05\/24\/using-the-row-buffer-in-drizzle-and-mysql\/","url_meta":{"origin":1931,"position":4},"title":"Using the row buffer in Drizzle (and MySQL)","author":"Stewart Smith","date":"2010-05-24","format":false,"excerpt":"Here's another bit of the API you may need to use in your storage engine (it also seems to be a rather unknown. I believe the only place where this has really been documented is ha_ndbcluster.cc, so here goes.... Drizzle (through inheritance from MySQL) has its own (in memory) row\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2010\/04\/row-300x85.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":3291,"url":"https:\/\/www.flamingspork.com\/blog\/2013\/04\/18\/where-are-they-now-mysql-storage-engines\/","url_meta":{"origin":1931,"position":5},"title":"Where are they now: MySQL Storage Engines","author":"Stewart Smith","date":"2013-04-18","format":false,"excerpt":"There was once a big hooplah about the MySQL Storage Engine Architecture and how it was easy to just slot in some other method of storage instead of the provided ones. Over the years I've repeatedly mentioned how this wasn't really the case and that it was remarkably non trivial.\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1931","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=1931"}],"version-history":[{"count":1,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1931\/revisions"}],"predecessor-version":[{"id":1934,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1931\/revisions\/1934"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=1931"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=1931"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=1931"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}