{"id":2223,"date":"2010-12-06T11:01:39","date_gmt":"2010-12-06T01:01:39","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/?p=2223"},"modified":"2010-12-06T11:01:39","modified_gmt":"2010-12-06T01:01:39","slug":"replication-log-inside-innodb","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2010\/12\/06\/replication-log-inside-innodb\/","title":{"rendered":"Replication log inside InnoDB"},"content":{"rendered":"<p>The MySQL replication system has always had the replication log (&#8220;binlog&#8221;) as a separate set of files on disk. Originally, this really didn&#8217;t matter as, well, MyISAM wasn&#8217;t transactional or crash safe so the binlog didn&#8217;t need to be either. If you crashed on a busy write workload, your replication was just going to be hosed anyway.<\/p>\n<p>So then came a time where everybody used InnoDB. Transactional, crash-safe and all the goodies. Then, a bit later, came storing master rpl log position in InnoDB log and XA with the binlog. So a rather long time after MySQL first had replication, you could pull the power cord on the master with a decent amount of certainty that things would be okay when you turned it on again.<\/p>\n<p>I am, of course, totally ignoring the slave state and if it&#8217;s safe to do that on slaves.<\/p>\n<p>Using XA to make the binlog and InnoDB consistent does have a cost. That cost is fsync()s. You have to do a lot more of them (two phase commit here).<\/p>\n<p>As you may be aware, at a (much) earlier point in Drizzle we completely ripped out the replication code. Why? A lot of it was very much still geared to support statement based replication &#8211; something we certainly didn&#8217;t want to support. We also did not really want to keep the legacy binlog format. We wanted it to be very, very pluggable.<\/p>\n<p>So the initial implementation is a transaction log file. Basically, we write out the replication messages to a file. A slave reads this and applies the operations. Pretty simple and foolproof to implement.<\/p>\n<p>But it&#8217;s pluggable.<\/p>\n<p>What if we stored the transaction log inside innodb? Not only that, what if we wrote it as part of the transaction that was doing the changes? That way, no XA is needed &#8211; everything is consistent with a COMMIT. This would greatly reduce the number of fsync()s needed to be consistent.<\/p>\n<p>Now&#8230; the first thing people will say is &#8220;arrggh! You&#8217;re writing the data *<strong>four<\/strong>* times now&#8221;. First being the txn data into the log, then the replication log into the log, and then both of these are written back to the data file. It turns out that this is <strong>much<\/strong> cheaper than doing the additional fsync()s.<\/p>\n<p>In one of our tests, the file based transaction log: ~300tps. Transaction log in InnoDB: ~1200tps.<\/p>\n<p>I think that&#8217;s an acceptable trade-off.<\/p>\n<p>We&#8217;ve just merged the first bit of this work into Drizzle.<\/p>\n<p>Props go to Joe Daly, Brian and myself for making it work.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The MySQL replication system has always had the replication log (&#8220;binlog&#8221;) as a separate set of files on disk. Originally, this really didn&#8217;t matter as, well, MyISAM wasn&#8217;t transactional or crash safe so the binlog didn&#8217;t need to be either. &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2010\/12\/06\/replication-log-inside-innodb\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[75,14],"tags":[],"class_list":["post-2223","post","type-post","status-publish","format-standard","hentry","category-drizzle-work-et-al","category-mysql"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-zR","jetpack-related-posts":[{"id":3664,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/02\/04\/ghosts-of-mysql-past-part-2\/","url_meta":{"origin":2223,"position":0},"title":"Ghosts of MySQL Past: Part 2","author":"Stewart Smith","date":"2014-02-04","format":false,"excerpt":"This continues on from my post yesterday and also contains content from my linux.conf.au 2014 talk (view video here). Way back in May in the year 2000, a feature was added to MySQL that would keep many people employed for many years - replication. In 3.23.15 you could replicate from\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3801,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/09\/19\/mysql-architecture\/","url_meta":{"origin":2223,"position":1},"title":"Some current MySQL Architecture writings","author":"Stewart Smith","date":"2014-09-19","format":false,"excerpt":"So, I've been looking around for a while (and a few times now) for any good resources that cover a bunch of MySQL architecture and technical details aimed towards the technically proficient but not MySQL literate audience. I haven't really found anything. I mean, there's the (huge and very detailed)\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":914,"url":"https:\/\/www.flamingspork.com\/blog\/2007\/11\/09\/mysql-5122-stew2\/","url_meta":{"origin":2223,"position":2},"title":"mysql-5.1.22-stew2","author":"Stewart Smith","date":"2007-11-09","format":false,"excerpt":"New: Updated NDB Compressed LCP and BACKUP patches (now with O_DIRECT support) InnoDB patch for Windows that should give ~5x improvement on commits\/sec (Bug31876) Everything in current telco-6.3 tree (ndb ~6.3.5) Lots of NDB improvements and new features over regular 5.1. WL3686 Remove read before update WL2680 NDB Batched Update\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1655,"url":"https:\/\/www.flamingspork.com\/blog\/2009\/06\/09\/drizzle-pluggable-metadatastore-or-no-table-definition-file-on-disk\/","url_meta":{"origin":2223,"position":3},"title":"Drizzle pluggable MetadataStore (or: no table definition file on disk)","author":"Stewart Smith","date":"2009-06-09","format":false,"excerpt":"My code is shaping up rather nicely (see https:\/\/code.launchpad.net\/~stewart\/drizzle\/discovery) and I'm planning to submit a merge-request for it later today. I'm about to commit code that implements a MetadataStore for the ARCHIVE engine. This means that for ARCHIVE tables, you only have the .ARZ file on disk. The table definition\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"https:\/\/www.flamingspork.com\/blog\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3345,"url":"https:\/\/www.flamingspork.com\/blog\/2013\/05\/23\/mysql-vs-drizzle-plugin-api\/","url_meta":{"origin":2223,"position":4},"title":"MySQL vs Drizzle plugin APIs","author":"Stewart Smith","date":"2013-05-23","format":false,"excerpt":"There's a big difference in how plugins are treated in MySQL and how they are treated in Drizzle. The MySQL way has been to create a C API in front of the C++-like (I call it C- as it manages to take the worst of both worlds) internal \"API\". The\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3304,"url":"https:\/\/www.flamingspork.com\/blog\/2013\/05\/13\/the-mysql-cluster-storage-engine\/","url_meta":{"origin":2223,"position":5},"title":"The MySQL Cluster storage engine","author":"Stewart Smith","date":"2013-05-13","format":false,"excerpt":"This is one close to my heart. I've recently written on other storage engines:\u00c2\u00a0Where are they now: MySQL Storage Engines,\u00c2\u00a0The MERGE storage engine: not dead, just resting\u00e2\u20ac\u00a6. or forgotten\u00c2\u00a0and The MEMORY storage engine. Today, it's the turn of MySQL Cluster. Like InnoDB, MySQL Cluster started outside of MySQL. Those of\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/2223","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=2223"}],"version-history":[{"count":2,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/2223\/revisions"}],"predecessor-version":[{"id":2225,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/2223\/revisions\/2225"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=2223"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=2223"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=2223"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}