{"id":3755,"date":"2014-06-03T14:01:35","date_gmt":"2014-06-03T04:01:35","guid":{"rendered":"https:\/\/www.flamingspork.com\/blog\/?p=3755"},"modified":"2014-06-03T15:29:38","modified_gmt":"2014-06-03T05:29:38","slug":"mysql-5-6-performance-on-power8","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2014\/06\/03\/mysql-5-6-performance-on-power8\/","title":{"rendered":"MySQL 5.6 Performance on POWER8"},"content":{"rendered":"<p>The following sentence is brought to you by IBM Legal: The postings on this site are my own and don&#8217;t necessarily represent IBM&#8217;s positions, strategies or opinions.<\/p>\n<p>My <a href=\"https:\/\/www.flamingspork.com\/blog\/2014\/06\/03\/mysql-5-6-on-power-patch-available\/\">previous post<\/a> covered the work needed to get MySQL 5.6.17 running reliably on modern POWER systems. The <a href=\"https:\/\/flamingspork.com\/mysql\/mysql-5.6.17-POWER.patch\">patch to MySQL 5.6.17 that&#8217;s needed is available here<\/a>.<\/p>\n<p>For those who don&#8217;t know, <a href=\"https:\/\/en.wikipedia.org\/wiki\/POWER8\">POWER8<\/a> is the latest <a href=\"https:\/\/en.wikipedia.org\/wiki\/Power_Architecture\">Power Architecture<\/a> processor from <a href=\"http:\/\/www.ibm.com\">IBM<\/a> (my employer). These chips will be available in systems from IBM in June 2014 (i.e. Real Soon Now(TM)). There&#8217;s some fairly impressive specs and numbers (see Wikipedia and elsewhere) &#8211; but what could this mean for actual applications?<\/p>\n<p>Well, it turns out that MySQL is a pretty big thing in some target markets for POWER8, and inspired by <a href=\"http:\/\/dimitrik.free.fr\/blog\/archives\/2013\/10\/mysql-performance-the-road-to-500k-qps-with-mysql-57.html\">Dimitri&#8217;s impressive benchmark numbers<\/a>, I thought we should have a go on POWER8.<\/p>\n<p>Firstly, I focused on MySQL 5.6 as it is the current stable release. 
MySQL 5.7 will be the subject of a future blog post.<\/p>\n<p>The first step was to ensure that MySQL 5.6 worked correctly on POWER. My <a href=\"https:\/\/www.flamingspork.com\/blog\/2014\/06\/03\/mysql-5-6-on-power-patch-available\/\">previous blog post<\/a> covered the few bugs I ran into and filed (often with patches). This wasn&#8217;t too hard and I&#8217;m fairly confident the bug fixes are simple enough to get into MySQL 5.6 &#8211; I can&#8217;t comment on what would be\/could be &#8220;officially supported&#8221;, that&#8217;s a business discussion :)<\/p>\n<p>In order to ensure that my patch was not only correct but performing well, I needed a benchmark. For my initial benchmark, I chose sysbench point selects (i.e. read only key lookups), which should show the theoretical maximum queries per second you could pump through the MySQL Server while also really stressing the mutex code.<\/p>\n<p>A simple comparison of my early patch that used heavyweight memory barriers versus Yasufumi&#8217;s patch that used more lightweight ones showed that using heavyweight barriers could be as much as a 50% performance hit &#8211; so getting this code right is <strong>important<\/strong>.<\/p>\n<p>To add to the fun, the POWER8 processor has a few parameters you can tweak. There is the SMT mode, which dictates how many threads per core there are. This can be changed at runtime. You can be in SMT=off, SMT=2, SMT=4 or SMT=8. Typically, only some workloads can benefit from SMT8 rather than SMT4. There is also DSCR, which controls data prefetching. For sysbench point selects, I&#8217;ve found we do slightly better (around 10%) when DSCR is set to 1 rather than zero &#8211; but YMMV on other benchmarks.<\/p>\n<p>In my experiments, I&#8217;ve found that SMT4 or SMT8 seems to be the best bang for buck for MySQL workloads on POWER8. 
With SMT=2 rather than off, I&#8217;ve seen a ~50% performance boost in sysbench point select results. With SMT=4 I&#8217;ve seen another 50% boost (i.e. roughly double SMT=off performance). The benefit of SMT8 for MySQL 5.6 (and the 5.6 part is crucial here) may be minimal, especially for this benchmark. This is mostly due to hitting heavy mutex contention inside the MySQL server rather than anything else.<\/p>\n<p>POWER8 systems come in either single or dual socket, with the number of cores being a total of 4, 6, 8, 10, 12, 16, 20 or 24 depending on the configuration of the system (go check the IBM web site for specifics of what&#8217;s available in what model). This means with SMT8, a dual socket, 24 core POWER8 system has 192 hardware threads &#8211; the system I was using for these benchmarks. With this number of cores and hardware threads, those familiar with MySQL on multi core systems may already have an inkling that using the full capacity of such a system may be hard for MySQL.<\/p>\n<p>Certainly for old versions of MySQL (such as 5.0 or 5.1) you&#8217;re going to get nowhere near full system utilization on POWER8. For MySQL 5.6 (and in the future, 5.7) you have a much better hope.<\/p>\n<p>Before anyone asks, yes, I used jemalloc for most of my benchmarks and it helps by giving a single digit percent performance increase (around 3-4%).<\/p>\n<p>The bottlenecks inside MySQL 5.6 for the sysbench point select workload are fairly well documented, so at best we may be striving to equal the performance of other CPU architectures rather than get too much higher, simply due to hitting mutex contention in creating read views inside InnoDB. So the maximum performance will be a function of individual core CPU speed and the speed at which a lock can be acquired (i.e. 
related to how quickly you can bounce a cacheline with a lock between cores).<\/p>\n<p>This is exactly what I found on POWER8 with MySQL 5.6 &#8211; you hit the same bottleneck on POWER8 as you do everywhere else &#8211; creating read views in InnoDB.<\/p>\n<p>That being said, my maximum <strong>sysbench point select result on POWER8 was 344kQPS.<\/strong> This not only matches but <strong>exceeds<\/strong> the previous record holder by quite a decent amount.<\/p>\n<p>This number was across 8 tables with mysqld bound to a single NUMA node (6 cores) and sysbench bound to another NUMA node (6 cores) on the same socket. For this benchmark, due to the mutex contention, bringing the second socket into play didn&#8217;t improve performance. For other benchmarks (e.g. standard sysbench read only), it seems to scale with more CPU cores much better (no doubt the subject of a future blog post).<\/p>\n<p><strong>Single table sysbench point select<\/strong> was also impressive at <strong>335kQPS<\/strong> &#8211; you only got roughly an additional 10kQPS by going to 8 tables! All of these results were with SMT4 and DSCR=1, which seems to be the best configuration for this type of workload.<\/p>\n<p>Up next: MySQL 5.7 on POWER8.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following sentence is brought to you by IBM Legal: The postings on this site are my own and don&#8217;t necessarily represent IBM&#8217;s positions, strategies or opinions. 
My previous post covered the work needed to get MySQL 5.6.17 running reliably &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2014\/06\/03\/mysql-5-6-performance-on-power8\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[570,14],"tags":[294,628,577,579,568],"class_list":["post-3755","post","type-post","status-publish","format-standard","hentry","category-ibm-work-et-al","category-mysql","tag-benchmark","tag-mysql","tag-mysql-5-6","tag-performance","tag-power8"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-Yz","jetpack-related-posts":[{"id":3778,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/07\/17\/update-on-mysql-on-power8\/","url_meta":{"origin":3755,"position":0},"title":"Update on MySQL on POWER8","author":"Stewart Smith","date":"2014-07-17","format":false,"excerpt":"About 1.5 months ago I blogged on MySQL 5.6 on POWER andtalked about what I had to poke at to make modern MySQL versions run and run well on shiny POWER8 systems. 
One of those bugs, MySQL bug 47213 (InnoDB mutex\/rw_lock should be conscious of memory ordering other than Intel)\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3770,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/06\/05\/performance-impact-of-mysql-query-cache-on-modern-hardware\/","url_meta":{"origin":3755,"position":1},"title":"Performance impact of MySQL query cache on modern hardware","author":"Stewart Smith","date":"2014-06-05","format":false,"excerpt":"Recently, Morgan has been writing on deprecating some MySQL features and inspired by that while working on MySQL on POWER, I wondered \"What is the impact of the MySQL query cache on modern hardware?\" We've known for over six years (since before we started Drizzle) that the query cache hurt\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4019,"url":"https:\/\/www.flamingspork.com\/blog\/2015\/12\/18\/power8-accelerated-crc32-merged-in-mariadb-10-1\/","url_meta":{"origin":3755,"position":2},"title":"POWER8 Accelerated CRC32 merged in MariaDB 10.1","author":"Stewart Smith","date":"2015-12-18","format":false,"excerpt":"Earlier on in benchmarking MySQL and MariaDB on POWER8, we noticed that on write workloads (or read workloads involving a lot of IO) we were spending a bunch of time computing InnoDB page checksums. 
This is a relatively well known MySQL problem and has existed for many years and Percona\u2026","rel":"","context":"In &quot;IBM&quot;","block_context":{"text":"IBM","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/ibm-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4003,"url":"https:\/\/www.flamingspork.com\/blog\/2015\/10\/19\/1-million-sql-queries-per-second-ga-mariadb-10-1-on-power8\/","url_meta":{"origin":3755,"position":3},"title":"1 Million SQL Queries per second: GA MariaDB 10.1 on POWER8","author":"Stewart Smith","date":"2015-10-19","format":false,"excerpt":"A couple of days ago, MariaDB announced that MariaDB 10.1 is stable GA - around 19 months since the GA of MariaDB 10.0. With MariaDB 10.1 comes some important scalabiity improvements, especially for POWER8 systems. On POWER, we're a bit unique in that we're on the higher end of CPUs,\u2026","rel":"","context":"In &quot;IBM&quot;","block_context":{"text":"IBM","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/ibm-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3899,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/11\/11\/mysql-cluster-on-power8\/","url_meta":{"origin":3755,"position":4},"title":"MySQL Cluster on POWER8","author":"Stewart Smith","date":"2014-11-11","format":false,"excerpt":"So, I've written previously on MySQL on POWER, and today is a quick bit of news about MySQL Cluster on POWER - specifically MySQL Cluster 7.3.7. I ran into three main issues in getting some flexAsync benchmark results. 
One of them was the fact that I wanted to do this\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3758,"url":"https:\/\/www.flamingspork.com\/blog\/2014\/06\/03\/mysql-5-7-on-power\/","url_meta":{"origin":3755,"position":5},"title":"MySQL 5.7 on POWER","author":"Stewart Smith","date":"2014-06-03","format":false,"excerpt":"In a previous post, I covered porting MySQL 5.6 to POWER and subsequently, some new record performance numbers with MySQL 5.6.17 on POWER8. Well, those following at home will be aware that not only is the next sentence sponsored by IBM Legal, but that MySQL 5.7 alleviates a bunch of\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/3755","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=3755"}],"version-history":[{"count":5,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/3755\/revisions"}],"predecessor-version":[{"id":3763,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/3755\/revisions\/3763"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=3755"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/
v2\/categories?post=3755"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=3755"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}