{"id":1587,"date":"2009-04-08T14:28:08","date_gmt":"2009-04-08T04:28:08","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/?p=1587"},"modified":"2009-04-08T14:28:54","modified_gmt":"2009-04-08T04:28:54","slug":"using-dtrace-to-find-out-why-solaris-is-so-slow-and-if-its-actually-solaris","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2009\/04\/08\/using-dtrace-to-find-out-why-solaris-is-so-slow-and-if-its-actually-solaris\/","title":{"rendered":"Using Dtrace to find out if the hardware or Solaris is slow (but really just working around the problem)"},"content":{"rendered":"<p>A little while ago, I was the brave soul tasked with making sure Drizzle was working properly and passing all tests on Solaris and OpenSolaris.<a href=\"http:\/\/krow.livejournal.com\/634896.html\"> Brian recently blogged<\/a> about some of the advantages of also running on Solaris and the SunStudio compilers &#8211; more warnings from the compiler is a good thing. Many kudos goes to Monty Taylor for being the brave soul who fixed most of the compiler warnings (and for us, warnings=errors &#8211; so we have to fix them) for the SunStudio compilers before I got to making te tests work.<\/p>\n<p>So, I got to the end of it all and got pointed to an OpenSolaris x86 box where the drizzleslap test was timing out. The timeout for tests is some amazingly long amount of time &#8211; 15 minutes. All the drizzle-test-run tests are rather short tests.<\/p>\n<p>To make running the tests quick, I usually LD_PRELOAD <a href=\"https:\/\/launchpad.net\/libeatmydata\">libeatmydata<\/a> &#8211; a simple way of disabling pesky things like fsync that take a long time (rumors that I nickname it libmacosxsimulation are entirely true). It&#8217;s pretty simple to build libeatmydata on Solaris too (I periodically do this and always intend to check in the associated Makefile but never do).<\/p>\n<p>Unfortunately, on OpenSolaris a bunch of things are built 32bit and others 64bit and just doing &#8220;LD_PRELOAD=libeatmydata.so .\/dtr&#8221; doesn&#8217;t work &#8211; I&#8217;d have to modify the test script to only do the LD_PRELOAD for drizzled &#8211; which is annoying.<\/p>\n<p>On my T1000 running Debian, the drizzleslap test takes 42 seconds to complete with libeatmydata, or 393 seconds when it&#8217;s really doing fsyncs. So for it to be timing out on this OpenSolaris x86 box &#8211; i.e. taking more than 15 minutes, was strange.<\/p>\n<p>So&#8230; what was going on? Step 1: is anything actually going on? One way to test this is to see if disk IO is being generated. On Linux, we can use &#8220;iostat&#8221;. On Solaris, we can use &#8220;zpool iostat&#8221;. Things were going to disk for the whole time of the test. Time to compare what the difference between the platforms is.<\/p>\n<p>Well.. a typical way that tests have taken forever have been because of lots of transactions: i.e. lots of fsync(). You are then dependent on the fsync() performance.<\/p>\n<p>If we look at &#8220;iostat -x&#8221; and the avgrq-sz field on Linux, we&#8217;ll see that the average request size is on 10-12 sectors (512 byte blocks). i.e. about 5 or 6kb.<\/p>\n<p>If we look at &#8220;zpool iostat 1&#8221; on OpenSolaris, we see a bit of a different story, but similar enough that you could safely assume that lots of small synchronous IOs were going on. After a bit of reading of the ZFS on-disk format documents, I had a slightly better idea what was going on that could be causing me seeing a larger average request size on ZFS than on Linux with XFS.<\/p>\n<p>So&#8230; perhaps it&#8217;s the speed of these syncs? Ordinarily, I&#8217;d just write up a quick LD_PRELOAD library that wraps fsync() and times it (perhaps writing to a file so I could do analysis on it later). Since I was working on Solaris&#8230; I thought I&#8217;d try DTrace. Some google-foo and dtrace hacking later, I tried this:<\/p>\n<p>stewart@drizzle-dev:~\/drizzle\/sparc$ time pfexec dtrace -n &#8216;syscall::fdsync:entry \/execname == &#8220;drizzled&#8221; \/ { self-&gt;ts[self-&gt;stack++] = timestamp; } syscall::fdsync:return \/self-&gt;ts[self-&gt;stack &#8211; 1]\/ { this-&gt;elapsed = timestamp &#8211; self-&gt;ts[&#8211;self-&gt;stack]; @[probefunc] = count(); @a[probefunc] = quantize(this-&gt;elapsed); self-&gt;ts[self-&gt;stack] =0; }&#8217;<\/p>\n<pre>dtrace: description 'syscall::fdsync:entry ' matched 2 probes\r\n^C\r\n\r\n  fdsync                                                         1600\r\n  fdsync\r\n           value  ------------- Distribution ------------- count\r\n        33554432 |                                         0\r\n        67108864 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   1520\r\n       134217728 |@@                                       79\r\n       268435456 |                                         1\r\n       536870912 |                                         0        \r\n\r\nreal\t4m26.837s\r\nuser\t0m0.657s\r\nsys\t0m0.566s<\/pre>\n<p>Which did seem like an awful long time for an fsync() to take. Although the filesystem was on a single disk, it was meant to be made remotely recently, and it&#8217;s sitting on a Sun controller&#8230; so it should be a bit better than that. From reading some of the ZFS on-disk spec, it could be some bug that means we&#8217;re waiting for a checkpoint to be written instead of forcing the sync out when we call fsync() &#8211; but I sought another solution (as on other Solaris\/OpenSolaris systems this wasn&#8217;t a problem &#8211; so perhaps fixed in newer kernels or it&#8217;s a driver issue).<\/p>\n<p>So I went and added &#8220;&#8211;commit=100&#8221; to a bunch of places in the drizzleslap test to batch things into transactions. The idea being to greatly reduce the number of fsync() calls to bring the execution time of the drizzleslap test on the machine to get below 15minutes. A bit of jiggerypokery later (some tests needed to not have the &#8211;commit to avoid various locking foo) and I had something that should run.<\/p>\n<p>Now, ~113 seconds on the T1000 on Linux (with a single SATA disk, down from an original 393 seconds) and ~437 seconds on the OpenSolaris box. For giggles, tried it on a Solaris box that&#8217;s running UFS on a 10k RPM SAS drive: ~44 seconds.<\/p>\n<p>In Summary:<\/p>\n<p>T1000, Linux, libeatmydata, XFS: ~42 seconds (before optim)<br \/>\nT1000, Linux, 7200RPM SATA, XFS: ~113 seconds<br \/>\nT5240, Solaris 10, 10k RPM SAS, UFS: ~44 seconds<br \/>\n16 core Xeon, OpenSolaris, 7200RPM, ZFS: ~437 seconds<\/p>\n<p>So, on that hardware setup &#8211; <strong>something<\/strong> is strange. The 10k SAS drive on UFS on the CoolThreads box is really nice though&#8230;. makes me want that kind of disk here.<\/p>\n<p>This page was useful, and I used it as a basis for some of my DTrace scripts: <a href=\"http:\/\/fav.or.it\/post\/1146360\/dtrace-and-the-mighty-hercules\">http:\/\/fav.or.it\/post\/1146360\/dtrace-and-the-mighty-hercules<\/a><\/p>\n<p>Also thanks to several people on #opensolaris on Freenode who helped me out with various Solaris specific commands in tracking this down.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A little while ago, I was the brave soul tasked with making sure Drizzle was working properly and passing all tests on Solaris and OpenSolaris. Brian recently blogged about some of the advantages of also running on Solaris and the &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2009\/04\/08\/using-dtrace-to-find-out-why-solaris-is-so-slow-and-if-its-actually-solaris\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[75],"tags":[107,70,150,151,154,153,141,632,46,152],"class_list":["post-1587","post","type-post","status-publish","format-standard","hentry","category-drizzle-work-et-al","tag-debian","tag-drizzle","tag-dtrace","tag-fsync","tag-ld_preload","tag-libeatmydata","tag-linux","tag-opensolaris","tag-solaris","tag-t1000"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-pB","jetpack-related-posts":[{"id":1598,"url":"https:\/\/www.flamingspork.com\/blog\/2009\/03\/26\/drizzle-tests-all-pass-on-solarissparc\/","url_meta":{"origin":1587,"position":0},"title":"Drizzle tests all pass on Solaris\/Sparc","author":"Stewart Smith","date":"2009-03-26","format":false,"excerpt":"Stopping All Servers All 221 tests were successful. The servers were restarted 14 times Spent 1424.921 of 1521 seconds executing testcases (All tests have passed on OpenSolaris on x86 for a while now).","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1146,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/07\/16\/adventures-with-opensolaris\/","url_meta":{"origin":1587,"position":1},"title":"Adventures with OpenSolaris","author":"Stewart Smith","date":"2008-07-16","format":false,"excerpt":"So... some colleagues have been experimenting with DTrace a bit, and I've been (for a while now) wanting to experiment with it. The challenge now, instead of in the past, is that I'm setting up a Solaris based system - not getting one premade. I chose OpenSolaris as I'd previously\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2008\/07\/screenshot-300x198.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1276,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/12\/02\/is-your-garage-internet-enabled\/","url_meta":{"origin":1587,"position":2},"title":"Is your garage internet enabled?","author":"Stewart Smith","date":"2008-12-02","format":false,"excerpt":"Real noisy fucker. So loud, that if it's in the garage but the back door is open, I still hear it. Being used for drizzle dev on Solaris... although a switch to OpenSolaris or Linux is likely imminent. Straight Solaris 10 is just too annoying.","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2008\/12\/pic-0025-225x300.jpg?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1161,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/08\/05\/good-adventures-with-opensolaris\/","url_meta":{"origin":1587,"position":3},"title":"Good adventures with OpenSolaris","author":"Stewart Smith","date":"2008-08-05","format":false,"excerpt":"First of all, thanks to everyone who commented on my previous OpenSolaris entry (which wasn't really positive at all). I recently tried again - this time starting with an ISO of build 93. I'd recommend completely ignoring the 2008.05 release and going straight for the build 93 image. Installed easily\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1296,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/12\/11\/opensolaris-200811-first-impressions\/","url_meta":{"origin":1587,"position":4},"title":"OpenSolaris 2008.11 first impressions","author":"Stewart Smith","date":"2008-12-11","format":false,"excerpt":"Using the wonderful bittorrent, I got the CD image in next to no time (in contrast to the Solaris Express image I'm currently downloading via HTTP that's taking forever). Boot time in VirtualBox (off the ISO image) was rather quick, usual questions on keyboard layout and desired language (it'd be\u2026","rel":"","context":"In &quot;sun&quot;","block_context":{"text":"sun","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/sun\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1320,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/12\/18\/virtualbox-210-and-opensolaris-200811\/","url_meta":{"origin":1587,"position":5},"title":"VirtualBox 2.1.0 (and OpenSolaris 2008.11)","author":"Stewart Smith","date":"2008-12-18","format":false,"excerpt":"Upgraded VirtualBox and booted up my OpenSolaris VM. VirtualBox 2.1.0 finally fixes the bug where if 127.0.0.1 was in resolv.conf on the host - no DNS for you in the guest (unless in the guest you were running a DNS server). Haven't tried it yet... but OpenGL Accelleration makes at\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1587","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=1587"}],"version-history":[{"count":4,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1587\/revisions"}],"predecessor-version":[{"id":1608,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1587\/revisions\/1608"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=1587"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=1587"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=1587"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}