{"id":1405,"date":"2009-02-18T18:37:53","date_gmt":"2009-02-18T08:37:53","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/?p=1405"},"modified":"2009-02-21T13:24:20","modified_gmt":"2009-02-21T03:24:20","slug":"fun-with-387","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2009\/02\/18\/fun-with-387\/","title":{"rendered":"Fun with the 387"},"content":{"rendered":"<p>Filed <a href=\"http:\/\/gcc.gnu.org\/bugzilla\/show_bug.cgi?id=39228\">GCC bug 39228<\/a>:<\/p>\n<pre>#include &lt;stdio.h&gt;\r\n#include &lt;math.h&gt;\r\nint main()\r\n{\r\n        double a= 10.0;\r\n        double b= 1e+308;\r\n        printf(\"%d %d %d\\n\", isinf(a*b), __builtin_isinf(a*b), __isinf(a*b));\r\n        return 0;\r\n}<\/pre>\n<p>mtaylor@drizzle-dev:~$ gcc -o test test.c<br \/>\nmtaylor@drizzle-dev:~$ .\/test<br \/>\n0 0 1<br \/>\nmtaylor@drizzle-dev:~$ gcc -o test test.c -std=c99<br \/>\nmtaylor@drizzle-dev:~$ .\/test<br \/>\n1 0 1<br \/>\nmtaylor@drizzle-dev:~$ gcc -o test test.c -mfpmath=sse -march=pentium4<br \/>\nmtaylor@drizzle-dev:~$ .\/test<br \/>\n1 1 1<br \/>\nmtaylor@drizzle-dev:~$ g++ -o test test.c<br \/>\nmtaylor@drizzle-dev:~$ .\/test<br \/>\n1 0 1<\/p>\n<p>Originally I found that the simple isinf() case behaves differently on x86 than on x86-64, ppc32 and sparc (32 and 64).<\/p>\n<p>After more research, I found that x86-64 uses the SSE instructions to do it (and using SSE is the only way for __builtin_isinf() to produce correct results). 
The g++-built version calls __isinf() instead of inlining (and, as can be seen, the __isinf() version is always correct).<\/p>\n<p>Specifically, it&#8217;s because the optimised 387 code does the math in double extended precision inside the FPU: 10.0*1e308 fits in 80 bits but not in 64 bits. Any code that forces the result to be stored and loaded gets the correct result too, e.g.:<\/p>\n<p>mtaylor@drizzle-dev:~$ cat test-simple.c<\/p>\n<pre>#include &lt;stdio.h&gt;\r\n#include &lt;math.h&gt;\r\nint main()\r\n{\r\n        double a= 10.0;\r\n        double b= 1e+308;\r\n        volatile double c= a*b;\r\n        printf(\"%d\\n\", isinf(c));\r\n        return 0;\r\n}<\/pre>\n<p>mtaylor@drizzle-dev:~$ gcc -o test-simple test-simple.c<br \/>\nmtaylor@drizzle-dev:~$ .\/test-simple<br \/>\n1<\/p>\n<p>With this code you can easily see the store and load:<\/p>\n<pre> 8048407:       dc 0d 18 85 04 08       fmull  0x8048518\r\n 804840d:       dd 5d f0                fstpl  -0x10(%ebp)\r\n 8048410:       dd 45 
f0                fldl   -0x10(%ebp)\r\n 8048413:       d9 e5                   fxam<\/pre>\n<p>If you remove volatile, however, the store and load don&#8217;t happen (at least at -O3; at -O0 they aren&#8217;t optimised away):<\/p>\n<pre> 8048407:       dc 0d 18 85 04 08       fmull  0x8048518\r\n 804840d:       c7 44 24 04 10 85 04    movl   $0x8048510,0x4(%esp)\r\n 8048414:       08\r\n 8048415:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)\r\n 804841c:       d9 e5                   fxam<\/pre>\n<p>This is also a regression from GCC 4.2.4, which just calls isinf() and doesn&#8217;t expand the 387 code inline. 
My guess is the 387 optimisation was added in 4.3.<\/p>\n<p>Recommended fix: store and load in the 387 version so as to operate on the same precision as elsewhere.<\/p>\n<p>Now I just have to write a patch I like that makes Drizzle behave in the face of this (it showed up as a failure in the SQL func_math test) and then submit it to MySQL as well&#8230; as this may happen there too if &#8220;correctly&#8221; built.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Filed GCC bug 39228: #include &lt;stdio.h&gt; #include &lt;math.h&gt; int main() { double a= 10.0; double b= 1e+308; printf(&#8220;%d %d %d\\n&#8221;, isinf(a*b), __builtin_isinf(a*b), __isinf(a*b)); return 0; } mtaylor@drizzle-dev:~$ gcc -o test test.c mtaylor@drizzle-dev:~$ .\/test 0 0 1 &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2009\/02\/18\/fun-with-387\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[76,75,14],"tags":[81,85,83,82,84,86],"class_list":["post-1405","post","type-post","status-publish","format-standard","hentry","category-code","category-drizzle-work-et-al","category-mysql","tag-81","tag-bug","tag-floating-point","tag-fpu","tag-fun","tag-gcc"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-mF","jetpack-related-posts":[{"id":2323,"url":"https:\/\/www.flamingspork.com\/blog\/2011\/03\/17\/things-ive-done-in-drizzle\/","url_meta":{"origin":1405,"position":0},"title":"Things I&#8217;ve done in Drizzle","author":"Stewart Smith","date":"2011-03-17","format":false,"excerpt":"When writing my Dropping ACID: Eating Data in a Web 2.0 Cloud World talk for LCA2011 I came to the realisation that I had forgotten a lot of the things I had worked on in MySQL and MySQL Cluster. 
So, as a bit of a retrospective as part of the\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1403,"url":"https:\/\/www.flamingspork.com\/blog\/2009\/02\/18\/floating-point-is-not-fun\/","url_meta":{"origin":1405,"position":1},"title":"floating point is not fun","author":"Stewart Smith","date":"2009-02-18","format":false,"excerpt":"#include <stdio.h> #include <math.h> int main() { double a= 10.0; double b= 1e+308; printf(\"%d\\n\",isinf(a * b)); return 0; } Prints 1 on: 64bit intel, 32bit PowerPC, 32bit SPARC, 64bit Sparc. But prints zero on 32bit intel. Oh, but if you build that with g++ instead of gcc on 32bit Intel,\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1320,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/12\/18\/virtualbox-210-and-opensolaris-200811\/","url_meta":{"origin":1405,"position":2},"title":"VirtualBox 2.1.0 (and OpenSolaris 2008.11)","author":"Stewart Smith","date":"2008-12-18","format":false,"excerpt":"Upgraded VirtualBox and booted up my OpenSolaris VM. VirtualBox 2.1.0 finally fixes the bug where if 127.0.0.1 was in resolv.conf on the host - no DNS for you in the guest (unless in the guest you were running a DNS server). Haven't tried it yet... 
but OpenGL Acceleration makes at\u2026","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2196,"url":"https:\/\/www.flamingspork.com\/blog\/2010\/11\/15\/limiting-functions-to-32k-stack-in-drizzle-and-scoped_ptr\/","url_meta":{"origin":1405,"position":3},"title":"Limiting functions to 32k stack in Drizzle (and scoped_ptr)","author":"Stewart Smith","date":"2010-11-15","format":false,"excerpt":"I wonder if this comes under \"Code Style\" or not... Anyway, Monty and I finished getting Drizzle ready for adding \"-Wframe-larger-than=32768\" as a standard compiler flag. This means that no function within the Drizzle source tree can use greater than 32kb stack - it's a compiler warning - and with\u2026","rel":"","context":"In &quot;code&quot;","block_context":{"text":"code","link":"https:\/\/www.flamingspork.com\/blog\/category\/code\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1653,"url":"https:\/\/www.flamingspork.com\/blog\/2009\/06\/09\/drizzle-tarballs-for-next-milestone-aloha\/","url_meta":{"origin":1405,"position":4},"title":"Drizzle Tarballs for next milestone: aloha","author":"Stewart Smith","date":"2009-06-09","format":false,"excerpt":"Wanting a quick build-and-play way to get Drizzle? We're dropping weekly-ish tarballs for the Aloha milestone. 
The latest milestone also has preliminary GCC 4.4 support You can see regular announcements on: http:\/\/planetdrizzle.org\/ http:\/\/blog.drizzle.org\/ - which is just announcements and the like.","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1276,"url":"https:\/\/www.flamingspork.com\/blog\/2008\/12\/02\/is-your-garage-internet-enabled\/","url_meta":{"origin":1405,"position":5},"title":"Is your garage internet enabled?","author":"Stewart Smith","date":"2008-12-02","format":false,"excerpt":"Real noisy fucker. So loud, that if it's in the garage but the back door is open, I still hear it. Being used for drizzle dev on Solaris... although a switch to OpenSolaris or Linux is likely imminent. Straight Solaris 10 is just too annoying.","rel":"","context":"In &quot;drizzle&quot;","block_context":{"text":"drizzle","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/drizzle-work-et-al\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.flamingspork.com\/blog\/wp-content\/uploads\/2008\/12\/pic-0025-225x300.jpg?resize=350%2C200","width":350,"height":200},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1405","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=1405"}],"version-history":[{"count":4,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1405\/revisions"}],"predecessor-version":[{"id":1408,"
href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/1405\/revisions\/1408"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=1405"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=1405"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=1405"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}