Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!amdcad!crackle!tim
From: tim@crackle.amd.com (Tim Olson)
Newsgroups: comp.lang.c
Subject: Re: Fortran computes cosine 300 times faster than C (on Sun3)
Message-ID: <24764@amdcad.AMD.COM>
Date: 8 Mar 89 18:30:12 GMT
References: <765@uceng.UC.EDU>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Distribution: na
Organization: Advanced Micro Devices, Inc. Sunnyvale CA
Lines: 85

In article <765@uceng.UC.EDU> achhabra@uceng.UC.EDU (atul k chhabra) writes:
| I chanced upon a segment of code that runs approximately 300 times faster in
| FORTRAN than in C. I have tried the code on Sun3(OS3.5) and on Sun4(OS4.0)
| (of course, on Sun4 the -f68881 flag was not used.) The results are similar
| on both machines. Can anyone enlighten me on this bizarre result?

Welcome to the world of benchmarking.

You can see what happened if you take a look at the assembly language
generated by the compilers. In the FORTRAN version, there is no call
to the cosine routine; only an empty loop remains. This is because
cosine is a FORTRAN intrinsic which the compiler knows about. Since
you didn't use any of the results of the cosine calls, the compiler was
able to eliminate them entirely as "dead code".

The C version had to keep the cosine function calls, because cosine
isn't an intrinsic function in K&R C, so the compiler knows nothing of
what it does (it may have side-effects).

To get more realistic numbers, you have to "fake out" the compiler by
using the results of the calls:

________________________________________
/*
 * Compile using:
 *    cc -f68881 -O -o cosc cosc.c -lm.
 */
#include <math.h>

float
bench()
{
        int i;
        float tmp;

        for(tmp=0.0,i=0;i<262144;i++)
                tmp+=cos(2.5)*cos(2.5)*cos(2.5)*cos(2.5);
        return tmp;
}

main()
{
        float tmp;

        tmp = bench();
}
________________________________________
c     f77 -f68881 -O -o cosf cosf.f
c
      real function bench()
      integer i
      real tmp
      tmp = 0.0
      do 10 i=1,262144
         tmp = tmp+cos(2.5)*cos(2.5)*cos(2.5)*cos(2.5)
 10   continue
      bench = tmp
      end

      program cosf
      real tmp1
      tmp1 = bench()
      end
________________________________________

On a Sun 4/110:

crackle49 time cosc
35.3u 0.5s 0:37 95% 0+144k 1+0io 2pf+0w
crackle50 time cosf
19.4u 0.3s 0:20 96% 0+232k 0+0io 0pf+0w

This difference is mainly due to floating-point math being performed in
double precision in C, vs. single precision in FORTRAN.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)