Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!cs.utexas.edu!usc!apple!sun-barr!newstop!sun!chiba!khb
From: khb%chiba@Sun.COM (chiba)
Newsgroups: comp.lang.fortran
Subject: Re: C vs. FORTRAN
Message-ID: <119590@sun.Eng.Sun.COM>
Date: 4 Aug 89 16:51:25 GMT
References: <3289@ohstpy.mps.ohio-state.edu>
Sender: news@sun.Eng.Sun.COM
Reply-To: khb@sun.UUCP (chiba)
Organization: Sun Microsystems, Mountain View
Lines: 106

In article <3289@ohstpy.mps.ohio-state.edu> SMITHJ@ohstpy.mps.ohio-state.edu writes:
>
>What are the reasons for using FORTRAN over C?
>
>I have heard claims that the code is easier to optimize and that current
>benchmarks show that this is the case on all mainframes. My experience is
>that this is pure bu****it.

No, it is quite true.

> Have done my own testing, I find that C runs
>faster on PC's.

Ah. A large sample space, involving lots of different computer
architectures.

Consider how optimizers work. Three very basic (and not the only)
problems exist:

1) aliasing.

It is not legal to reorder two arbitrary lines of C code...since nearly
all lines of C code contain unconstrained pointer references. This
inhibits 99% of all code motion....which is key to high performance
on platforms from the cdc6600 on. This is an open problem, and is
considered very hard. The fact that no one has solved it despite a
decade of effort should be noted.

2) for vs do

do loops are primitive, but map into all known hardware. for loops are
more elegant, but since one is allowed to store into the index
variable, many useful optimizations are harder (not impossible, as gcc
proves). A simple program which does a simple loop and computes
anything, compiled with sun cc and sun f77, runs faster in f77. gcc is
more clever, but only for simple loops. With work, simple special cases
can be detected, but it is work that is unnecessary for f77... and
there is a finite amount of time folks are willing to wait for a
compile ...
so if one has to do extensive analysis to get BACK to the point at
which f77 STARTS, other optimizations will have to be skipped.

3) many things which are OPERATORS in f77 are LIBRARY routines in C.
See note below.

C optimizers cannot do as good a job as f77 optimizers without doing
much more extensive analysis...and if one has the computational
resources to do a better analysis, the f77 optimizer could do even
better.

The counter-examples are all on small computers, where the f77
compilers are not really optimizing (marketing hype to the contrary).

... excerpt from a memo from dgh about a particular complaint about C vs f77

"cos" in Fortran is an OPERATOR. The compiler knows what it means. It
may generate a 68881 fcos instruction immediately. To get something
different you have to instruct the compiler that in this program "cos"
does not mean cosine.

"cos" in C is a FUNCTION. The compiler does NOT know what it means.
The compiler must generate a function call to an external function
which is found in a library by the linker. There is no way to instruct
the compiler that "cos" means cosine. SVID and ANSI C aggravate this
problem by requiring a lot of ill-considered errno setting in the
elementary transcendental functions, so a simple fcos instruction isn't
adequate.

This is a fundamental distinction between C and Fortran. On Suns, in
order to get close to comparable performance between double precision
C and double precision Fortran, you need to compile with the same
optimization level (might as well use -O4 routinely when you want to
optimize; it usually doesn't hurt) and use inline expansion templates,
which are a kluge around the problem. They also solve the SVID and
ANSI C problem the way Cray does, by ignoring it. To get comparable
performance in single precision, you also have to compile the C
program with -fsingle.
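As a rough illustration of the errno point, here is a minimal sketch in C. It is not Sun's library code: lib_style_cos is a hypothetical stand-in with a crude polynomial, and the cutoff value and the choice of ERANGE are assumptions made only for the example. What it shows is that a routine which may write errno has a side effect the caller can observe, so the calls cannot simply be dropped even when their results are thrown away.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a library cosine (NOT the real libm routine).
 * The cutoff and the use of ERANGE are assumptions made for this sketch. */
double lib_style_cos(double x)
{
    double x2;

    if (x > 1.0e6 || x < -1.0e6) {
        errno = ERANGE;          /* side effect visible to the caller */
        return 0.0;
    }
    x2 = x * x;
    /* crude Taylor polynomial, adequate only near zero */
    return 1.0 - x2 / 2.0 + x2 * x2 / 24.0;
}

/* A loop whose results are discarded. It returns the number of calls
 * made, so there is something to report besides errno. */
int discard_loop(double x, int n)
{
    int i, calls = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        (void)lib_style_cos(x);  /* value unused, but errno may be written */
        calls++;
    }
    return calls;
}
```

A compiler that treats cosine as a known operator with no side effects is free to delete the body of such a loop. One that has to preserve the errno store is not, unless it can prove the store never happens.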

The sunspots poster aggravated the problem by using a dumb benchmark
that can be readily optimized away to nothing if you know what "cos"
means, as does a Fortran compiler.

....

As machines get more interesting (long pipes, multiple functional
units, etc.) the performance difference between C and f77 becomes more
pronounced. Since C compilers cannot restructure the code as
extensively, the size of basic blocks tends to be small (as recent
posters to comp.arch have been noting ... w/o noticing that the
frequent branch instructions could be avoided) and the hardware is
poorly exploited.

Keith H. Bierman    |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault   |   Marketing Technical Specialist  ! kbierman@sun.com
I Voted for Bill &  |   Languages and Performance Tools.
Opus  (* strange as it may seem, I do more engineering now *)
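
To make point 1) concrete, here is a small sketch in C. The function names are made up for illustration. add_scale is the code as written. add_scale_hoisted is what an optimizer would like to turn it into, by loading *scale once before the loop. The two agree only when scale does not point into a, which a C compiler generally cannot prove from the function alone. Fortran's rules for dummy arguments let the compiler assume there is no such overlap.

```c
#include <assert.h>
#include <stddef.h>

/* As written: *scale is read on every iteration, because a store to
 * a[i] might change the value that scale points to. */
void add_scale(double *a, size_t n, const double *scale)
{
    size_t i;

    assert(a != NULL && scale != NULL);
    for (i = 0; i < n; i++)
        a[i] += *scale;
}

/* The transformation an optimizer wants: hoist the load out of the loop.
 * This is equivalent to add_scale ONLY if scale does not alias a[]. */
void add_scale_hoisted(double *a, size_t n, const double *scale)
{
    size_t i;
    double s;

    assert(a != NULL && scale != NULL);
    s = *scale;                  /* loaded once, before any store */
    for (i = 0; i < n; i++)
        a[i] += s;
}
```

Called with scale pointing at a[0] and a = {1, 2, 3}, the first version yields {2, 4, 5} and the hoisted version yields {2, 3, 4}. So the compiler may not make the substitution on its own.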