Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!cs.utexas.edu!usc!apple!sun-barr!newstop!sun!chiba!khb
From: khb%chiba@Sun.COM (chiba)
Newsgroups: comp.lang.fortran
Subject: Re: C vs. FORTRAN
Message-ID: <119590@sun.Eng.Sun.COM>
Date: 4 Aug 89 16:51:25 GMT
References: <3289@ohstpy.mps.ohio-state.edu>
Sender: news@sun.Eng.Sun.COM
Reply-To: khb@sun.UUCP (chiba)
Organization: Sun Microsystems, Mountain View
Lines: 106

In article <3289@ohstpy.mps.ohio-state.edu> SMITHJ@ohstpy.mps.ohio-state.edu writes:
>
>What are the reasons for using FORTRAN over C?
>
>I have heard claims that the code is easier to optimize and that current
>benchmarks show that this is the case on all mainframes.  My experience is
>that this is pure bu****it. 

No, it is quite true.

> Have done my own testing, I find that C runs faster on PC's.

Ah. A large sample space, involving lots of different computer architectures.

Consider how optimizers work. Three very basic (and not the only)
problems exist:

1)	aliasing. It is not legal to reorder two arbitrary lines
	of C code...since nearly all lines of C code contain
	unconstrained pointer references. This inhibits 99%
	of all code motion....which is key to high performance
	on platforms from the CDC 6600 on.

	This is an open problem, and is considered very hard.
	The fact that no one has solved it despite a decade of
	effort should be noted.

2)	for vs do

	do loops are primitive, but map into all known hardware.
	for loops are more elegant, but since one is allowed
	to store into the index variable, many useful optimizations
	are harder (not impossible, as gcc proves).

	A simple program which does a simple loop and computes
	anything runs faster with Sun f77 than with Sun cc. gcc is
	more clever, but only for simple loops.

	With work, simple special cases can be detected, but it
	is work that is unnecessary for f77... and there is a finite
	amount of time folks are willing to wait for a compile ...
	so if one has to do extensive analysis to get BACK to the point
	at which f77 STARTS, other optimizations will have to be skipped.

3)	many things which are OPERATORS in f77 are LIBRARY routines
	in C. See note below.
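
To make point 1 concrete, here is a minimal sketch in C. The name
add_scale is made up for illustration; it is not from any real library.
Since scale is a pointer, it may point into the array a, so the compiler
generally has to reload *scale on each trip through the loop instead of
keeping it in a register. In f77 a subroutine that stores into one of
its arguments may not be called with that argument aliased to another,
so the compiler is free to treat the load as loop-invariant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add *scale to each of the n elements of a.
   Nothing in the signature stops scale from pointing into a. */
void add_scale(double *a, size_t n, const double *scale)
{
    size_t i;

    assert(a != NULL && scale != NULL);
    for (i = 0; i < n; i++)
        a[i] += *scale;   /* *scale may have changed on the previous trip */
}
```

Called with a separate scale the result is the obvious one. Called as
add_scale(y, 3, &y[0]) with y = {1, 2, 3}, the first store changes the
value being added, and y ends up as {2, 4, 5}.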


c optimizers cannot do as good a job as f77 optimizers, without doing
much more extensive analysis...and if one has the computational
resources to do a better analysis, the f77 optimizer could do even
better. 
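
As a sketch of point 2 (both function names are hypothetical): an f77 DO
loop fixes its trip count on entry, and the DO variable may not be
redefined inside the loop. A C for loop carries no such guarantee, so
the compiler has to inspect the body before it can treat the loop as a
counted one.

```c
#include <assert.h>

/* DO-like loop: the index is never stored into, so the trip count
   is known to be n before the loop starts. */
int trips_counted(int n)
{
    int i, trips = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        trips++;
    return trips;
}

/* Legal C: the body stores into the index variable, so the trip
   count depends on what the body does. */
int trips_index_stored(int n)
{
    int i, trips = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (i % 3 == 0)
            i++;          /* store into the index */
        trips++;
    }
    return trips;
}
```

For n = 10 the first function returns 10 and the second returns 7.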

The counter-examples are all on small computers, where the f77
compilers are not really optimizing (marketing hype to the contrary).


... excerpt from a memo from dgh about a particular complaint about c
vs f77

"cos" in Fortran is an OPERATOR.  The compiler knows what it means.
It may generate a 68881 fcos instruction immediately.  To get
something different you have to instruct the compiler that in this
program "cos" does not mean cosine.

"cos" in C is a FUNCTION.  The compiler does NOT know what it means.
The compiler must generate a function call to an external function
which is found in a library by the linker.  There is no way to
instruct the compiler that "cos" means cosine.  SVID and ANSI C
aggravate this problem by requiring a lot of ill-considered errno
setting in the elementary transcendental functions so a simple fcos
instruction isn't adequate.

This is a fundamental distinction between C and Fortran.  On Suns, in
order to get close to comparable performance between double precision
C and double precision Fortran, you need to compile with the same
optimization (might as well use -O4 routinely when you want to
optimize, it usually doesn't hurt) and use inline expansion templates
which are a kluge around the problem.  They also solve the SVID and
ANSI C problem the way Cray does, by ignoring them.

To get comparable performance in single precision, you also have to
compile the C program with -fsingle.

The sunspots poster aggravated the problem by using a dumb benchmark
that can be readily optimized away to nothing if you know what "cos"
means as does a Fortran compiler.

....
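
The benchmark complaint in the memo can be sketched without the math
library. Here slow_fn is a made-up stand-in for an external routine such
as cos, and call_count is a made-up counter. Every trip through the loop
computes the same value, so a compiler that knows the routine is a pure
function could keep one call and drop the rest. A C compiler that only
sees a call to something external has to assume each call may have side
effects, and in this sketch it really does.

```c
#include <assert.h>

int call_count = 0;   /* hypothetical: a visible side effect of each call */

/* Hypothetical stand-in for an external library routine like cos. */
double slow_fn(double x)
{
    call_count++;
    return x * 0.5;
}

/* "Benchmark" loop: calls f(x) n times and returns the last result. */
double bench(double (*f)(double), double x, int n)
{
    double r = 0.0;
    int i;

    assert(f != 0 && n >= 0);
    for (i = 0; i < n; i++)
        r = f(x);         /* same argument, same result, every trip */
    return r;
}
```

bench(slow_fn, 2.0, 100) returns 1.0 and leaves call_count at 100.
Collapsing the loop to a single call would leave it at 1, which is why
the compiler may not do so unless it knows what the routine means.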

As machines get more interesting (long pipes, multiple functional
units, etc.) the performance difference between C and f77 becomes more
pronounced. Since C compilers cannot restructure the code as
extensively, the size of basic blocks tends to be small (as recent
posters to comp.arch have been noting ... w/o noticing that the
frequent branch instructions could be avoided) and the hardware is
poorly exploited.


Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |	Marketing Technical Specialist    ! kbierman@sun.com
I Voted for Bill &    |   Languages and Performance Tools. 
Opus  (* strange as it may seem, I do more engineering now     *)