Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!sdd.hp.com!wuarchive!psuvax1!rutgers!modus!gear!cadlab!martelli
From: martelli@cadlab.sublink.ORG (Alex Martelli)
Newsgroups: comp.lang.c
Subject: Re: low level optimization
Message-ID: <790@cadlab.sublink.ORG>
Date: 24 Apr 91 12:02:45 GMT
References: <1991Apr9.213601.12309@agate.berkeley.edu>
Organization: CAD.LAB, Bologna, Italia
Lines: 74

rkowen@violet.berkeley.edu (Dario Bressanini) writes:
	...
:When i REALLY HAVE to optimize a program, first of all i use 
:a profiler to see where is the bottleneck, and THEN i try to optimize it;
:probably I am biased since i mostly write programs (90% in FORTRAN)
:for scientific computation (yes, I use floating points :-) ) where usually
:you have a single routine, or even a single loop, that takes 95% of 
:the CPU time.
Ciao Dario!  In interactive tasks such as solid modeling and CAD, you
would still find (in my experience) single bottlenecks responsible for
heavy slowdown of many tasks (no, not 95%, but 30-40% is not uncommon),
except that different tasks, and different users, would hit different
bottlenecks.  As a commercial program evolves over many releases, these
task-specific bottlenecks are found and removed one by one.

:In most cases the best way to gain speed was to change completely
:the algorithm, and not to make some very low level optimization.
I can't argue with that - particularly as the bottlenecks are removed
over the life of an evolving program, there are times when you must
fully redo the design of a major subsystem, including all data structures
and often even the interfaces to other subsystems, for REAL performance
improvement!

:Following the latest "this is faster than that" wars I had 
:the impression that they were pure void theoric discussions, without any
:connection with the "real world", at least to my world.
Since when do you mind 'pure void theoric discussions', DB...?-) {Personal
joke warning, DB and I are old Fidonet friends/enemies!-}

:Just in case.... I don't want to start the usual and useless war
:C vs FORTRAN etc..,i would like to use C for my programs, but in most
:cases i tried, the Code produced by the C compiler was 2 or 3 times
:slower that the fortran code.
On what platforms?  On workstation-class hardware, my experience is
just the reverse: despite all the theoretical advantages of Fortran,
which SHOULD be vastly easier to optimize for numerical computation
(no aliasing allowed...), I find most Fortran-generated code quite a bit
slower than C-generated code.  I posted a benchmark once where, on an
incredibly pure numerical computation (a 2D FFT over a 256x256 complex
matrix), you
could even go faster by converting Fortran to C with f2c, then compiling
the C results, than by directly compiling the Fortran source!!!  That
was on Sparc workstations with C and Fortran compilers of 18 months
ago, and I hope *this* particular aberration has since been removed,
but others remain and indeed loom larger and larger (bulk unformatted
I/O for checkpointing, for example, surely an important task for any
long-running program where the operating system is not so kind as to
do checkpointing for you transparently).  On PCs, it's even worse, as
if compiler vendors barely cared for Fortran performance (particularly
in bulk I/O) but were locked in a mad race for C performance, or
something like that...
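
To make the bulk unformatted I/O point concrete, here is a minimal sketch
of what checkpointing an array looks like on the C side: one fwrite() to
save, one fread() to restore.  The names checkpoint_save and
checkpoint_load are made up for illustration, and the file written this
way is only readable on a machine with the same double format and byte
order.

```c
#include <stdio.h>

/* Hypothetical helper: dump n doubles to `path` with a single
   unformatted (binary) write.  Returns 0 on success, -1 on failure. */
int checkpoint_save(const char *path, const double *data, size_t n)
{
    FILE *fp = fopen(path, "wb");
    size_t written;

    if (fp == NULL)
        return -1;
    written = fwrite(data, sizeof *data, n, fp);
    /* fclose can fail too (buffered data not flushed), so check it. */
    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}

/* Hypothetical helper: read n doubles back from `path`.
   Returns 0 on success, -1 if the file is missing or too short. */
int checkpoint_load(const char *path, double *data, size_t n)
{
    FILE *fp = fopen(path, "rb");
    size_t got;

    if (fp == NULL)
        return -1;
    got = fread(data, sizeof *data, n, fp);
    fclose(fp);
    return got == n ? 0 : -1;
}
```

A long-running program would call checkpoint_save() every so many
iterations and checkpoint_load() at restart; the Fortran counterpart is
an unformatted WRITE of the whole array.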
I do recall from my mainframe/supercomputing days that it USED to be
different - blazing FAST Fortran-generated code, versus barely passable
C-generated code - although I DON'T recall any '2 or 3 times' difference,
possibly because I worked on machines where double precision math was
just as fast as single precision, so C's traditional habit of doing float
arithmetic in double cost nothing (I believe this is something of a
'trademark' of IBM machines, from 3090s down to PCs with FP coprocessors,
and also including RS/6000 workstations).  Modern C compilers should let
you use single precision anyway where/when it matters (ANSI C allows it;
Sun C, while non-ANSI, offers a special option for it, etc).  Anyway, I'm
not surprised if your results are on supers/mainframes, since apparently
those guys are GOOD at writing Fortran compilers (maybe it is that they
CARE about it!), while their workstation colleagues appear to be either
less able or less motivated...
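
As a rough sketch of the single-precision point: under ANSI C, an
expression whose operands are all float may be evaluated in float, while
pre-ANSI compilers widened every float operand to double first.  The two
functions below (dot_single and dot_double are hypothetical names) spell
out both choices explicitly; whether the first one is actually faster
depends entirely on the compiler and the hardware.

```c
/* Accumulate in float.  An ANSI compiler is allowed to keep the whole
   loop in single precision; the 'f' suffix keeps the constant float. */
float dot_single(const float *x, const float *y, int n)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Accumulate in double.  The casts make the widening explicit, which
   is roughly what a pre-ANSI compiler did implicitly. */
double dot_double(const float *x, const float *y, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++)
        sum += (double)x[i] * (double)y[i];
    return sum;
}
```

On a machine where double costs the same as float the two should time
alike; where double is slower, the first form is the one worth measuring.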
If your results are instead obtained on workstations/PCs, we should
exchange more detailed notes - either you don't optimize your C well, or
I don't optimize my Fortran well...  and this DOES matter to me, since,
unlike you, I'd like to keep using Fortran for those numerically-heavy
parts of our programs where Fortran does perfectly fine, thank you,
WITHOUT having to pay a performance PRICE for it!!!
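
One example of the kind of C tuning I have in mind, tied to the aliasing
issue mentioned earlier: since C pointers may overlap, the compiler has
to assume a store through one pointer can change what another points at.
Copying a value into a local before the loop removes that doubt by hand.
This is only a sketch; axpy_naive and axpy_hoisted are made-up names, and
how much it helps depends on the compiler.

```c
/* Naive form: y[i] might overlap *a, so the compiler must treat *a as
   possibly changing after every store to y[i]. */
void axpy_naive(double *y, const double *x, const double *a, int n)
{
    int i;

    for (i = 0; i < n; i++)
        y[i] += *a * x[i];
}

/* Hand-hoisted form: the scale factor is read once into a local, so it
   cannot change inside the loop no matter what y overlaps. */
void axpy_hoisted(double *y, const double *x, const double *a, int n)
{
    int i;
    double av = *a;

    for (i = 0; i < n; i++)
        y[i] += av * x[i];
}
```

The two give identical results when nothing overlaps, but differ if a
points into y (say a == &y[0]), which is exactly why the compiler cannot
do the hoisting for you.  Fortran rules that case out, so its compiler
can.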
-- 
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) martelli@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; 
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).