Xref: utzoo comp.arch:17467 comp.lang.misc:5261
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!ucsd!rutgers!deejay!gear!cadlab!staff
From: staff@cadlab.sublink.ORG (Alex Martelli)
Newsgroups: comp.arch,comp.lang.misc
Subject: Re: Compiler Costs
Message-ID: <237@cadlab.sublink.ORG>
Date: 19 Jul 90 13:13:26 GMT
References: <1797@apctrc.UUCP> <1565@uvm-gen.UUCP>
Organization: CAD.LAB, Bologna, Italia
Lines: 27

cavrak@uvm-gen.UUCP (Steve Cavrak,113 Waterman,6561483,) writes:
>...
>An example with VS-FORTRAN on an IBM-3090 with a vector processor.
>a.  plain fortran matrix		1.0
>b.  full optimized fortran		0.3
>c.  IBM's ESSL hand coded library	0.1 or better

d.  NAG's implementation, written entirely in Fortran but "properly"
    coded with a block-submatrix algorithm: within 10% of ESSL!

That's for linear-algebra stuff...  Back in '88, I shared an office at
the IBM Italy Scientific Center with the NAG guy who was visiting there
to implement those routines.  By the way, their performance gains were
proving transferable to other machines with complex storage hierarchies
via a simple variation in the "good-block-size" parameter.  On the other
hand, I suspect the wondrously fast and precise library of Fortran
intrinsics, coded in assembler with lots of clever table lookups and
bit-twiddling, would suffer a LOT if it had to be reimplemented in
Fortran itself!  But in most cases, proper algorithms and, more
specifically, appropriate data-access patterns are THE key to good
numerical-computing performance - and Fortran remains adequate for
expressing such algorithms and data-access patterns.
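
To make the block-submatrix idea concrete, here is a minimal sketch of
a blocked matrix multiply.  It is written in Python purely for
illustration and is NOT the NAG or ESSL code; the function name
blocked_matmul and the parameter block_size (standing in for the
"good-block-size" knob) are my own hypothetical names.  The point is
only the loop structure: work proceeds one tile at a time, so the data
touched by the inner loops stays small enough to sit in the fast
levels of the storage hierarchy.

```python
def blocked_matmul(a, b, block_size=64):
    """Return the product of matrices a (n x k) and b (k x m).

    Matrices are plain lists of rows. block_size is a hypothetical
    tuning parameter: the edge length of the square tiles worked on.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    if not a or not b:
        raise ValueError("matrices must be non-empty")
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    c = [[0.0] * m for _ in range(n)]

    # Outer three loops walk over tiles; inner three loops do an
    # ordinary multiply-accumulate restricted to the current tiles.
    for i0 in range(0, n, block_size):
        i1 = min(i0 + block_size, n)
        for k0 in range(0, k, block_size):
            k1 = min(k0 + block_size, k)
            for j0 in range(0, m, block_size):
                j1 = min(j0 + block_size, m)
                for i in range(i0, i1):
                    row_a = a[i]
                    row_c = c[i]
                    for kk in range(k0, k1):
                        a_ik = row_a[kk]
                        row_b = b[kk]
                        for j in range(j0, j1):
                            row_c[j] += a_ik * row_b[j]
    return c
```

The result does not depend on block_size, only the order in which the
data is visited does; on a real machine one would tune that single
number per storage hierarchy and leave the rest of the code alone.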
-- 
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 45, Bologna, Italia
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; 
Fax: ++39 (51) 366964 (work only; any time of day or night).