Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!sdd.hp.com!ucsd!ogicse!borasky
From: borasky@ogicse.ogi.edu (M. Edward Borasky)
Newsgroups: comp.benchmarks
Subject: Re: SPEC vs. Dhrystone
Message-ID: <15565@ogicse.ogi.edu>
Date: 3 Jan 91 15:46:45 GMT
References: <44342@mips.mips.COM> <15379@ogicse.ogi.edu> <44353@mips.mips.COM> <1685@marlin.NOSC.MIL> <15546@ogicse.ogi.edu> <44465@mips.mips.COM>
Distribution: comp.benchmarks
Organization: Oregon Graduate Institute (formerly OGC), Beaverton, OR
Lines: 61

In article mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
>>>>>> On 3 Jan 91 06:18:59 GMT, mash@mips.COM (John Mashey) said:
>
>mash> 1C: Compiler gimmickry
>mash> For any important benchmark that is small, compilers will get tuned
>mash> in ways that are absolutely useless in real life. This has happened
>mash> at least with Whetstone, Dhrystone, and LINPACK.
>
>So what optimizations have been performed on the LINPACK 100x100 code
>that are "absolutely useless" in real life?

1. There was a compiler once that actually CHEATED on the LINPACK
benchmark.  Any time it encountered a routine called "SGEFA" or
"DGEFA", it checked the number of parameters and their types, and if
they agreed with the standard usage in the LINPACK benchmark, it
branched off to a machine-coded SGEFA or DGEFA, ILLEGALLY UNDER THE
FORTRAN RULES IGNORING THE SGEFA/DGEFA CODE THAT IS PRESENT IN THE
LINPACK BENCHMARK.  There were also pre-processors that stripped out
SGEFA/DGEFA or SAXPY/DAXPY so that they would be linked in from
libraries rather than compiled.  That is a fine optimization technique
IF CONTROLLED BY THE USER -- it is the whole REASON LINPACK was written
using two levels of libraries (see the sketch of that two-level
structure at the end of this post).  But the compiler CANNOT know,
without looking at the submitted code, whether these cheats will
produce correct results.

2. Another cheat was once pulled on the 1000-equation LINPACK
benchmark.  This vendor used a Gauss-Jordan reduction, in which all the
vectors are of length 1000, rather than the decreasing-length algorithm
normally used.  That in itself is legal; you can use any algorithm you
want to.  Unfortunately, the G-J reduction uses twice as many
operations as the standard one, and when they computed their MEGAFLOPS
they divided the number of FLOPS they had actually done by the time it
took.  The Dongarra rules allow you to claim only the FLOPS required by
the STANDARD LINPACK algorithm (see the rated-MFLOPS sketch at the end
of this post), so they claimed twice the speed they were allowed to.
In fact, they also ran the benchmark in 32 bits instead of 64, to get
another 2X speed boost.  It turns out that the matrix in the LINPACK
benchmark is ill-conditioned enough to blow up in 32-bit arithmetic,
and Dongarra got suspicious.

Even discounting blatant cheats like the ones I describe above, there
is a wider issue here: how much COMPILE time is a user willing to
expend for "perfect" compiles?  It is possible, using whole-program
compilation techniques, to optimize out all the unnecessary checks in
the LINPACK benchmark -- it is simple enough and well-structured enough
that a compiler can "realize" that DAXPY is always called with
increment 1, that DAXPY can be in-lined, that N is always greater than
zero, that one-trip DO loops will work, and so on (see the last sketch
at the end of this post for what that buys).  But these optimizations
take time -- so much time that on real codes larger than LINPACK, one
of two things will happen:

1. The user will think the compiler has gone into an endless loop and
   zap the compile, or

2. The compiler writer, probably interfacing with Marketing, will
   build lots of escapes into the optimizer so that compile times
   appear to be finite.

There ARE users who claim they will tolerate long optimization times;
my experience in Marketing was that such users were few and far
between.  The main uses for such a compiler are large third-party
application codes that are seldom compiled -- once or twice a year for
the ones I worked with.  But the size of these codes works AGAINST the
whole-program compiler!
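
For the curious, here is a stripped-down sketch of the two-level
structure I mentioned in item 1.  It is paraphrased, NOT the actual
LINPACK source: DGEFA does the factorization bookkeeping and leaves
the arithmetic to the Level-1 BLAS kernel DAXPY.

      SUBROUTINE DAXPY(N, DA, DX, INCX, DY, INCY)
*     Reference-style kernel: DY := DA*DX + DY.  Unit-stride case
*     only, for brevity; the real kernel also handles strides .NE. 1.
      INTEGER N, INCX, INCY, I
      DOUBLE PRECISION DA, DX(*), DY(*)
      IF (N .LE. 0 .OR. DA .EQ. 0.0D0) RETURN
      DO 10 I = 1, N
         DY(I) = DY(I) + DA*DX(I)
   10 CONTINUE
      RETURN
      END

Inside DGEFA the column updates are just a string of calls of the form
CALL DAXPY(N-K, T, A(K+1,K), 1, A(K+1,J), 1), so a site can
legitimately satisfy that CALL at link time with a vendor-tuned DAXPY
instead of compiling the reference one -- entirely under the user's
control, no pattern-matching on routine names required.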
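
And here is a toy illustration of the rated-MFLOPS arithmetic from
item 2.  The program and its variable names are mine, and the timing
is made up; the point is only that the Dongarra rules credit you with
the operation count of the STANDARD algorithm, 2/3*N**3 + 2*N**2,
divided by your time -- not with whatever count your own algorithm
happened to perform.

      PROGRAM RATED
*     Nominal LINPACK operation count for N equations, per the
*     Dongarra rules, divided by a (made-up) elapsed time in seconds.
      INTEGER N
      DOUBLE PRECISION OPS, SECS, RMFLOP
      N = 1000
      OPS = (2.0D0*DBLE(N)**3)/3.0D0 + 2.0D0*DBLE(N)**2
      SECS = 10.0D0
      RMFLOP = OPS / (SECS*1.0D6)
      PRINT *, 'Rated MFLOPS for N = ', N, ': ', RMFLOP
      END

Run an algorithm that does twice the arithmetic in the same wall-clock
time and your honest MFLOPS number does not change; only the time in
the denominator matters.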
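
Finally, a schematic of what the whole-program optimizations I listed
buy you.  The subroutine and the name COLUPD are mine -- this is the
shape of the result, not any particular compiler's output and not the
real DGEFA (which also does the pivot swap): once the compiler knows
the increments are 1 and the trip count is positive, the DAXPY call
collapses into a bare in-line loop.

      SUBROUTINE COLUPD(A, LDA, N, K, T)
*     Column updates after whole-program optimization: DAXPY in-lined,
*     increments hard-wired to 1, and the zero-trip test in DAXPY
*     dropped because the compiler knows N-K is positive here.
      INTEGER LDA, N, K, I, J
      DOUBLE PRECISION A(LDA,*), T(*)
      DO 30 J = K+1, N
         DO 20 I = K+1, N
            A(I,J) = A(I,J) + T(J)*A(I,K)
   20    CONTINUE
   30 CONTINUE
      RETURN
      END

No call overhead, no argument passing, no stride bookkeeping -- exactly
the kind of result that looks wonderful on a benchmark the size of
LINPACK and costs enormous compile time to reproduce on a real
application.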