Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.benchmarks
Subject: Re: Compilers & SPECmarks...
Message-ID: <MCCALPIN.91Apr8142743@pereland.cms.udel.edu>
Date: 8 Apr 91 18:27:43 GMT
References: <32097@shamash.cdc.com>
Sender: usenet@ee.udel.edu
Followup-To: comp.arch
Organization: College of Marine Studies, U. Del.
Lines: 39
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: rrr@u02.svl.cdc.com's message of 8 Apr 91 17:02:48 GMT

>> On 8 Apr 91 17:02:48 GMT, rrr@u02.svl.cdc.com (Rich Ragan) said:

Rich> There has been discussion in the past about trying to separate
Rich> out the effects of compilers from the underlying hardware performance.
		[......]
Rich> The only thing we changed was to use a new FORTRAN compiler jointly
Rich> developed with Kuck and Associates for the multi-processor CDC 4680.

-----------------------------------------------------------------------------
            gcc  espr. li   eqntott spice doduc nasa7 matrix fpppp tomcatv
-----------------------------------------------------------------------------
Mips 6280   46.0 42.4  54.6  41.2   38.4  43.0  45.6   49.8   55.6  43.3
CDC  4680   46.0 42.4  54.6  41.2   40.3  44.0  62.4  181.7   56.5  57.5
-----------------------------------------------------------------------------
						      ^^^^^

I have worried about the inclusion of the 'matrix300' code in the SPEC
suite.  The calculation is so simple mathematically that special
compiler techniques can greatly enhance its performance without
necessarily helping the performance of more general codes.

In this case, the LINPACK routines SGEMV, SGEMM, and SAXPY are used.
It is well known that SGEMM can show very large improvements from
hand-coding (I get 33 MFLOPS vs. 6 MFLOPS on my IBM RS/6000-320), so
an "SGEMM-recognizer" could considerably undermine the usefulness of
this benchmark.  Note that the new HP9000/730 gets a score of 273 on
this test, which raises its SPEC floating-point rating considerably!
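
To see how much a single outlier can move a summary figure, here is a
rough sketch that takes the geometric mean of the ten ratios in the
table above, with and without the matrix300 column.  It assumes the
summary rating is a plain geometric mean of the per-benchmark ratios;
the names `ratios`, `MATRIX300`, `geometric_mean` and `summarize` are
made up for this illustration and are not part of any SPEC tool.

```python
import math

# Ratios copied from the table above, in column order:
# gcc, espresso, li, eqntott, spice, doduc, nasa7, matrix300, fpppp, tomcatv
ratios = {
    "Mips 6280": [46.0, 42.4, 54.6, 41.2, 38.4, 43.0, 45.6, 49.8, 55.6, 43.3],
    "CDC 4680":  [46.0, 42.4, 54.6, 41.2, 40.3, 44.0, 62.4, 181.7, 56.5, 57.5],
}

MATRIX300 = 7  # zero-based position of the matrix300 column (illustrative name)


def geometric_mean(values):
    """Geometric mean, computed via logs to avoid a large product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(values):
    """Return (mean over all ten codes, mean with matrix300 left out)."""
    without = values[:MATRIX300] + values[MATRIX300 + 1:]
    return geometric_mean(values), geometric_mean(without)


if __name__ == "__main__":
    for machine, values in ratios.items():
        with_m300, without_m300 = summarize(values)
        print(f"{machine}: all ten = {with_m300:.1f}, "
              f"without matrix300 = {without_m300:.1f}")
```

On these numbers the CDC summary drops by several points when
matrix300 is left out, while the Mips figure hardly moves.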

This is not to suggest that Kuck & Associates used such a recognizer,
but the block-mode approach that is so helpful for matrix operations
is of much more limited utility for more general array operations.
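
For readers who have not seen the block-mode idea, a minimal sketch of
it next to the plain triple loop follows.  This is only an
illustration of the loop restructuring, not the Kuck & Associates
transformation or anyone's tuned SGEMM; the function names and the
default block size are assumptions chosen for the example.

```python
def matmul_naive(a, b):
    """Plain triple loop: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, block=4):
    """Same product, computed one block-by-block tile at a time.

    Each tile of A and B is reused for a whole tile of C before the
    loops move on, which is what keeps the working set small on a
    machine with a cache.  `block` is an illustrative tuning knob.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                for i in range(ii, min(ii + block, n)):
                    row_c = c[i]
                    for k in range(kk, min(kk + block, m)):
                        aik = a[i][k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

In an interpreted language the blocked version will not run faster;
the point is only that the restructuring depends on the regular
access pattern of a matrix product, which more general array codes
often lack.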

P.S. Tell us more about the multiprocessor CDC 4680!!!!
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET