Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!mailrus!ames!pasteur!aoki@faerie.Berkeley.EDU
From: aoki@faerie.Berkeley.EDU (Paul M. Aoki)
Newsgroups: comp.arch
Subject: Re: SPARC vs. MIPS on gcc
Message-ID: <8413@pasteur.Berkeley.EDU>
Date: 18 Dec 88 13:15:59 GMT
References: <82150@sun.uucp> <6476@killer.DALLAS.TX.US>
Sender: news@pasteur.Berkeley.EDU
Reply-To: aoki@postgres.Berkeley.EDU (Paul M. Aoki)
Organization: Postgres Research Group, UC Berkeley
Lines: 85

In article <6476@killer.DALLAS.TX.US> elg@killer.DALLAS.TX.US (Eric Green) writes:
>in article <82150@sun.uucp>, edkelly%aisling@Sun.COM (Ed Kelly) says:
>> For the comparison we chose a large portable C program (the GNU C Compiler rev
>> 1.24)
>Step 1: choose a program. Fine. You did that right. 

Is it necessarily right?  How about "Step 1: choose a large number of common 
integer and floating point programs"?

>> If you are interested in architecture and wish to avoid the 
>> confusion of implementation details these are the numbers of most
>> interest. 
>OK, so you captured dynamic trace statistics. So what. Lower number of
>instructions executed doesn't necessarily mean faster execution, or
>else the Vax 780 would be the world's fastest machine ;-).

Hey, I get to pull out those notes from Patterson's class again!
(Actually I'm pulling this out of [cache?] memory.)

[ CR = clock rate (cycles/sec), IC = # inst (/prog), CPI = cycles/inst, 
  P = "performance" (prog/sec) ]

"Performance" is CR/(CPI * IC).  An 11/780 may have a lower IC but the CR/CPI 
isn't at all comparable to a Sun4 or M/1000.  On the other hand, if two 
machines have similar CR/CPI figures (as these two do) the machine that 
executes the fewest instructions wins (until the technology changes again).

So IC really does matter here, and it will continue to matter a lot as
long as the CR/CPIs are comparable.
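
A minimal sketch of that arithmetic in Python.  Only the formula
P = CR/(CPI * IC) comes from the notes above; the clock rates, CPIs and
instruction counts are invented for illustration and do not describe the
11/780, the Sun4, the M/1000 or any other real machine.

```python
def performance(clock_rate_hz, cpi, inst_count):
    """Programs per second: P = CR / (CPI * IC)."""
    return clock_rate_hz / (cpi * inst_count)

# Hypothetical machines; every number here is made up.
# A: fewest instructions, but a poor CR/CPI.
a = performance(clock_rate_hz=5e6, cpi=10.0, inst_count=50e6)
# B: twice the instructions of A, but a much better CR/CPI.
b = performance(clock_rate_hz=16e6, cpi=1.6, inst_count=100e6)
# C: same CR/CPI as B, 20% fewer instructions than B.
c = performance(clock_rate_hz=16e6, cpi=1.6, inst_count=80e6)

for name, p in (("A", a), ("B", b), ("C", c)):
    print(f"machine {name}: {p:.3f} programs/sec")
```

With these made-up inputs A loses to B despite executing half as many
instructions, while C beats B purely on IC because their CR/CPI is equal.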

Got that?  There will be a quiz at the end of this posting...

>> MIPS compiler is not significantly better than the current SPARC compiler. 
>> Considering the bad press, I will admit I was surprised by this
>> myself. 
>Doesn't surprise me too greatly.

Well, here are some more sample dynamic instruction counts from pixie 
and spixstats, in millions:

Machine:         Sun4    M/1000
Opt Level:       -O4     -O3

bison            28.5    21.9
cc1 (gcc-1.30)   10.9    12.0    [ -O2 for mips, uld dumped core at -O3 ]
compress         197     202     [ two loops ]
gnu diff         30.3    102     [ bug in mips cc ]
gnu egrep        3.3     5.1     [ one loop, difference is all nops, addr calc ]
gnu awk-1.1      28      27      [ weird code, both optimizers had a hard time ]
TimberWolf3.3    230     175
doduc            366     287     [ sun does lots of extra s<->d prec conversion ]

So it can go both ways, for both compiler and ISA reasons.  I have my 
own opinions about the compilers from looking at assembly code but 
I'll let qualified people pass official judgment on them.  [ I'm
in enough trouble, grad students aren't supposed to have opinions in 
the first place :-) ]

I find it hard to argue that SPARC is better architecturally because 
it executes fewer instructions -- it really isn't always true, and 
sometimes it REALLY isn't always true.  I mean -- sweeping generalities
based on a sample of one?
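
To put a number on "it can go both ways", here is a rough Python sketch that
computes the Sun4/M1000 instruction-count ratio for each program.  The counts
are copied from the table above; the names `counts` and `ratios` are mine.
Bear in mind the bracketed caveats there (cc1 was built at -O2 on the MIPS
side, and gnu diff hit a mips cc bug), so read this as a tally, not a verdict.

```python
# Dynamic instruction counts in millions: (Sun4, M/1000).
counts = {
    "bison":          (28.5, 21.9),
    "cc1 (gcc-1.30)": (10.9, 12.0),
    "compress":       (197, 202),
    "gnu diff":       (30.3, 102),
    "gnu egrep":      (3.3, 5.1),
    "gnu awk-1.1":    (28, 27),
    "TimberWolf3.3":  (230, 175),
    "doduc":          (366, 287),
}

def ratios(table):
    """Map program name -> Sun4 count divided by M/1000 count."""
    return {name: sun / mips for name, (sun, mips) in table.items()}

r = ratios(counts)
# Ratio below 1.0 means the Sun4 executed fewer instructions.
sparc_fewer = sorted(name for name, x in r.items() if x < 1.0)
mips_fewer = sorted(name for name, x in r.items() if x > 1.0)

for name, x in r.items():
    print(f"{name:16s} Sun4/M1000 = {x:.2f}")
print(f"Sun4 fewer: {len(sparc_fewer)}, M/1000 fewer: {len(mips_fewer)}")
```

On these eight programs the tally is four apiece, which is the point: a
single benchmark could have been picked to show either result.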

>				  The register windows compensate quite
>well for outdated compiler technology, which is why the UCB guys used
>them in the first place (so they could re-target PCC, instead of
>having to dig up some compiler guys to do a moby optimizing hack).

Well, he wasn't *just* talking about loads and stores...

>> The opinions here are my own and do not necessarily represent those of
>> Sun Microsystems.
>Are you sure?
>I mean, it sounded a lot like a product of the Sun Microsystems PR
>department! (except that they would not be so clumsy about it, of
>course). 

Sigh.  Looks like the RISC wars really are on again, bigger and badder 
than ever ... 

[ OK, so I lied about the quiz. ]
----------------
Paul M. Aoki
CS Division, Dept. of EECS // UCB // Berkeley, CA 94720		(415) 642-1863
aoki@postgres.Berkeley.EDU					...!ucbvax!aoki