Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!apple!oliveb!mipos3!omepd!mipon2.intel.com!mcg
From: mcg@mipon2.intel.com (Steven McGeady)
Newsgroups: comp.arch
Subject: Re: 55 MIPS & 66 MIPS
Message-ID: <5278@omepd.UUCP>
Date: 28 Nov 89 03:31:37 GMT
References: <22514@gryphon.COM> <1358@bnr-rsc.UUCP> <31329@winchester.mips.COM> <22303@gryphon.COM> <3024@brazos.Rice.edu>
Sender: news@omepd.UUCP
Reply-To: mcg@mipon2.intel.com (Steven McGeady)
Lines: 66

In article <22514@gryphon.COM>, scarter@gryphon.COM (Scott Carter) writes:
>
> 2) I'm not sure that any meaningful extrapolation can be made from the 860 to
> the 960CA, given that their instruction parallelism mechanisms are utterly
> different.  Comparison to something like the Super Titan (on integer codes)
> would be rather more appropriate.

No meaningful comparison is possible here.  The 860 is a floating-point near-VLIW
processor; the 960 is an integer superscalar embedded processor.  The 860
achieves parallelism between floating-point and integer operations using
parallel pipelines; the 960 achieves parallelism between integer and memory
operations by using parallel instruction dispatch.

> claim 66 Native Mips is not a priori any more illegitimate than most other
> vendors native MIPS claims.

In technical forums, I have always been careful to distinguish the cases where
the 960CA could be expected to run at this rate.

> 4) I would disagree about the Mips _Ada_ compiler being better than the
> Intel/Biin 960 Ada compiler (agree wholeheartedly on C/Pascal/FORTRAN). 

While the original MIPS/Verdix Ada compiler was not up to snuff with their
C technology, it was still reasonably good.  MIPS has released new numbers
(the ones that Mr. Hawkes referred to) based on a new release of their
compiler.

> We found that the performance ratio between the R3000 and the 960XA was much
> wider on [somewhat larger than JIAWG] our own benchmarks in C, Pascal, and
> FORTRAN than in Ada, either JIAWG or some other internal benchmarks.

As I mentioned in a previous article, this ignores the following facts:

	1) the 960MC/960XA is the original silicon generation of the 960
	   architecture, and is wholly unrelated to the 960CA -- you can
	   expect us to apply the CA's superscalar techniques to other levels
	   of the architecture, but we're not yet saying when.

	2) the benchmarks were run on systems that are in no way comparable:
	   a PC plug-in board (or possibly the execrable Multibus-I EXV board,
	   or the 16MHz BiiN systems), versus the MIPS systems with large
	   caches.

	3) The current compiler does not attempt any CA parallel-dispatch
	   optimizations.  The 960CA was released with working silicon, but
	   unfortunately, the compilers are a little behind.

> 5) Based on the code generated for the 960XA for the JIAWG benchmarks, I have
> to say I can't believe in two instructions per clock for the 960CA on this
> set (this is a GUESS only - any data I might have cannot be posted),

As stated in other articles, I would be astonished if you got a sustained rate
of two instructions per clock over the balance of a large benchmark.
Parallel instruction dispatch is much more complicated than this - the idea is
to reduce the overall latency of instructions.  I have noted several times that
we expect that parallel instruction dispatch will allow us to bring our
cycles per instruction down to very close to 1 in this generation of chips,
which is substantially better than most other architectures when you consider
that 960 code is 20-30% denser than that of comparable RISCs.
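
The arithmetic behind these figures can be sketched roughly as follows.  This
is an illustration only, not Intel's methodology.  The 33 MHz clock is an
assumption (the post does not state a clock rate), and reading "20-30%
denser" as "20-30% fewer instructions executed" is also an assumption, since
density may refer only to static code size.  The function names are
hypothetical.

```python
def native_mips(clock_mhz, cpi):
    """Native MIPS = clock rate in MHz divided by cycles per instruction."""
    return clock_mhz / cpi


def equivalent_mips(clock_mhz, cpi, instr_ratio):
    """Native MIPS rescaled to a reference machine's instruction count.

    instr_ratio is (instructions this machine executes) divided by
    (instructions the reference machine executes) for the same program.
    A value of 0.8 means 20% fewer instructions (ASSUMED reading of "denser").
    """
    return native_mips(clock_mhz, cpi) / instr_ratio


CLOCK_MHZ = 33.0  # ASSUMPTION: clock rate is not given in the post

# Peak claim: two instructions per clock, i.e. CPI = 0.5.
peak = native_mips(CLOCK_MHZ, 0.5)

# Sustained expectation from the post: CPI very close to 1.
sustained = native_mips(CLOCK_MHZ, 1.0)

# Same sustained rate, credited for 20% and 30% fewer instructions.
low = equivalent_mips(CLOCK_MHZ, 1.0, 0.8)
high = equivalent_mips(CLOCK_MHZ, 1.0, 0.7)

print(f"peak native MIPS:      {peak:.1f}")
print(f"sustained native MIPS: {sustained:.1f}")
print(f"density-adjusted:      {low:.1f} to {high:.1f}")
```

Under these assumptions, the point is only that a CPI near 1 is worth more
when each instruction does more work; the actual numbers depend on the clock
rate and on how density translates into executed instruction counts.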

> 6) If we need to express our religious loyalty, mine is with the R3000.

No surprise here - I'll leave my loyalty as an exercise to the reader.

S. McGeady
Intel Corp.

