Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!gem.mps.ohio-state.edu!apple!mips!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: RISC vs CISC (rational discussion, not religious wars)
Keywords: Die Space
Message-ID: <31097@winchester.mips.COM>
Date: 9 Nov 89 16:49:47 GMT
References: <503@ctycal.UUCP> <15126@haddock.ima.isc.com> <28942@shemp.CS.UCLA.EDU>
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Inc.
Lines: 54

In article <28942@shemp.CS.UCLA.EDU> frazier@oahu.UUCP (Greg Frazier) writes:
...
>continuing speed advantages).  It seems to me that the
>real issue is what would the extra die space be used for.
>With a deeper pipeline, one could use additional gates
>without slowing the clock down.  With a CISCy enough
>CISC, one might be able to keep the pipeline full.  So,
>if we were to double the die size tomorrow, what would
>go on the chip?  Just to throw sand in our eyes, why not
>put 2 RISCS on the chip?  Big research area - should the

I don't think we're anywhere near this yet, and this can be seen by
analyzing the layout and nature of million-transistor chips [like i860s].
If you look at the i860 die, you find that:

a) Most of the transistors are in the caches.

b) Most of the space is the FPU, registers, integer datapath, etc.
Some of this stuff is wires, and it doesn't shrink as well as
transistors do.

c) At the top speed claimed for it, eventually [50MHz], 12KB of cache
is NOWHERE near big enough for efficiency, by itself.
	1) As the CPU gets faster, the cache miss cost goes up, and the
	cache miss ratio must go down enough to maintain a constant
	amount of memory-system degradation.
	2) Although 8-16K of cache on a million-transistor chip is
	certainly useful, serious cache simulations say that it just
	isn't enough for a well-balanced machine at the higher clock
	rates [50MHz or so] that one would naturally use with the kind
	of technology that gives you a million transistors.
	3) Thus, you still end up with secondary cache being needed
	in many configurations and application environments.

So that says that when you get up to 4M transistors, maybe you get
close to having big enough caches on the chip to balance the CPU+FPU
that are there....except that now you'll want to boost the clock rate
some more, which means the caches are not as improved as you'd think
[although getting close].  Well, maybe if somebody wants to build
100MHz parts, with about 8M transistors, 128K caches, that's a sort-of
balanced thing.

Maybe at the 16M-transistor point, if you still can't think of anything
else to do with more silicon [and note that the current
million-transistor chips on the market or coming soon have not run out
of interesting things to do with more silicon], you put 2 CPUs on one
chip, if you can figure out a sensible cache hierarchy, and a package
with few enough pins that people can use, because, as usual, the issue
is not so much in making the CPUs run fast, it's getting the data in
and out, and packaging technology will be "interesting".

Of course, some of the numbers change if you build chips with different
mixtures.  Specifically, if you didn't care about FP, you could omit
the FPU, which is inherently a big space hog.  If you didn't need an
MMU, that would save space also.  However, I think this only moves the
potential switch point from 1 CPU to 2 around a little.
-- 
-john mashey	DISCLAIMER:
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
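
The miss-cost argument in points 1)-3) can be sketched as a
back-of-the-envelope calculation.  This is an illustrative sketch only:
the 200ns memory access time and the 4%/2% miss ratios are assumed
numbers for the sake of the example, not figures from the article or
from any real i860/MIPS part.

```python
# Sketch of the cache-balance argument: memory-system degradation is
# (miss ratio) x (miss penalty in cycles).  The DRAM access time is
# fixed in nanoseconds, so doubling the clock doubles the penalty
# measured in CPU cycles -- and the miss ratio (i.e., a bigger cache)
# must halve to keep the degradation constant.

def degradation(miss_ratio, miss_penalty_cycles):
    """Average cycles lost to cache misses per memory reference."""
    return miss_ratio * miss_penalty_cycles

DRAM_NS = 200.0  # assumed main-memory access time, in nanoseconds

for clock_mhz, miss_ratio in [(25, 0.04), (50, 0.04), (50, 0.02)]:
    cycle_ns = 1000.0 / clock_mhz          # one CPU cycle, in ns
    penalty = DRAM_NS / cycle_ns           # miss penalty, in cycles
    lost = degradation(miss_ratio, penalty)
    print(f"{clock_mhz} MHz, miss ratio {miss_ratio:.0%}: "
          f"{lost:.2f} cycles lost per reference")
```

With these assumed numbers, going from 25MHz to 50MHz doubles the miss
penalty from 5 cycles to 10, so the 4% miss ratio that cost 0.2
cycles/reference at 25MHz costs 0.4 at 50MHz; only by cutting the miss
ratio to 2% (a substantially larger cache) does the degradation return
to 0.2.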