Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!lll-tis!ames!lamaster
From: lamaster@ames.arpa (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Gate counts for implementations of architectures
Message-ID: <3793@ames.arpa>
Date: 30 Dec 87 21:07:55 GMT
Reply-To: lamaster@ames.UUCP (Hugh LaMaster)
Organization: NASA Ames Research Center, Moffett Field, Calif.
Lines: 40
Keywords: RISC, gates, instruction set

One question that has not come up much in the RISC discussion is how
much chip real estate must be devoted to implementing the instructions
in a given architecture. The question of critical paths for branch
instructions and addressing modes is certainly significant, but chip
area is a somewhat different question. Initial enthusiasm aside, the
RISC question is largely one of figuring out which instructions, and
which hardware realizations of them, give the most speed for a set of
applications.

A recent Computer magazine had a list which showed some GaAs processors
with very fast clocks but very small gate counts. This is an extreme
example of the question that has always come up when building fast
machines: is it worth it to add a particular function? For example,
various Cray CPUs have had about 500K gates in them: this includes
hardware integer add, multiply, divide, floating add, multiply,
reciprocal approximation (fully segmented), and shift/mask instructions.
The Cyber 205 has about twice as many gates - and has a slower clock
speed. Are the extra gates worth it on the Cyber 205, even at the cost
of a slower clock (note: it is easy to find applications which make
either machine look faster)?

If less is more (RISC), how about 10K gates maximum, if that is what
can be put on a single GaAs microprocessor? Maybe I can simulate
floating point with integer arithmetic faster than I could run it on
special f.p. hardware, if by skipping the f.p. hardware I can put the
processor on one very fast single chip.

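
To make the "simulate floating point with integer arithmetic" idea
concrete, here is a minimal sketch (in Python, purely for illustration)
of a single-precision multiply built from nothing but integer masks,
shifts, adds, compares, and one integer multiply. It assumes the IEEE
754 binary32 encoding, which is my assumption and not necessarily the
format of any machine mentioned above; it handles normal numbers only
and rounds to nearest even. The function names are hypothetical.

```python
import struct


def f32_bits(x):
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """Return the float value encoded by a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def soft_mul_f32(a_bits, b_bits):
    """Multiply two binary32 bit patterns using only integer operations.

    Assumption: both inputs and the result are normal numbers
    (no zeros, subnormals, infinities, or NaNs).
    """
    sign = (a_bits ^ b_bits) & 0x80000000
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a in (0, 255) or exp_b in (0, 255):
        raise ValueError("only normal inputs are handled in this sketch")

    # Restore the hidden leading 1 to get 24-bit significands.
    sig_a = (a_bits & 0x7FFFFF) | 0x800000
    sig_b = (b_bits & 0x7FFFFF) | 0x800000

    exp = exp_a + exp_b - 127      # remove one copy of the bias
    prod = sig_a * sig_b           # 47- or 48-bit integer product

    # Normalize: the product of two values in [1, 2) lies in [1, 4).
    if prod & (1 << 47):
        shift = 24
        exp += 1
    else:
        shift = 23

    sig = prod >> shift                      # top 24 bits
    rem = prod & ((1 << shift) - 1)          # discarded low bits
    half = 1 << (shift - 1)

    # Round to nearest, ties to even.
    if rem > half or (rem == half and (sig & 1)):
        sig += 1
        if sig == (1 << 24):                 # rounding carried out
            sig >>= 1
            exp += 1

    if not 0 < exp < 255:
        raise OverflowError("result is outside the normal range")

    return sign | (exp << 23) | (sig & 0x7FFFFF)


if __name__ == "__main__":
    result = soft_mul_f32(f32_bits(1.5), f32_bits(2.0))
    print(bits_f32(result))  # 3.0
```

Even this stripped-down version needs an integer multiply plus a
handful of shifts, masks, and compares per operation, and a full
implementation would add the special cases left out here. That is the
cost to weigh against the gates a hardware f.p. unit would consume.
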
Question: What are the gate counts for various implementations of the
same architecture, and of different architectures? (It would be
illuminating to compare a 360/50 with a 360/91, for example - same
architecture, but one processor pipelined, with fast floating point.)
What instructions increase gate count inordinately? Are there
particular "bad guy" instructions which take up a lot of space (I mean
besides floating point instructions, an issue in itself...)? Would
anyone from Sun, MIPS, or AMD care to comment on how many gates there
are in their processors - and their floating point coprocessors? And
what about the anti-RISC argument which says that microcoded machines
are more efficient with chip area because they have less random logic?
(Not all gates are equal: physically regular gates are more equal than
random gates.)