Path: utzoo!mnetor!uunet!husc6!cmcl2!nrl-cmf!mailrus!tut.cis.ohio-state.edu!bloom-beacon!mit-eddie!bbn!rochester!cornell!batcomputer!itsgw!imagine!pawl22.pawl.rpi.edu!jesup
From: jesup@pawl22.pawl.rpi.edu (Randell E. Jesup)
Newsgroups: comp.arch
Subject: Re: 16 & 32 bit vs 32 bit only instructions for RISC.
Message-ID: <485@imagine.PAWL.RPI.EDU>
Date: 7 Mar 88 06:46:12 GMT
References: <2574@im4u.UUCP> <9740@steinmetz.steinmetz.UUCP> <7538@apple.Apple.Com> <1757@mips.mips.COM>
Sender: news@imagine.PAWL.RPI.EDU
Reply-To: beowulf!lunge!jesup@steinmetz.UUCP
Organization: RPI Public Access Workstation Lab - Troy, NY
Lines: 68

In article <1757@mips.mips.COM> hansen@mips.COM (Craig Hansen) writes:
> Instruction bandwidth is
>important, but not so important that you should go back to compacted
>instructions. 32-bit instructions aren't much larger than 16-bit
>instructions, particularly when a register-allocating compiler is
>used, and the benefit to permitting parallel decoding of instructions
>with register fetching is a tremendous win.

        Who said we were going back to compacted instructions to get to
16 bits?  What we have is a lot FEWER instructions, and a minimal set of
formats for instructions.  With 32 bit instructions, we could have had
fewer formats (2 or 3 instead of 5 or 6), but since we can do a decode in
a single pipe-stage, what does it matter?  The decoder does not determine
the critical path and cycle time, the ALU does.  If the decoder had slowed
us up, we would have made it faster and/or reduced the number of formats.
(Keeping the number of formats down and alignment of fields did play a
role in our architecture design.)

        I don't see how having a register allocating compiler affects
instruction size.

        Concerning parallel decode with register fetch, is this anything
unusual?  Our pipeline looks like this:

        Instruction fetch - doesn't really exist per se.
        Instruction Decode/register fetch
        ALU operation
        WriteBack - write ALU result to register file
        [ I'm ignoring the extra load stages here ]

        I don't see what we're losing here.

>Generally, optimized MIPS code about 10% to 50% larger than "optimized"
>VAX code, as generated by 4.3 UNIX, and is often equal or smaller in
>size than optimized 68k code, as generated by Sun compilers.
 [ many figures deleted ]
>...those big 32-bit instructions don't look so bad next to
>the machines design for compact encodings...

        68020?  compact?  Surely you jest! :-)  I know, it actually is
fairly compact, at least the 68000 part of it.  It just has SO many
instructions and addressing modes, it ends up larger than one would
suspect.

        The proper comparison is not to CISCs, but to a 16-bit version of
the same general architecture, or at least the same class (RISCs).  I
agree that if cost is no object, a 32-bit RISC can probably run faster
(effective throughput (VIPS), not MIPS) than a 16-bit one.  However, the
costs mentioned include a higher-bandwidth bus, more disk space for code,
more memory space for code, larger (expensive) caches, more power draw
(more pins being driven), etc, etc.  The typical current solution for the
bus bandwidth problem is to throw MUCH bigger caches onto the CPU board,
to try to increase hit rates, and reduce bandwidth required of the bus.

>Craig Hansen
>Manager, Architecture Development
>MIPS Computer Systems, Inc.

        Glad to see you in the conversation.  I'm interested in hearing
your opinions.

     // Randell Jesup                         Lunge Software Development
    //  Dedicated Amiga Programmer            13 Frear Ave, Troy, NY 12180
 \\//   beowulf!lunge!jesup@steinmetz.UUCP    (518) 272-2942
  \/    (uunet!steinmetz!beowulf!lunge!jesup) BIX: rjesup
(-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)
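
To put rough numbers on the bus-bandwidth cost mentioned in the post, here is a minimal sketch, not a model of any real chip. It assumes one fixed-width instruction fetched per executed instruction and no instruction cache, and it borrows the 40 MIPS figure from the signature line. The function name and the count_ratio parameter (how many more instructions a narrower encoding might need for the same work) are hypothetical, and the 1.3 value is purely illustrative, not a measurement.

```python
def fetch_bandwidth_mb_per_s(mips, instr_bits, count_ratio=1.0):
    """Uncached instruction-fetch bandwidth in MB/s (1 MB = 10**6 bytes).

    mips        -- millions of instructions per second at the baseline encoding
    instr_bits  -- fixed instruction width in bits
    count_ratio -- hypothetical factor for the extra instructions a narrower
                   encoding needs to get the same work done in the same time
    """
    bytes_per_instr = instr_bits / 8
    return mips * count_ratio * bytes_per_instr


if __name__ == "__main__":
    rate = 40  # MIPS, taken from the RPM40 figure in the signature

    wide = fetch_bandwidth_mb_per_s(rate, 32)
    narrow = fetch_bandwidth_mb_per_s(rate, 16)
    # Illustrative assumption: the 16-bit encoding needs 30% more instructions.
    narrow_padded = fetch_bandwidth_mb_per_s(rate, 16, count_ratio=1.3)

    print(f"32-bit instructions:        {wide:.0f} MB/s")
    print(f"16-bit instructions:        {narrow:.0f} MB/s")
    print(f"16-bit, 1.3x instructions:  {narrow_padded:.0f} MB/s")
```

Under these assumptions the narrower encoding asks less of the bus until count_ratio reaches 2.0; where the real ratio falls for a given architecture and compiler is exactly what the thread is arguing about.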