Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!iuvax!bobmon
From: bobmon@iuvax.cs.indiana.edu (RAMontante)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: RISC vs CISC
Message-ID: <30203@iuvax.cs.indiana.edu>
Date: 22 Nov 89 16:35:49 GMT
Reply-To: bobmon@iuvax.cs.indiana.edu (RAMontante)
Distribution: comp.binaries.ibm.pc.d
Organization: malkaryotic
Lines: 34

bwwilson@lion.waterloo.edu (Bruce Wilson) <18323@watdragon.waterloo.edu> :
-
-If the RISC instructions are twice as fast but require twice as
-many, what is gained?

The speedup comes from (at least) three things:

Number of cycles used to perform a function.
Years ago it was noticed on some VAXen that some operations which had their own complex instruction for just that purpose could nonetheless be done faster in a subroutine that used only simple instructions (I don't remember the details of the instruction). The simple instructions were faster, because they needed fewer clock cycles to decode and execute. The number of clock cycles needed to set up the complex instruction outweighed the number used in subroutine overhead.... Since the specialized instructions aren't commonly used anyway, you can make a strong case for keeping them out of the cpu (which simplifies its design enormously) and putting lots of sweat into doing them right in a coprocessor if they're a really good idea (e.g. trig functions).

Clock rate.
The simpler cpu design allows circuits to be packed into a smaller area. This means that signals don't have to travel as far --- one of the approaching hard limits on cpu speed is the "speed of light", i.e. signal propagation in a wire. Also, simpler instructions can be decoded and executed using fewer circuit elements (gates). Since each gate involves a gate delay, and the clock period must be long enough to span the longest chain of gate delays, reducing these allows a faster clock.

Cache, pipeline, etc.
Since the cpu design is *smaller*, there is more real estate available on the chip for laying out cache memory, pipelines, and so forth, all of which speed things up. Another choice made by one design is to put in *lots* of registers. When the processor context-switches, it doesn't need to save the register state on a stack in memory; it just moves on to a different set of registers on the chip.
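
To put rough numbers on the original question, the usual back-of-the-envelope relation is run time = instruction count x cycles per instruction x clock period. The sketch below uses that relation; every figure in it is made up purely for illustration and is not a measurement of any real machine, and the function name is my own.

```python
def exec_time_ns(instructions, cycles_per_instruction, clock_period_ns):
    """Run time in nanoseconds: instruction count x CPI x clock period."""
    return instructions * cycles_per_instruction * clock_period_ns

# Hypothetical CISC machine: fewer instructions, but many cycles each.
cisc_ns = exec_time_ns(instructions=1_000_000,
                       cycles_per_instruction=6,
                       clock_period_ns=50)

# The premise of the question: twice the instructions, each taking half
# as many cycles, same clock. This comes out even, so nothing is gained.
break_even_ns = exec_time_ns(instructions=2_000_000,
                             cycles_per_instruction=3,
                             clock_period_ns=50)

# Hypothetical RISC machine: twice the instructions, but the cycle count
# per instruction drops by more than half AND the clock period is shorter.
risc_ns = exec_time_ns(instructions=2_000_000,
                       cycles_per_instruction=1.5,
                       clock_period_ns=40)

speedup = cisc_ns / risc_ns

print(f"CISC:       {cisc_ns / 1e6:.1f} ms")
print(f"break-even: {break_even_ns / 1e6:.1f} ms")
print(f"RISC:       {risc_ns / 1e6:.1f} ms  (speedup {speedup:.2f}x)")
```

With these invented figures the "twice as fast, twice as many" case is a wash, and the gain only shows up once the cycle count and the clock period both improve together, which is the point of the first two items above.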