Path: utzoo!attcan!uunet!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga.hardware
Subject: Re: RISC Amiga WHat's RISC?
Message-ID: <15798@cbmvax.commodore.com>
Date: 12 Nov 90 16:16:32 GMT
References: <1178@iceman.jcu.oz> <39367@ut-emx.uucp> <15759@cbmvax.commodore.com>
Reply-To: daveh@cbmvax.commodore.com (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 78

In article yorkw@pasture.ecn.purdue.edu (Willis F York) writes:
>Well Here's a Novice Question.
>I know RISC means Reduced INstruction Set Computer (Or Close)
>But what does this mean?
>The CPU chip has Fewer Commands (oops Instructions) that it knows how
>to run? What's the big deal about that? OR an it to those few REAL fast?

Lots of folks argue about what RISC is, mainly those with nothing better
to do. RISC really isn't any one thing; it's kind of a bannerhead for a
set of tools you might term "the late 80s approach to microprocessor
design". This toolkit takes into account many ideals, but the main point
seems to be that one should be processing at least one instruction every
clock cycle. To this end, you have:

- Simplified Instruction Set

This doesn't always mean that the instructions do less work than those
of a CISC machine. For example, most RISC chips have three-operand
arithmetic functions, versus the two-operand equivalents in chips like
the 68030. The basic idea is that the instruction set should be very
orthogonal, stick mainly to simple instructions that can be executed in
a single cycle, and not support zillions of different addressing modes.
Once you have such an instruction set, the computer design is simpler
and the set can be implemented fully hardwired, rather than via
microcode as in the 68030.

- Load/Store Architecture

Another RISC tenet is that touching main memory is the only operation
likely to take more than one clock cycle.
So you isolate that operation -- only load and store instructions are
capable of touching memory; all others work between registers. Along
with this, you'll find that RISC chips tend to have more registers, from
32 up to a hundred or more.

- Pipelining

In order to keep roughly one instruction per clock running, RISC designs
tend to be heavily pipelined. You have several stages of execution for
each instruction. So, while a single instruction may actually take 6
clock cycles to pass through the whole machine pipeline, there should be
one instruction in each of these 6 slots at all times, therefore
yielding an effective 1 clock/instruction. Any code that takes more than
this, such as a load or store, will cause a pipeline stall, an unused
slot or two in the pipeline.

The long pipeline causes some strange coding practices. On many RISC
chips, even accessing the same register in consecutive instructions will
cause a pipeline stall, since the register hasn't quite been written by
the first instruction (I0) by the time the second (I1) needs to read it.
So smart compilers are called for, which can manage register
allocations.

There are a few more concepts, but these are the basic ones. Most of the
ideas that are in today's RISC devices come, either directly or a little
roundabout, from the supercomputer work done by folks like Cray. And
many of the same reasons are present, only at the chip level. Optimizing
the size of the CPU design means many fewer gates. So you can use a
faster process technology, if available, than the folks building CISCs.
Or you can add much more on-chip cache in the same process technology.
Or you can make the thing yield like crazy and drop the price relative
to a CISC device.

There really isn't anything in RISC you can't apply to CISC, for the
right price. The 68040 is a good example. The most common 680x0
instructions in that device are hardwired rather than microcoded.
It has a deep pipeline, with even a few innovations over most RISCs (for
example, address register increments and decrements, as well as offsets,
get resolved in their own pipeline stage with their own ALU, so these
addressing modes don't add time to the instruction execution). The 68040
also has a large on-chip cache, which, unlike the caches in most other
chips, responds in a single cycle for cache hits, making it nearly as
fast as register access.

On the downside, all this extra logic has made the 68040 take up 1.2
million devices in a 0.8 micron CMOS process. You aren't going to see
this move into GaAs or ECL any time soon, whereas the MIPS folks already
have an ECL version of their MIPS architecture, the R6000. And while
68040s will most likely be in the several-hundred-dollar range for some
time, you can get RISC chips at or near the same performance level for
$100, maybe even a little less.

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	Standing on the shoulders of giants leaves me cold	-REM
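
The pipelining arithmetic described above (a 6-stage pipeline approaching 1 clock/instruction, with stalls adding unused slots) can be roughed out in a few lines of Python. This is only a back-of-the-envelope sketch, not a model of any real chip: the function names, the 1000-instruction stream, and the figure of 100 one-cycle stalls are all made up for illustration. Only the 6-stage depth comes from the post.

```python
def total_cycles(num_instructions, pipeline_depth, stall_cycles=0):
    """Cycles for a straight-line instruction stream in an idealized pipeline.

    The first instruction needs `pipeline_depth` cycles to pass through
    every stage. After that, one instruction completes per cycle, and
    every stall adds one unused slot (one extra cycle).
    """
    if num_instructions == 0:
        return 0
    return pipeline_depth + (num_instructions - 1) + stall_cycles


def clocks_per_instruction(num_instructions, pipeline_depth, stall_cycles=0):
    """Effective clocks per instruction for the same idealized stream."""
    cycles = total_cycles(num_instructions, pipeline_depth, stall_cycles)
    return cycles / num_instructions


if __name__ == "__main__":
    DEPTH = 6        # pipeline depth used as the example in the post
    N = 1000         # hypothetical instruction count

    # No stalls: the 6-cycle latency is paid once, then 1 per clock.
    print(total_cycles(N, DEPTH), clocks_per_instruction(N, DEPTH))

    # Hypothetical: 100 of the instructions each cost one stall cycle.
    print(total_cycles(N, DEPTH, stall_cycles=100),
          clocks_per_instruction(N, DEPTH, stall_cycles=100))
```

With no stalls, the 6-cycle latency is paid only once, so the effective rate gets closer to 1 clock/instruction as the stream gets longer. Each stall adds directly to the total, which is why the compiler's scheduling of loads, stores, and back-to-back register use matters so much.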