Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!unmvax!pprg.unm.edu!hc!lanl!jlg
From: jlg@lanl.gov (Jim Giles)
Newsgroups: comp.arch
Subject: Re: Compiling - RISC vs. CISC
Message-ID: <13980@lanl.gov>
Date: 11 Jul 89 06:02:07 GMT
References: <2190@oakhill.UUCP>
Organization: Los Alamos National Laboratory
Lines: 84

From article <2190@oakhill.UUCP>, by davet@oakhill.UUCP (David Trissel):
> In article <13976@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> [...]
>>For a RISC machine, the only hard part of the "back end" is register
>>allocation.
>
> What about the required pairing of registers for double wide operations
> such as floating-point or shifting?

In what way does a machine which requires register pairing qualify as
a RISC?  If an instruction requires 2 operands, they should be allowed
to be any two general purpose registers.  Furthermore, you are assuming
that floating point is larger than other intrinsic types.  The best
RISCs are those which have only _one_ data size.  (By the way, my model
of a reasonable RISC would be a Cray-I instruction set without vectors.
This is certainly RISCy - all data is 64 bits, all operations are reg
to reg, only one memory addressing mode, etc.)

> [...]
>>Instruction selection is fairly simple since there is
>>generally only one way to perform each intermediate code operation.
>
> This is a strange statement. Since in general terms CISC instruction sets are
> supersets of RISC models then why are the "extra" available CISC
> instructions mandated to be used by a CISC compiler? Indeed, one of the
> arguments for RISC is the elimination of "unused" instructions from the
> instruction set. Although this may bring up important architectural
> differences between RISC and CISC it has no bearing on the complexity
> of a compiler.

This is _really_ a strange statement.  Since the supposed advantage of
CISC is the richer instruction set, failure to use it would not take
advantage of the machine.
I've heard CISC designers claim that individual instructions can be
allowed to be slower than they otherwise could be in order to provide
the additional instructions.  If you are not using those extra
instructions, you might as well have a RISC which provides only the
instructions you _do_ use.  The hardware designer could then spend more
time making those work faster instead of making sure that the unused
instructions work.

So, this issue _does_ have a bearing on the complexity of the compiler.
If you are not willing to provide the sophisticated compilers required
to adequately use a machine, you have wasted money (read: design effort,
chip space, etc.) on the hardware.

>>On a pipelined machine, code ordering comes into play (at least if
>>you want optimized code).  This complicates matters, since a different
>>code ordering makes different register allocation constraints.
>>For this reason, optimizing can be difficult, even on a RISC machine.
>
> But as CISC implementations become more advanced the applicability of code
> reordering is starting to surface there as well.

_EXACTLY_!!!!!  All the optimizations required on a RISC are also
required on a CISC.  CISC just adds more complexity to the mix.

> [... example with C: *p++ ...]
>     mov.l (%an),%dn
>     add.l &4,%an
> or the faster
>     mov.l (%an)+,%dn
> The 68K requires a routine in the compiler peephole optimizer to "discover"
> and implement this optimization. But the result is a single 16-bit instruction
> which (I think) executes in a single clock on the MC68040.

Exactly my point.  There are actually several other possibilities for
instruction selection in this case.  For example, p may already be
resident in a register.  The use of the data may require it to end up
in a register.  The further use of p may require it to be left in a
register.  And so on.  Only a pretty sophisticated compiler can
determine which instructions to use in each context.
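
To make the quoted 68K case concrete, here is a minimal sketch of the
kind of peephole routine being described: it scans for the two-instruction
form and rewrites it to the postincrement form.  Everything here is an
assumption for illustration: the textual instruction format, the function
name fuse_postincrement, and the restriction to this single pattern.  A
real optimizer would also have to consider the surrounding context
mentioned above.

```python
import re

# Hypothetical textual form: one 68K-style instruction per string, in the
# assembler syntax of the quoted example.  Only long-word moves are handled.
LOAD = re.compile(r"mov\.l \((%a\d)\),(%d\d)$")
BUMP = re.compile(r"add\.l &4,(%a\d)$")


def fuse_postincrement(code):
    """Rewrite 'mov.l (%aN),%dM' + 'add.l &4,%aN' as 'mov.l (%aN)+,%dM'."""
    out = []
    i = 0
    while i < len(code):
        load = LOAD.match(code[i])
        bump = BUMP.match(code[i + 1]) if i + 1 < len(code) else None
        # Fuse only when both instructions use the same address register.
        if load and bump and load.group(1) == bump.group(1):
            out.append(f"mov.l ({load.group(1)})+,{load.group(2)}")
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out


print(fuse_postincrement(["mov.l (%a0),%d1", "add.l &4,%a0"]))
```

The sketch handles exactly one pattern; each additional alternative
(p already in a register, data needed in a register afterwards, and so
on) would need its own case.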
By contrast, there is only _one_ instruction sequence which will work
on most RISC machines: load p, load data, store data, increment p,
store p.  The 'peephole' optimizer need only discover the redundant
loads and stores to fit this sequence into context.  The instruction
scheduler can reorder the last four of these any way it likes.

Now, clearly 5 instructions may take longer than the 1 in your 68K
example.  But RISC machines are easier to pipeline, easier to speed up
the clock for, easier to provide staged functional units for, etc.
I don't know of any CISC machines with 'hardwired' instruction sets.
Microcoding slows the machine down, but is typically the only way to
fit a CISC on a chip.  All this may mean that 5 instructions on a RISC
may be _faster_ than one on a CISC.
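
As a rough illustration of "discovering the redundant loads", the sketch
below runs one forward pass over straight-line code and drops a load
when the register already holds that variable.  The tuple instruction
format, the register and variable names, and the function name
drop_redundant_loads are all hypothetical, and the sketch assumes named
variables never alias one another.  Removing redundant stores would need
a similar backward pass, which is not shown.

```python
def drop_redundant_loads(code):
    """Remove loads whose register already holds the named variable."""
    holds = {}  # register -> variable whose current value it holds
    out = []
    for ins in code:
        op = ins[0]
        if op == "load":        # ("load", reg, var): reg <- mem[var]
            _, reg, var = ins
            if holds.get(reg) == var:
                continue        # value is already in reg: drop the load
            holds[reg] = var
        elif op == "store":     # ("store", reg, var): mem[var] <- reg
            _, reg, var = ins
            # Other registers caching the old value of var are now stale.
            holds = {r: v for r, v in holds.items() if v != var}
            holds[reg] = var
        else:                   # any other op writes the register in ins[1]
            holds.pop(ins[1], None)
        out.append(ins)
    return out


# The five-instruction sequence for *p++, assuming the datum goes to x.
star_p_plus_plus = [
    ("load", "r1", "p"),       # load p
    ("loadind", "r2", "r1"),   # load data through p
    ("store", "r2", "x"),      # store data
    ("addi", "r1", 4),         # increment p
    ("store", "r1", "p"),      # store p
]

# Two such statements back to back: the second "load p" is dropped.
result = drop_redundant_loads(star_p_plus_plus + star_p_plus_plus)
print(len(result))
```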