Xref: utzoo comp.sys.next:16568 comp.arch:22261
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!kithrup!sef
From: sef@kithrup.COM (Sean Eric Fagan)
Newsgroups: comp.sys.next,comp.arch
Subject: Re: RISC vs. CISC -- SPECmarks
Message-ID: <1991Apr26.074427.4703@kithrup.COM>
Date: 26 Apr 91 07:44:27 GMT
References: <1991Apr22.044553.16805@mp.cs.niu.edu> <1991Apr24.170804.25670@kithrup.COM> <1991Apr24.181932.17810@cs.cornell.edu>
Followup-To: comp.arch
Organization: Kithrup Enterprises, Ltd.
Lines: 39

In article <1991Apr24.181932.17810@cs.cornell.edu> wayner@CS.Cornell.EDU (Peter Wayner) writes:
>In another sense, going "superscalar" is much easier with CISC
>machines. I think the Intel 486 does a PUSH instruction in one cycle.

It executes only one instruction each cycle.  How, pray tell, is that
superscalar?

Yes, the PUSH instruction can execute in one cycle (provided you are
pushing either a "general purpose" register or an immediate; most code
I've seen for the 80186 and later likes to push lots of memory locations,
in which case it takes 4 cycles, not counting memory latency).  However,
I seem to recall that there are lots of "gotcha's" in that 1 cycle.  As I
do not remember them, I shall defer trying to discuss them.

>In RISC land, this is a decrement and a load. The CISC designer just
>needs to use enough silicon to pipeline the important instructions.

In RISC land, you do a store followed by a decrement.  This will execute
in two cycles on a MIPS R3000, I believe (since the decrement executes
while the store is waiting).  On the other hand, the equivalent POP takes
4 cycles on the '486; on the R3000, the equivalent load/increment takes
(tada) 2 cycles.  Oooh.

>There is no need for complex logic to handle all the possible cases of
>two instructions coming down the pipe. The RISC designer needs to
>worry about generality.

And the CISC designer needs to try to think which sequences of
"instructions" are going to be commonly executed, and make them work fast
as a single instruction (such as PUSH).  Tell me:  would it be preferable
to have a limited PUSH instruction execute in one cycle (limited because
it can only store into a hardwired location), or to have a more general
store/arithmetic sequence execute in two cycles?

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.
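
The cycle arithmetic in the post can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the cycle counts are the from-memory figures given in the post (not checked against any datasheet), the stack is word-indexed rather than byte-addressed, and the names `CYCLES`, `Stack`, and `round_trip_cycles` are hypothetical. The sketch models PUSH as one pre-decrement-and-store step and the RISC-style sequence as a store followed by a stack-pointer decrement, shows that both leave the stack in the same state, and tallies the cost of one push/pop pair on each machine.

```python
# Cycle counts as recalled in the post; assumptions, not datasheet values.
CYCLES = {
    "486": {"push_reg": 1, "push_mem": 4, "pop": 4},
    "R3000": {"push": 2, "pop": 2},  # store+decrement, load+increment
}


class Stack:
    """Toy downward-growing stack (hypothetical, word-indexed)."""

    def __init__(self, size=16):
        self.mem = [0] * size
        self.sp = size  # points one past the top; grows toward index 0

    def push_single(self, value):
        # CISC-style PUSH: decrement SP and store, as one instruction.
        self.sp -= 1
        self.mem[self.sp] = value

    def push_two_step(self, value):
        # RISC-style: store below SP first, then decrement SP.
        self.mem[self.sp - 1] = value
        self.sp -= 1

    def pop(self):
        # Load from the top of the stack, then increment SP.
        value = self.mem[self.sp]
        self.sp += 1
        return value


def round_trip_cycles(cpu, operand="reg"):
    """Cycles for one push followed by one pop, using the post's figures."""
    table = CYCLES[cpu]
    if cpu == "486":
        return table["push_" + operand] + table["pop"]
    return table["push"] + table["pop"]


if __name__ == "__main__":
    a, b = Stack(), Stack()
    for v in (10, 20, 30):
        a.push_single(v)
        b.push_two_step(v)
    print(a.mem == b.mem and a.sp == b.sp)   # both sequences agree
    print(round_trip_cycles("486", "reg"))   # 1 + 4
    print(round_trip_cycles("486", "mem"))   # 4 + 4
    print(round_trip_cycles("R3000"))        # 2 + 2
```

With the post's numbers, a register push/pop pair comes to 5 cycles on the 486, a memory push/pop pair to 8, and the store/decrement plus load/increment pair to 4 on the R3000, which is the comparison the post is drawing.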