Path: utzoo!mnetor!uunet!husc6!bbn!mit-eddie!uw-beaver!cornell!batcomputer!itsgw!steinmetz!sungoddess!oconnor
From: oconnor@sungoddess.steinmetz (Dennis M. O'Connor)
Newsgroups: comp.arch
Subject: Re: RPM-40 microprocessor @ 40 MHz; dat
Message-ID: <9852@steinmetz.steinmetz.UUCP>
Date: 8 Mar 88 20:06:33 GMT
References: <9792@steinmetz.steinmetz.UUCP>
Sender: news@steinmetz.steinmetz.UUCP
Reply-To: sungoddess!oconnor@steinmetz.UUCP
Organization: GE Corporate R&D Center
Lines: 137

An article by bcase@apple.UUCP (Brian Case) says:
] In article <...> oconnor%sungod@steinmetz.UUCP writes:
] >An article by bcase@apple.UUCP (Brian Case) says:
] My complaints about the RPM40 are architectural: having 16-bit
] instructions may be a slight advantage now, but I predict it will come
] back to haunt. In my opinion, and according to the information I have
] gotten from postings here and a friend who attended the ISSCC, I have
] seen little to convince me that the RPM40 is showing us how to do it
] right, in the architectural sense. That, in a nutshell, is my beef.
] Running UNIX or dhrystone quickly is not the main issue; this is a
] forum concerned with architectural issues. Of course, architecture can
] influence how fast dhrystone is run, and implementation can often mean
] more than anything else.

Architecture limits implementation. Therefore, an architecture that's
relatively fast for 100-transistor-per-chip technology will probably
not be relatively fast for VLSI. But we all know this.

] I honestly think the absolute best thing you could do right now is to
] post a bullet-list of "features" of your machine. This will put an end
] to questions like "well, how much do you really know about the RPM40?"

********** In My "Humble" Opinion *********************************

Things done right on RPM40, tho not necessarily for the first time:

    Harvard architecture, but with a shared address bus that
    DOES NOT need to place more than one address on the bus per cycle.

    Sending only branch target addresses off chip, instead of
    sending EVERY instruction address off chip.

    Using a pipelined look-ahead to provide the instruction stream.

    Using the cache only to fill in for the look-ahead system's
    latency on branches.

    Forwarding coprocessor instructions from the CPU I-cache.

    Prefix instructions.

    COND (also called "SKIP") instructions.

    Pipelined operand memory system.

    Fast interrupt handling.

    Using only a two-phase clock.

    Using a shorter pipeline for non-load instructions.

    21 g.p. registers, up to 15 of which can be used as
    base registers at any one time.

    No time-division multiplexing of pins during a clock cycle.

] >The RPM40 runs 40MIPS, all the time, all instructions (even NOPS :-),
]
] With the memory system you assume, the Am29000 and I guess the R2000 would
] run MIPS at their clock rates as well.

Well, you are incorrect. The MIPS chip, correct me if I am wrong,
needs a four-phase 32-MHz clock to execute 16MIPS (native, peak).
The Am29000, I believe, uses 25ns RAM just to make 25MHz (I don't
know how many clock phases), and therefore, I believe, 25MIPS.

Put 25ns RAM on an R2000 and it would still execute only 16MIPS:
the processor is not fast enough to take advantage of it.
The Am29000, by contrast, needs that 25ns RAM just to run at 25MIPS.

] The question is how long it takes to get from start of program to
] finish of program. If the RPM40 is executing more loads and stores
] and more register to register moves to make up for the relatively
] small number of registers and lack of three-address instructions,
] etc., then you aren't getting all the bang out of your 40 MHz. On the
] other hand, if it *is perfect for your application* then great.

"Small number of registers"?? 21 G.P. registers is small? Says who?
Talk to compiler writers: they tell us that 16 is just fine.
Or maybe you're thinking of the Berkeley-style register window
concept? The R2000 doesn't have that. I think maybe the Am29000 does??
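The clock-versus-throughput argument above is plain arithmetic, so here is a minimal sketch of it in Python. It uses only the clock rates and native-peak MIPS figures as claimed in this post (they are not datasheet values, and the Am29000 figure is the one hedged with "I believe"). The names CLAIMS, clocks_per_instruction, ns_per_instruction and summarize are made up for illustration.

```python
from fractions import Fraction

# Figures as quoted in this post: claims made in the discussion,
# NOT verified datasheet values.
CLAIMS = {
    "RPM40":   {"clock_mhz": 40, "peak_native_mips": 40},
    "R2000":   {"clock_mhz": 32, "peak_native_mips": 16},
    "Am29000": {"clock_mhz": 25, "peak_native_mips": 25},
}


def clocks_per_instruction(clock_mhz, peak_native_mips):
    """Input clock cycles spent per instruction at the peak native rate."""
    return Fraction(clock_mhz, peak_native_mips)


def ns_per_instruction(peak_native_mips):
    """Time per instruction in nanoseconds at the peak native rate."""
    return Fraction(1000, peak_native_mips)


def summarize(claims):
    """Map each chip name to (clocks per instruction, ns per instruction)."""
    rows = {}
    for name, c in claims.items():
        cpi = clocks_per_instruction(c["clock_mhz"], c["peak_native_mips"])
        ns = float(ns_per_instruction(c["peak_native_mips"]))
        rows[name] = (cpi, ns)
    return rows


if __name__ == "__main__":
    for name, (cpi, ns) in summarize(CLAIMS).items():
        print(f"{name:8s} {cpi} clock(s)/instruction, {ns:.2f} ns/instruction")
```

On these claimed figures the R2000 spends two input clocks per instruction while the other two spend one, which is why quoting the input clock in MHz alone says little about instruction rate.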
] This is where I don't apologize for saying something. I'll confine the
] following discussion to the Am29000 since I know it well: With similar
] very fast memories (as the RPM40 assumes), the Am29000 will have the
] advantage in fewer loads/stores, faster procedure calls, and fewer
] instructions executed (three-address instructions and lots of registers).
] The RPM40 has the advantage in clock rate. Who wins? I don't know, but
] I doubt the difference is tremendous, especially if the RPM40 is required
] to have a TLB as the Am29000 does.

Well, beyond arguing that a TLB may not slow it down, which
contract prevents me from discussing, I'll say this:
applications that don't need a TLB shouldn't pay for a TLB.

] ... the RPM40 must be evaluated with a TLB in order to be
] compared to most other chips.

Like the MC680[012]0 family?? 1750A processors?? AN/UYK-14s??
None of these have TLBs.

] Incidentally, I think MIPS would rather have the R2000 known as a 10 MIPS
] machine at 16 MHz (not the 8 MIPS you quoted).

Actually, I think MIPS Inc. claims a 10 Vax-MIPS rating for their
16-native-peak-MIPS processor, which uses a 32MHz clock and places
an address on the address bus about once every 30ns. THAT's why "MHz"
is TOTALLY inappropriate, WORSE than native-peak MIPS, even. An RPM40
at 32MHz would also place an address on the address bus about once
every 30ns, but would execute 32 native-peak MIPS.

What's the smallest signal interval on a 25MHz Am29000?
In the RPM40, NO signal ever assumes more than one valid state
during a cycle. This is not true of the R2000. Is it true of the Am29000?

] In your response to my response, you go on to say that we should not judge
] performance by either peak native instructions per second or MHz. I don't
] know anyone here who would disagree with you (except marketing people:
] what else can they say?). In my claim above, I adhered to just that
] philosophy. This also is what most manufacturers of concern to us here
] strive for (esp. MIPS Co.).

All three need to be paid attention to. They make big differences.
For instance, native-MIPS-per-MHz can range from 5 or less in a CISC
machine, to about 1 for a RISC, to 65K or more for a big parallel
machine. And there's only so fast any particular technology will
let you run the clock, so it DOES matter.

--
    Dennis O'Connor   oconnor%sungod@steinmetz.UUCP  ARPA: OCONNORDM@ge-crd.arpa
   (-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)