Path: utzoo!mnetor!uunet!steinmetz!sunray!oconnor
From: oconnor@sunray.steinmetz (Dennis M. O'Connor)
Newsgroups: comp.arch
Subject: Re: Impossible 40MHz R2000 ??
Message-ID: <8280@steinmetz.steinmetz.UUCP>
Date: 21 Dec 87 19:50:41 GMT
References: <8252@steinmetz.steinmetz.UUCP>
Sender: root@steinmetz.steinmetz.UUCP
Reply-To: sunray!oconnor@steinmetz.UUCP
Organization: GE Corporate R&D Center
Lines: 116
Keywords: fast faster fastest micro east of the GaAs :-)

An (excellent) article by mash@winchester.UUCP (John Mashey) says :
-]GENERAL
-]1) Fortunately, it's not long until ISSCC ... and we understand
-]the ISSCC rules that prevent prior publication ...

Thanks. As one of the architects of this design, I'm just itchin'
to discuss it, so I'm frustrated by the rules but understand why.
I'm glad you understand too.

-]2) The statements marked 1 & 2 below (on mips) could use some clarification.
-]   a) Remember that we consistently use 1 mips = vax11/780-with-decent
-]   compilers-running-variety-of-real-programs-not-dhrystones-or-toy-
-]   benchmarks type mips, i.e., something people can actually measure
-]   and evaluate.

We've only got some computed VAX equivalence numbers, so I
wouldn't want to quote them.

-]   b) 40MHZ is not apriori 40mips (of the kind we described above);
-]   statement 1 (way below: "raw machine MIPS") makes it clear that Dennis
-]   understands this, but some of the other statements seem to MIX
-]   apples and oranges.  Note that even getting 40 native-mips from
-]   40MHZ implies certain things about external cache/memory latencies,
-]   miss penalties, branch-handling, MMU-interference, etc.  I do assume
-]   40MIPS is for something other than a tight-loop of adds or nops in
-]   an on-chip cache. (a 25MHZ 68020 is a 12.5MIPS thing by that rule).

The chip always runs at 40MIPS. We do have interlocks that
can sometimes require NOPs ( one O or two ??
a new debate :-), and we get cache misses very occasionally as well,
so our performance is, well, usually less than 40MIPS, certainly.
Which is indeed why I've only spoken of raw machine MIPS.
"But they're HONEST raw machine MIPS!" :-)

-]   c)This business has more than once seen spectacular claims,based on peak
-]   native-mips-ratings ... Until we see the ISSCC paper, it's hard to
-]   guess what actual performance might be.  I certainly believe that
-]   a machine should be able to do it's clock rates in NOPS or ADDs;
-]   doing loads, stores, and branches is harder, especially for real-sized
-]   programs that actually miss in caches now and then.

The RPM40 chip does its clock rate on loads and stores. There are
no cache misses on loads and stores. Branches can cause cache misses,
but barring misses we do full clock on them too. I really wish
I could talk about cache design, ooh it's so good! But fortunately
ISSCC is not so far away.

-]
-]3) After ISSCC, if the paper itself doesn't reveal such information,
-]perhaps you can post some real benchmark numbers.

I'll post everything I can. But as I will explain later, it's probably
comparing Red Delicious to Macintosh ( the apples ) to compare
RPM40 to DEC, Sun, MIPS or Motorola. Different target environment.

-]Also, you haven't mentioned floating point.  Can you at least say if the
-]ISSCC paper will discuss it?

This year's ISSCC does not ( I believe ) discuss the FPU. That may have
to wait till NEXT year's ISSCC, I fear. Even tho we've silicon of it Now.

-]5) There are several ways of convincing somebody that a computer can
-]achieve a given performance:
-]   REALITY: Here's the machine.  Benchmark it and see.
-]   HINT: For a future machine, here are some hints about the ways in
-]   which it might be done.
-]   DESIGN: Here is what the future design looks like, and here are the
-]   innovations and sneaky designs we use to make it work.
-]
-]REALITY is always preferable: existence is a virtue:
-]if I see a system remake the UNIX kernel, boot it, and then compile/run
-]Spice, I believe it might even be a Real Machine, subject to any evidence
-]to the contrary.  This is hard to do with future designs ...
-]
-]... HINT alone can sound like hand-waving ... DESIGN inevitably
-]   discloses details ... highly proprietary ... can't do in ... comp.arch.

After ISSCC I hope I can talk design : this was a non-classified
DARPA project, and GE is NOT in the computer business. Maybe I'll
be allowed to publish. I hope so : I think we did some great work !

-]... we look forward to ... GE ISSCC paper ... live benchmark numbers

Our live benchmarks, to be applicable to our design environment,
would be different from yours. Unless you've got an ATF ( Advanced
Tactical Fighter ) aerodynamic control surfaces controller benchmark :-).

Well, now for some notes :

First, I hope people caught that my article was intended to be
light-hearted. I've got no bones to pick or axes to grind, I'm just
EXTREMELY happy to have some of my ideas in working silicon.
Feels real good.

And yes I'm proud of my work. But I can't really disclose it yet,
which is frustrating. The chip has IMHO some nice new things in it,
which IMHO will be taken up with enthusiasm when they go public
( the ideas, not the chip ).

But realize : this is a MILITARY chip. Not commercial. $/MIP is not a
real factor in its design. MIPS/Watt, MIPS/sq-cm, MIPS/package
were bigger drivers. Rad-Hard was a factor as well. So it may
never make a UNIX kernel, or run Spice. Or be for sale without
a satellite :-)

So why bring it up ? Because it runs at 40MHz/MIPS. Since we've
actually built a 40MHz/MIPS chip, well, as John Mashey says, reality
is better than design. We have dealt with some of the nasty problems
CMOS encounters driving even 30pF at 40MHz. Or trying to turn a bus around
( from send to receive ) in 3ns. And it is tough. No question.
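
[ A rough sketch of the raw-versus-delivered MIPS distinction discussed
above. It is a back-of-envelope model, not anything taken from the RPM40
design. The function name delivered_mips and every parameter value are
hypothetical; the only figures from the thread are the 40MHz clock and
the quoted "25MHZ 68020 is a 12.5MIPS thing" rule, which corresponds to
two clocks per instruction. ]

```python
def delivered_mips(clock_mhz, base_cpi=1.0, nop_fraction=0.0,
                   misses_per_instr=0.0, miss_penalty_cycles=0.0):
    """Estimate useful native MIPS for a simple in-order machine.

    All parameters are illustrative assumptions, not RPM40 data.
    """
    if not 0.0 <= nop_fraction < 1.0:
        raise ValueError("nop_fraction must be in [0, 1)")
    # Average cycles spent per issued instruction, including miss stalls.
    cycles_per_issued = base_cpi + misses_per_instr * miss_penalty_cycles
    issued_mips = clock_mhz / cycles_per_issued
    # NOPs inserted for interlocks are issued but do no useful work.
    return issued_mips * (1.0 - nop_fraction)


if __name__ == "__main__":
    # Raw rate: one instruction per clock, no NOPs, no misses.
    print(delivered_mips(40.0))
    # The quoted 68020 rule of thumb: 25 MHz at 2 clocks per instruction.
    print(delivered_mips(25.0, base_cpi=2.0))
    # Hypothetical workload: 10% NOPs, 1 miss per 100 instructions,
    # 10-cycle miss penalty. These numbers are made up for illustration.
    print(delivered_mips(40.0, nop_fraction=0.10,
                         misses_per_instr=0.01, miss_penalty_cycles=10.0))
```

[ With no NOPs and no misses the model simply returns the clock rate,
which is the "raw machine MIPS" figure; any interlock NOPs or miss
stalls can only pull the delivered number below it. ]
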
Hi-speed CPUs were the original subject, I think ?

Thank you, John Mashey, for an excellent reply that pointed
out where my posting was unclear. I hope I've clarified things
some with this.
--
    Dennis O'Connor    oconnor@sungoddess.steinmetz.UUCP ??
                       ARPA: OCONNORDM@ge-crd.arpa
    "If I have an "s" in my name, am I a PHIL-OSS-IF-FER?"