Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!rutgers!husc6!panda!genrad!decvax!decwrl!labrea!glacier!mips!mash
From: mash@mips.UUCP
Newsgroups: net.arch
Subject: Re: Re: Floating point performance & Mr. Mashey's Mythical Mhz
Message-ID: <727@mips.UUCP>
Date: Sun, 19-Oct-86 04:19:32 EDT
Article-I.D.: mips.727
Posted: Sun Oct 19 04:19:32 1986
Date-Received: Tue, 21-Oct-86 21:32:10 EDT
References: <340@euroies.UUCP> <1989@videovax.UUCP> <722@mips.UUCP> <377@garth.UUCP>
Reply-To: mash@mips.UUCP (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 139

In article <377@garth.UUCP> kissell@garth.UUCP (Kevin Kissell) writes:
>In article <722@mips.UUCP> mash@mips.UUCP (John Mashey) writes:
>...that are familiar to John and myself and yet of interest to the newsgroup:
>the MIPS R2000 and the Fairchild Clipper.  An 8 Mhz R2000 has a cycle time
>of 125ns.  A 33Mhz Clipper has a cycle time of 30ns.  Yet both are built
>with essentially the same 2-micron CMOS technology.  I somehow doubt that
>Fairchild's CMOS transistors switch four times faster than that of whoever
>is secretly building R2000s this week.  The difference is architectural.

(One of my colleagues got here first, hansen@mips, in 726@mips.UUCP, so I'll
just add a few notes where they don't overlap too much.)

There was no intent in the original posting to start a MIPS versus Clipper
war [contrary to John Gilmore's posting in <1198@hoptoad.uucp>: sorry John,
another Moto versus Intel battle we do not need, fun though it may be to
watch!]  I was only trying to be reasonably inclusive of relevant 32-bit
micros.  However, now that the issue has been raised.....

An 8Mhz R2000 isn't pushing the technology very hard, ON PURPOSE!!!
8Mhz parts appear first, followed by 12s and 16s, for the same reasons
you got 12Mhz 68020s before 16s and 25s.
Also, I'm told that the 2u design doesn't push 2u technology as hard as it
might have, in order to let the same design be shrunk to 1.5u and 1.2u with
minimal effort.

Now, the reason one might care about MWhets/MHz (or any similar measure
that compares the delivered real performance with some basic technology
speed) is to understand the margin and headroom in a design.  Since Kevin
brought the issue up, some hypothetical questions:
        a) Will there be 66Mhz Clippers in 2u CMOS?
           [To get actual performance like 16Mhz R2000 in 2u;]
           [If the answer is yes, I know a bunch of people, not all at
           MIPS, either, who have some real tough questions involving
           transmission-line effects, how to do ECL or other
           reduced-voltage-swing I/O, etc.]
        b) If they will be, what year will they be? [1987?]
        c) When will there be bigger / (more in parallel) CAMMU chips?
           [Because if there aren't, how are the caches going to get
           enough bigger to keep the delivered performance in line with
           the CPU clock speed improvements? (for real programs)?
           Chips get faster with shrinks, but they don't magically get
           re-laid-out to acquire more memory.  CAMMU chips have some good
           ideas in them, but they're not very big, especially compared
           with the needs of some of the real programs that people would
           like to run on high-performance micros.  (There is some real
           nasty stuff lurking out there!  People keep putting them on our
           machines, so we know....  If the Clipper FORTRAN compilers just
           came up recently, and they haven't yet tried running 500KLOC
           FORTRAN programs...interesting times are ahead....)]
>
>The Clipper was designed from fairly well-established supercomputer and
>mainframe techniques....

"fairly well-established supercomputer and mainframe techniques" is
interesting.  I can think of 2 ways to read this assertion:
        a) High-performance VLSI designs should be done just like big
           machines.
                        OR
        b) High-performance VLSI should be designed with good understanding
           of big machines, as well as good understanding of the tradeoffs
           necessary for VLSI [margin, headroom, packaging constraints,
           processes, etc, etc], where those are different from the design
           tradeoffs of the big ECL boxes.
I hope Kevin meant b), which most people would agree with.
>
>John's guess for the Clipper is off by over a factor of two.  The Clipper

Thanks for the info: all I'd seen were random guesses from people around
the net, and it's a useful contribution to see numbers from somebody that
knows.  Hopefully, we'll see more?  [I assume that was DP?]

>FORTRAN compiler was brought up only recently.  In its present sane but
>unoptimizing state, I obtained the following result on an Interpro 32C
>running CLIX System V.3 at 33 Mhz (1 wait state), using a prototype Green
>Hills Clipper FORTRAN compiler with Fairchild math libraries:
>
>               Mhz     Kwhet   Kwhet/Mhz
>Clipper        33      2920    Who cares? Kwhet/Kg and Kwhet/cm2 are of
>                               more practical consequence.

As hansen@mips noted, these are reasonable results, and I'd assume they'll
improve somewhat with more mature compiler technology.

Actually, this raises a set of questions that might be of general interest
in this newsgroup, basically:
        1) What metrics are interesting?
        2) How do you define them?
        3) In what problem domains are they relevant?
        4) What are different constraints that people use?
        5) How do different metrics correlate, specifically, are some of
           the simpler (easier-to-measure) ones good predictors of the
           more complex ones?

For example, here are some metrics, all of which have appeared in this
newsgroup at some time or other.  Proposals are solicited:
        a) Clock rate. (Mhz)                                    --
        b) Peak Mips [i.e., typically back-to-back cached,
           register-register adds].                             --
        c) Sustained Mips                                       ?
        d) Benchmark performance relative to other computers    ++
        e) Peak Mflops [i.e., "" "" for FP]                     --
        f) Dhrystones
        g) Whetstones                                           +
        h) LINPACK MFLops                                       ++
        i) Kwhets / Mflops [g/e]                                -
        j) Kwhets / Mhz [g/a]                                   +
        k) Kg
        l) cm2 (or cm3)
        m) Watts
        n) $$                                                   +++
        o) Kwhets / Kg [g/k]
        p) Kwhets / cm2 [g/l]                                   +
        q) Kwhets / Watt [g/m]                                  +
        r) (any of the above) / $$                              +++ (esp if d))
        ---------
        (-- & ++ indicate general impression of these metrics)

What's interesting is that people have all sorts of different constraint
combinations or optimization functions over any of these.  Let me try
a few examples, and solicit some more:
        1) Maximize g), h) etc, subject to few constraints, i.e., for
           people who buy CRAYs, etc, money is (almost) no object.
        2) Maximize one of the performance numbers, subject to some
           constraint.  The constraint might be:
                absolute cm2 or cm3, as in some avionics things, i.e.,
                if it doesn't fit, it doesn't matter how fast it is!
                $$: get me the most for some fixed amount of money, and
                I don't care if a pricier machine is 2X faster, even if
                it's more cost-effective.
        3) Performance may not be particularly important at all, relative
           to object-code compatibility, software availability, service,
           etc.

Comments?  What sorts of metrics are important to the people who read this
newsgroup?  What kinds of constraints?  How do you buy machines?  If you
buy CPU chips, how do you decide what to pick?
-- 
-john mashey    DISCLAIMER:
UUCP:   {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:    408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
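
The ratio metrics in the list above (j, o, p, q, r) are all "score divided
by some resource", and the cycle times quoted at the top follow directly
from the clock rates.  A minimal sketch of that arithmetic, using only the
figures that appear in this thread (8 Mhz and 33 Mhz clocks, 2920 Kwhets
for the Clipper); the function names cycle_time_ns and per_unit are
hypothetical, made up for illustration:

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Cycle time in nanoseconds for a clock rate given in MHz."""
    if clock_mhz <= 0:
        raise ValueError("clock rate must be positive")
    return 1000.0 / clock_mhz


def per_unit(score: float, resource: float) -> float:
    """Generic ratio metric: score per unit of resource (MHz, Kg, Watts, $$)."""
    if resource <= 0:
        raise ValueError("resource must be positive")
    return score / resource


# Figures quoted in the thread; nothing here is a new measurement.
r2000_ns = cycle_time_ns(8)       # 8 Mhz R2000
clipper_ns = cycle_time_ns(33)    # 33 Mhz Clipper
clock_ratio = r2000_ns / clipper_ns

# Metric j) Kwhets / Mhz for the quoted Clipper result.
clipper_kwhet_per_mhz = per_unit(2920, 33)

print(f"R2000 cycle time:   {r2000_ns:.1f} ns")
print(f"Clipper cycle time: {clipper_ns:.1f} ns")
print(f"Cycle time ratio:   {clock_ratio:.2f}")
print(f"Clipper Kwhets/Mhz: {clipper_kwhet_per_mhz:.1f}")
```

The same per_unit call would cover Kwhets/Kg, Kwhets/Watt or anything/$$
once weight, power or price figures are known; none are given here, so
they are left out rather than guessed.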