Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!columbia!rutgers!husc6!mit-eddie!genrad!decvax!mcnc!unc!rentsch
From: rentsch@unc.UUCP (Tim Rentsch)
Newsgroups: net.arch
Subject: Re: Re: Floating point performance & Mr. Mashey's Mythical Mhz
Message-ID: <103@unc.unc.UUCP>
Date: Sun, 26-Oct-86 23:14:19 EST
Article-I.D.: unc.103
Posted: Sun Oct 26 23:14:19 1986
Date-Received: Mon, 27-Oct-86 22:28:45 EST
References: <340@euroies.UUCP> <1989@videovax.UUCP> <722@mips.UUCP> <377@garth.UUCP> <727@mips.UUCP>
Reply-To: rentsch@unc.UUCP (Tim Rentsch)
Organization: CS Dept, U. of N. Carolina, Chapel Hill
Lines: 74

In article <727@mips.UUCP> mash@mips.UUCP (John Mashey) writes:
> Now, the reason one might care about MWhets/MHz (or any similar measure
> that compares the delivered real performance with some basic technology
> speed) is to understand the margin and headroom in a design.

There is a subtle pitfall in arguing that FLOPS/HZ (or IPS/HZ) is a
measure of architectural "goodness".  Certainly, measuring FLOPS/HZ is
a reasonable attempt to factor out the particulars of the device
fabrication, which are obviously irrelevant to architecture.  (If your
chip runs twice as fast as my chip only because it is 5 times as small,
your process technology is better than mine, but your architecture may
not be.)

BUT -- and here is the pitfall -- it just might be that given identical
fabrication methods, the better FLOPS/HZ choice would still run slower
because it would not support the higher clock rate.  RISC proponents
would argue that one reason for having simple instruction sets is to
*lower the cycle time* so that the machine can run faster and get more
work done.  Your machine's FLOPS/HZ may be twice as good as mine, but if
my HZ is three times yours (in identical technology), my machine is
faster -- and so my architecture is better.

> Comments?  What sorts of metrics are important to the people who read
> this newsgroup?  What kinds of constraints?
> How do you buy machines?
> If you buy CPU chips, how do you decide what to pick?

The metrics I'm interested in measure speed.  (Basically, I'm hooked on
fast machines.)  Other constraints are less interesting because: (1) I
will buy the fastest machine I can afford, and (2) in terms of
architecture, speed is the bottom line -- all else is just mitigating
circumstances.  ("I know machine X runs 3 times as fast as machine Y,
but machine X is Gallium Arsenide."  Compare architectures, not
technologies.)

Here are my favorite metrics (in no particular order):

(1)  micro-micro-benchmark: well defined task, with well defined
     algorithm, hand coded in lowest level language available
     (microcode if it comes to that) by arbitrarily clever programmer
     who can take advantage of all machine dependencies (instruction
     timings, overlaps and/or interlocks, special instructions, cache
     sizes, etc.).  Algorithm can change slightly to take advantage of
     machine characteristics, but must be "recognizable".

(1a) Same as above, but at assembly language level.  Instruction set
     cleverness is allowed; microcode and special knowledge such as
     cache size are not.

(2)  micro-benchmark: well defined task, with algorithm given in some
     particular programming language (and benchmark must be compiled
     from the given algorithm).  The point here is to measure the speed
     of the machine in "typical" situations, including compiler
     effectiveness.  The time taken to do the compile is irrelevant, as
     long as it is reasonably finite.

(3)  macro-benchmark: the problem with (1) and (2) is that they don't
     measure all kinds of things that inevitably take place in real
     systems.  (On the other hand, (1) and (2) are easy to run, and also
     easy to fudge, so they are more often done.)  A macro-benchmark is
     like (2) in having a given program, except that the given program
     is very large, so that code size is comparable to the amount of
     real memory on the machine (hopefully code > real memory).
     Now the effectiveness of the machine for problems-in-the-large
     will be measured, including things like swapping speeds and TLB
     hit rates, etc.  Sadly, this is a vague measure because there are
     so few large programs which can be used as the benchmark, and many
     variable parameters creep in (such as how fast the disk seeks are,
     etc.).  Even so, it is worth remembering that speed in the small
     is different from speed in the large, and that the latter is
     really what we desire.  (Or should that be, "what I desire"? :-)

cheers,

txr
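
To put numbers on the FLOPS/HZ pitfall described earlier in this
article: delivered speed is the product of FLOPS/HZ and HZ, so a machine
can lose on the first factor and still win overall.  What follows is
only a minimal sketch in Python.  The function name and every figure in
it are hypothetical, chosen to match the "twice as good" and "three
times" ratios in the argument; they are not measurements of any machine.

```python
# Hypothetical figures only: they reproduce the 2x and 3x ratios from
# the argument and describe no real machine.

def delivered_flops(flops_per_hz, clock_hz):
    """Delivered speed = work done per cycle times cycles per second."""
    return flops_per_hz * clock_hz

# Machine A: twice the FLOPS/HZ, but the design supports a slower clock.
a = delivered_flops(flops_per_hz=0.10, clock_hz=10e6)

# Machine B: half the FLOPS/HZ, three times the clock, same technology.
b = delivered_flops(flops_per_hz=0.05, clock_hz=30e6)

print(f"A: {a:.0f} FLOPS   B: {b:.0f} FLOPS   B/A: {b / a:.2f}")
```

With these assumed figures, B delivers about one and a half times the
work of A even though A looks twice as good on FLOPS/HZ alone.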