Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!columbia!rutgers!husc6!mit-eddie!genrad!decvax!mcnc!unc!rentsch
From: rentsch@unc.UUCP (Tim Rentsch)
Newsgroups: net.arch
Subject: Re: Re: Floating point performance & Mr. Mashey's Mythical Mhz
Message-ID: <103@unc.unc.UUCP>
Date: Sun, 26-Oct-86 23:14:19 EST
Article-I.D.: unc.103
Posted: Sun Oct 26 23:14:19 1986
Date-Received: Mon, 27-Oct-86 22:28:45 EST
References: <340@euroies.UUCP> <1989@videovax.UUCP> <722@mips.UUCP> <377@garth.UUCP> <727@mips.UUCP>
Reply-To: rentsch@unc.UUCP (Tim Rentsch)
Organization: CS Dept, U. of N. Carolina, Chapel Hill
Lines: 74

In article <727@mips.UUCP> mash@mips.UUCP (John Mashey) writes:
> Now, the reason one might care about MWhets/MHz (or any similar measure
> that compares the delivered real performance with some basic technology
> speed) is to understand the margin and headroom in a design.

There is a subtle pitfall in arguing that FLOPS/HZ (or IPS/HZ) is a
measure of architectural "goodness".  Certainly, measuring FLOPS/HZ
is a reasonable attempt to factor out the particulars of the device
fabrication, which are obviously irrelevant to architecture.  (If
your chip runs twice as fast as my chip only because it is one fifth
the size, your process technology is better than mine, but your
architecture may not be.)  BUT -- and here is the pitfall -- it just
might be that, given identical fabrication methods, the design with
the better FLOPS/HZ would still run slower because it could not
support as high a clock rate.  RISC proponents would argue that one
reason for having simple instruction sets is to *lower the cycle
time* so that the machine can run faster and get more work done.  Your
machine's FLOPS/HZ may be twice as good as mine, but if my HZ is
three times yours (in identical technology), my machine is faster --
and so my architecture is better.
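
To put numbers on this, here is a minimal sketch in Python.  The
absolute per-cycle and clock figures are assumptions picked only to
make the arithmetic easy;  the 2x and 3x ratios are the ones from the
paragraph above.

```python
# Delivered speed = (work per cycle) * (cycles per second).
# The absolute figures below are illustrative assumptions; only the
# 2x (FLOPS/HZ) and 3x (HZ) ratios come from the argument in the text.

def delivered_mflops(flops_per_cycle, clock_mhz):
    """Return delivered MFLOPS: FLOPS/HZ multiplied by the clock in MHz."""
    return flops_per_cycle * clock_mhz

# "Your" machine: twice the FLOPS/HZ, one third the clock rate.
yours = delivered_mflops(flops_per_cycle=0.10, clock_mhz=10.0)

# "My" machine: half the FLOPS/HZ, three times the clock rate.
mine = delivered_mflops(flops_per_cycle=0.05, clock_mhz=30.0)

speedup = mine / yours
print(f"yours: {yours:.2f} MFLOPS, mine: {mine:.2f} MFLOPS, "
      f"ratio mine/yours: {speedup:.2f}")
```

The ratio comes out to 3/2:  the machine with the worse FLOPS/HZ
still delivers half again as much work per second.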


> Comments? What sorts of metrics are important to the people who read
> this newsgroup? What kinds of constraints?  How do you buy machines?
> If you buy CPU chips, how do you decide what to pick?

The metrics I'm interested in measure speed.  (Basically, I'm hooked
on fast machines.)  Other constraints are less interesting because:
(1) I will buy the fastest machine I can afford, and (2) in terms of
architecture, speed is the bottom line -- all else is just
mitigating circumstances. ("I know machine X runs 3 times as fast as
machine Y, but machine X is Gallium Arsenide."  Compare
architectures, not technologies.)

Here are my favorite metrics (in no particular order):

(1) micro-micro-benchmark:  well defined task, with well defined
algorithm, hand coded in lowest level language available (microcode
if it comes to that) by arbitrarily clever programmer who can take
advantage of all machine dependencies (instruction timings, overlaps
and/or interlocks, special instructions, cache sizes, etc.).
Algorithm can change slightly to take advantage of machine
characteristics, but must be "recognizable".

(1a) same as above, but at assembly language level.  Instruction set
cleverness is allowed;  microcode and special knowledge such as
cache size are not.

(2) micro-benchmark: well defined task, with algorithm given in some
particular programming language (and benchmark must be compiled from
the given algorithm).  The point here is to measure the speed of the
machine in "typical" situations, including compiler effectiveness.
The time taken to do the compile is irrelevant, as long as it is
reasonably finite.

(3) macro-benchmark: the problem with (1) and (2) is that they fail
to measure many things that inevitably take place in real
systems.  (On the other hand, (1) and (2) are easy to run, and also
easy to fudge, so they are more often done.)  A macro-benchmark is
like (2) in having a given program, except that the given program is
very large, so that code size is comparable to amount of real memory
on the machine (hopefully code > real memory).  Now the
effectiveness of the machine for problems-in-the-large will be
measured, including things like swapping speeds and TLB hit rates,
etc.  Sadly, this is a vague measure because there are so few large
programs that can be used as the benchmark, and many variable
parameters creep in (such as how fast the disk seeks are, etc.).
Even so, it is worth remembering that speed in the small is
different from speed in the large, and that the latter is really
what we desire.  (Or should that be, "what I desire"?  :-)
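
As a rough illustration of what a type (2) micro-benchmark looks like
in practice, here is a minimal Python sketch.  The workload `task`,
the helper `best_time`, and the choice to report the fastest of
several runs are all assumptions made for the example, not part of
the definitions above.

```python
import time

def task(n):
    """Hypothetical fixed workload: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def best_time(fn, arg, repeats=5):
    """Run fn(arg) `repeats` times; return the fastest wall-clock seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

seconds = best_time(task, 100_000)
print(f"best of 5 runs: {seconds:.6f} s")
```

Taking the minimum is one common way to reduce noise from other
activity on the machine.  Note that a harness like this says nothing
about paging or TLB behaviour, which is exactly the gap (3) is meant
to fill.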

cheers,

txr