Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!pyramid!prls!mips!sjc
From: sjc@mips.UUCP (Steve "The" Correll)
Newsgroups: comp.arch
Subject: Re: MIPS Performance Brief, zillions of numbers, very long
Message-ID: <865@quacky.UUCP>
Date: Mon, 2-Nov-87 16:13:15 EST
Article-I.D.: quacky.865
Posted: Mon Nov  2 16:13:15 1987
Date-Received: Fri, 6-Nov-87 21:04:23 EST
References: <861@winchester.UUCP> <2993@phri.UUCP> <864@tut.cis.ohio-state.edu>
Lines: 54
Keywords: benchmarks

In article <864@tut.cis.ohio-state.edu>, manson@tut.cis.ohio-state.edu (Bob Manson) writes:
> Exactly what are these things supposed to mean?
> Well, they are compiled programs on different systems that are run and
> supposed to represent the speed of the various processors, MIPS etc. We
> all know that MIPS is mostly a meaningless figure...
> So why run benchmarks in compiled languages??? It's easier that way, you
> don't have to write individual programs for each machine that might actually
> show off their abilities and improved instructions. I'll admit that it does
> mean something for compiled language users but not really for performance
> comparisons-you're comparing apples and oranges, or really the efficiency
> of the compilers on the machines. 

The best benchmark for person x is clearly the program which accounts for most
of the cycles that person executes. But there are so many different "x"s!

We assume that grep, nroff, and Unix system calls are important to most
readers of comp.arch, so we study them.  A sizeable class of Fortran users
tells us that Linpack and the Livermore loops are representative of their
programs, so we study them. IC designers tell us they execute most of their
cycles within Spice, so we pay a lot of attention to that.

If, on the other hand, you execute most of your cycles within hand-tuned
assembly language, and you are willing to revise your programs completely to
best use the instruction set of each new machine you acquire, and you are a
serious potential customer, the sales people at most computer vendors will be
happy to run your own specific benchmarks; ours do so all the time.

Tuned code is the right measurement for some people; compiled code is right
for others.

I view a computer as a system, so for me it makes poor sense to omit the effect
of compilers. And since one can argue forever about how well a hypothetical
compiler _might_ use a particular instruction set, I prefer to ask how well the
best existing compiler _does_ use it. I'm all in favor of hand-coding inner
loops and library routines to improve performance, but how easy and how
profitable that is remains just as debatable; so I think the best test is to
measure the effects of the tradeoffs that people actually made in constructing
a real OS and compiler system, rather than a hypothetical one.

An article in IEEE Micro some time back which measured assembly-coded
algorithms on 68xxx and xx86 machines seemed pretty useless to me, more
like a contest between assembly coders than an indication of the useful
work I might get out of the machines when running Unix or any other OS.

Incidentally, as explained in the Performance Brief, our definition of "mips"
is _not_ the meaningless "millions of instructions per second"; it's "number
of times faster than a Vax 780 on this particular problem", where we
arbitrarily declare a Vax 780 to have a mips rating of 1. We would be better
off using "Vax780s" rather than "mips" as our unit of measure, except that
we'd have to put so many "TM"s in the document that you wouldn't be able to
find the numbers. :-)
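
To make that definition concrete, here is a minimal sketch in Python. The
function name and the timings are hypothetical, chosen only for illustration;
they are not taken from the Performance Brief. It assumes both machines were
timed running the same program on the same input.

```python
def vax_relative_rating(vax780_seconds, machine_seconds):
    """Return how many times faster a machine ran one program than a Vax 780.

    Both arguments are elapsed times, in seconds, for the same program on
    the same input. By construction the Vax 780 itself rates exactly 1.0.
    """
    if vax780_seconds <= 0 or machine_seconds <= 0:
        raise ValueError("run times must be positive")
    return vax780_seconds / machine_seconds


# Hypothetical timings for a single benchmark (illustration only).
print(vax_relative_rating(100.0, 12.5))  # prints 8.0
```

The rating belongs to one program at a time: the same machine can come out at
a different number on a different benchmark.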

-- 
...decwrl!mips!sjc						Steve Correll