Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!uwvax!oddjob!gargoyle!ihnp4!chinet!nucsrl!gore
From: gore@nucsrl.UUCP (Jacob Gore)
Newsgroups: comp.arch
Subject: Re: Benchmarking the 532, 68030, MIPS, 386...at a Usenix!
Message-ID: <3810032@nucsrl.UUCP>
Date: Sun, 17-May-87 20:40:36 EDT
Article-I.D.: nucsrl.3810032
Posted: Sun May 17 20:40:36 1987
Date-Received: Tue, 19-May-87 04:33:30 EDT
References: <2128@hoptoad.uucp>
Organization: Northwestern U, Evanston IL, USA
Lines: 54

/ nucsrl:comp.arch / larry@mips.UUCP (Larry Weber) /  8:29 pm  May 15, 1987 /
>By having each machine start 
>with a 'clean' benchmark tape we can remove all doubt about whether everyone
>used exactly the same sources and were run under the same conditions.

Then you must also put compilers (for each machine) of equal quality on that
tape.  Many benchmarks give very compiler-dependent results.

>The benchmarks should strive to illustrate how real world programs run
>on the machines.  [...]  A page thrasher would be
>wonderful BUT it is highly dependent on I/O system, configurations, page
>size, MMU ... in fact so many things that I suspect it wouldn't be useful.

Betcha my world is more real than yours! (:-)

I recently attended a presentation given by a graphics workstation company
about their next-generation workstation.  On a couple of the slides there was
a statement that went something like this:

	7 times faster than VAX 780 by the Dhrystone benchmark

Now, is that "real world"?  If I can put 20 users on a VAX 780, should I
expect to be able to put 140 users on this workstation (similarly configured)
and get similar performance?  If I were a prospective customer who didn't
care to spend months on learning enough computerese to figure out which
benchmarks are reasonable and which are not, I probably would.  Or, at least,
I would expect to put a dozen users on it and have the thing really zip.  In
either case, I would be very disappointed -- and very misled.
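
To put rough numbers on that, here is a small sketch of the two ways of
reading the slide.  Only the 20 users and the 7x Dhrystone figure come from
the text above; the memory and disk ratios are invented for illustration, and
"the slowest-improving resource sets the limit" is a deliberately crude
assumption, not a real capacity model.

```python
# Hypothetical per-resource speedups relative to a VAX 780.
# Only the CPU figure (7x, from the Dhrystone slide) comes from the post;
# the memory and disk figures are made up for the sake of the example.
ASSUMED_SPEEDUPS = {"cpu": 7.0, "memory": 2.0, "disk_io": 1.0}


def naive_users(baseline_users, cpu_speedup):
    """Scale the user count by the CPU benchmark ratio alone."""
    return baseline_users * cpu_speedup


def bottleneck_users(baseline_users, speedups):
    """Scale the user count by the slowest-improving resource (crude model)."""
    return baseline_users * min(speedups.values())


naive = naive_users(20, ASSUMED_SPEEDUPS["cpu"])
bounded = bottleneck_users(20, ASSUMED_SPEEDUPS)
print("naive reading of the slide:", naive, "users")
print("bounded by slowest resource:", bounded, "users")
```

With these made-up ratios the naive reading gives 140 users and the bounded
one gives 20, i.e. no better than the 780.  The real answer depends on the
actual workload, which is exactly what a single Dhrystone ratio does not
tell you.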

It is only reasonable to compare systems if they are meant for the same
purpose.  The question is not "Is system A faster than system B?", but "Is
system A faster than system B in the environment that I will use it in?"

For example, if you are looking for a general purpose computer with a
multiuser, multiprocess system, you should be using benchmarks that create a
lot of processes.  You ARE concerned with the memory system, I/O system, etc.
If you do a lot of compilations, you should try to find a compiler that will
run on each machine (possibly with different back ends), and compile programs
of varied structure (number of procedures, their size, etc.).  And yes, it does
depend on the entire system in this case -- raw CPU power just doesn't cut it.
On the other hand, if your main use is to run one program consisting of tight
loops with heavy computations in them, then that is what you look for in a
benchmark.
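
The contrast between the two kinds of workload can be sketched as a pair of
toy timing routines.  This is only an illustration under stated assumptions:
the function names, the counts, and the choice of a do-nothing child process
are all hypothetical, and neither routine is a benchmark anyone should quote.
The first leans on the operating system, memory system and file system; the
second stays almost entirely inside the CPU.

```python
import subprocess
import sys
import time


def time_process_creation(count):
    """Seconds to start and reap `count` short-lived child processes."""
    start = time.perf_counter()
    for _ in range(count):
        # Each child is a fresh interpreter that does nothing and exits,
        # so the cost is dominated by process creation and program loading.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


def time_tight_loop(iterations):
    """Seconds to run a CPU-bound arithmetic loop in a single process."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += (i * i) % 7
    return time.perf_counter() - start


if __name__ == "__main__":
    # Arbitrary small counts, chosen only so the sketch finishes quickly.
    print("process creation:", time_process_creation(5), "seconds")
    print("tight loop:      ", time_tight_loop(100000), "seconds")
```

Two machines can rank in opposite order on these two numbers, which is why
the workload you actually run has to decide which one you look at.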

I am very disappointed that representatives of respectable companies, which do
indeed produce very impressive stuff in their respective fields, resort to such
misleading use of "benchmarks".  After all, they can't be bypassed by their
competitors, who use these "benchmarks" too, can they?

Well, let's see...  Where's my 12 MIPS portable CD player...

Jacob Gore
Northwestern University, Computer Science Research Lab
{gargoyle,ihnp4,chinet}!nucsrl!gore
gore@EECS.NWU.Edu (for now, only from ARPA)