Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!steinmetz!davidsen
From: davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr)
Newsgroups: comp.arch
Subject: Re: Towards A Meaningful Performance Measure
Message-ID: <7772@steinmetz.steinmetz.UUCP>
Date: Thu, 5-Nov-87 11:17:30 EST
Article-I.D.: steinmet.7772
Posted: Thu Nov  5 11:17:30 1987
Date-Received: Sun, 8-Nov-87 03:45:44 EST
References: <861@winchester.UUCP> <2993@phri.UUCP> <864@tut.cis.ohio-state.edu> <3806@sol.ARPA>
Reply-To: davidsen@crdos1.UUCP (bill davidsen)
Organization: General Electric CRD, Schenectady, NY
Lines: 53
Keywords: benchmarks

In article <3806@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
[ ... ]
|Unfortunately, that is not enough.  We must define what configuration of Vax
|we use as the baseline.  I suggest an 11/780 with full memory and a floating
|point accelerator.  CPU oriented benchmarks should run completely in physical
|memory.
|
|The compiler and operating system also affect performance.  To make the base
|machine highly available, both should be common.  I suggest Unix BSD 4.2 as
|the base operating system and the portable C compiler as the base C language
|compiler.  This allows realistic Unix/C benchmarks like grep, nroff, etc.  Note
|that such benchmarks must have the same source.  Putting a better compiler on
|the Vax will increase its relative performance, so DEC can honestly sell a 780
|as having a Vax Relative Performance greater than one.

I think this depends on what you want to test. If you plan to run 4.3BSD,
that is a good way to test. If you want to know how fast your C and FORTRAN
programs will run, you should use the best compilers available on each
machine. I suspect that is what most people want to know.

Programs which measure the raw speed of the hardware will give results
which often don't match the high level language results. This doesn't
imply that either is wrong, but that you have to know what you want to
measure.

I have a benchmark suite which I use for UNIX (about 70 machines so
far). I run it with the default compiler and whatever optimization you
get from the "-O" option. I may repeat with other optimization options if
available, and often see a major change in performance, not always for
the better.

Among other things, I measure the highest scalar speed available from C
for short, long, float, and double. I measure speed of transcendental
functions and the time to do a compare and branch for integer and float.
I do a Turing machine simulation, Gray to binary and binary to Gray. If
I have the machine to myself I run multitasking benchmarks, and if it
has vector capability I test that also. I look at the compile speed
also, and disk performance (to see what I get by using a "better"
drive).
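
For anyone who hasn't seen the Gray code conversions, here is a minimal
sketch of the standard technique. This is not the code from my suite; the
function names (bin_to_gray, gray_to_bin, gray_selfcheck) are made up for
illustration, and a real benchmark would wrap calls like these in a timed
loop.

```c
#include <assert.h>

/* Binary to Gray: XOR each bit with the next higher bit. */
unsigned long bin_to_gray(unsigned long b)
{
    return b ^ (b >> 1);
}

/* Gray to binary: XOR together the value and all of its right shifts. */
unsigned long gray_to_bin(unsigned long g)
{
    unsigned long b = g;

    while (g >>= 1)
        b ^= g;
    return b;
}

/* Hypothetical self-check: every value below n must survive a round trip. */
void gray_selfcheck(unsigned long n)
{
    unsigned long i;

    for (i = 0; i < n; i++)
        assert(gray_to_bin(bin_to_gray(i)) == i);
}
```

The forward conversion is a single shift and XOR, while the reverse loops
once per significant bit, so the two directions exercise somewhat different
things (straight-line integer ops versus a short data-dependent loop).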

What this suite tells me is the profile of capability for the machine.
I can find no single number which is a meaningful index of performance.
If pressed, I use the elapsed real time to run the entire suite, relative
to a VAX 11/780. This is as meaningful as any other one number.
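
The arithmetic behind that one number is trivial. A sketch, assuming you
have the elapsed time of the whole suite on the baseline machine and on the
machine under test, both in seconds (the function name relative_speed is
hypothetical, not something from the suite):

```c
#include <assert.h>

/*
 * Ratio of baseline elapsed time to test-machine elapsed time.
 * A result of 1.0 means "same as the baseline"; larger is faster.
 * Both times must be positive and measured over the same set of tests.
 */
double relative_speed(double baseline_secs, double machine_secs)
{
    assert(baseline_secs > 0.0 && machine_secs > 0.0);
    return baseline_secs / machine_secs;
}
```

For example, with invented times of 1200 seconds on the baseline and 300
seconds on the machine under test, relative_speed(1200.0, 300.0) gives 4.0.
Note how much this hides: a machine could be 10 times faster on floating
point and slower on disk and still come out at 4.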

I suggest that benchmarks using the "standard" compiler and operating
system are only valid if you are testing the machine itself rather than
the typical time it takes to do things using the best tools on a system.
Even then, the quality of the PCC port varies from machine to machine.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me