Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!ames!oliveb!pyramid!prls!mips!mash
From: mash@mips.UUCP
Newsgroups: comp.arch
Subject: Re: Benchmarking
Message-ID: <426@winchester.UUCP>
Date: Tue, 26-May-87 04:28:36 EDT
Article-I.D.: winchest.426
Posted: Tue May 26 04:28:36 1987
Date-Received: Wed, 27-May-87 03:06:01 EDT
References: <415@winchester.UUCP> <642@percival.UUCP>
Reply-To: mash@winchester.UUCP (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 46

In article <642@percival.UUCP> nerd@percival.UUCP (Michael Galassi) writes:
>In article <415@winchester.UUCP> mash@winchester.UUCP (John Mashey) writes:

>>That doesn't mean they're bad tests, merely that they're extremely hard
>>to do in a controlled way.  In particular, you often see radically different
>>results according to buffer cache sizes, for example.

>Benchmarks can be divided into two major categories:
>Those which exercise the processor (CPU FPU MMU etc...) and those which
>exercise the WHOLE computer (i.e. i/o system too).  For the person who
>is evaluating a CPU family for a new design I can see where the first
>class of benchmarks comes in VERY handy, but the rest of us (those who
>want to buy a computer, install UNIX, and generate accounts) the MIPS,
>FLOPS, *stones, etc that the cpu will do are rarely of much interest.
>I care much more about how the system will handle with a dozen users
>all doing real tasks (vi, cc, f77, rn, rogue, or whatever) than I do...
>I guess I don't care much about the "a lot of attributes" individually,
>but rather how they all work together.  Give me anything that overall
>preforms well (so long as there is no intel cpu in it) and I'll be
>pleased as pie.

1) There ARE people who mostly care about computational benchmarks;
some of the CAD folks are perfect examples, as are those who run troff, etc.
But that's not the point.

2) I think most people in this newsgroup understand that system benchmarks
are important.  I'll try one more time: THEY'RE JUST HARD TO DO. That
doesn't stop people from doing them, which makes especially good sense if they
have some job streams that really represent their loads.  We do these sorts
of benchmarks all the time; I've been doing UNIX system-type benchmarks of
one ilk or another for a lot of years.  The trouble is, it's going to be hard
enough to agree on some compute-bound benchmarks, without the hassle of trying
to normalize all the rest of the stuff.  For example, do you normalize
on system cost?  Do you normalize memory sizes?  Do you normalize on
disk number and type?  All that we're saying is that system benchmarks
are painfully hard to get representative; there are many pitfalls
and benchmarking weirdnesses to look out for; "overall preforms well"
is a REAL hard metric, for example.
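
To make the normalization questions above concrete, here is a minimal
sketch in Python.  Everything in it is hypothetical: the three systems,
their throughput, cost, and memory figures, and the rank() helper are
invented purely for illustration and are not measurements of any real
machine.  It only shows that the same raw numbers can give three different
orderings depending on what you divide by.

```python
# All systems and figures below are made up for illustration only.
systems = {
    "A": {"throughput": 120.0, "cost": 60000.0, "memory_mb": 16.0},
    "B": {"throughput": 100.0, "cost": 40000.0, "memory_mb": 4.0},
    "C": {"throughput": 90.0, "cost": 30000.0, "memory_mb": 8.0},
}


def rank(systems, normalize_by=None):
    """Return system names sorted best-first.

    The score is raw throughput, or throughput divided by the named
    attribute (e.g. "cost" or "memory_mb") when normalize_by is given.
    """
    def score(name):
        s = systems[name]
        if normalize_by is None:
            return s["throughput"]
        return s["throughput"] / s[normalize_by]

    return sorted(systems, key=score, reverse=True)


raw_ranking = rank(systems)                # no normalization
per_dollar = rank(systems, "cost")         # normalized on system cost
per_megabyte = rank(systems, "memory_mb")  # normalized on memory size

print("raw:         ", raw_ranking)
print("per dollar:  ", per_dollar)
print("per megabyte:", per_megabyte)
```

With these made-up figures each choice of normalization puts a different
system first, and that is before disk number and type, or the workload
itself, enter the picture.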

Note: I don't yet see a strong sense of agreement on a set of
CPU benchmarks that we believe.  From past experience, getting a
set of system benchmarks that people agree on will be much harder.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  	408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086