Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!lll-winken!ames!bionet!agate!eos!eugene
From: eugene@eos.arc.nasa.gov (Eugene Miya)
Newsgroups: comp.benchmarks
Subject: Re: benchmark evaluations
Message-ID: <7694@eos.arc.nasa.gov>
Date: 14 Dec 90 08:14:43 GMT
References: <12220@hubcap.clemson.edu> <1990Dec12.135910.27667@cs.utk.edu>
Reply-To: eugene@eos.UUCP (Eugene Miya)
Organization: NASA Ames Research Center, Calif.
Lines: 33

In article <1990Dec12.135910.27667@cs.utk.edu> Dave Sill <de5@ornl.gov> writes:
>In article <12220@hubcap.clemson.edu>, mark@hubcap.clemson.edu
>(Mark Smotherman) writes:
>>1) Representative
>Only important if the results are going to be used to predict the
>performance of the system on other code.

Wrong!  Representativeness is needed for any descriptive or diagnostic
system.  Prediction is icing on the cake.

>>2) Reproducible
>Not necessary in all cases, e.g., informal testing or repeated tests
>of the same configuration.

Reproducibility is a hallmark of all good sciences.
See The Journal of Irreproducible Results (maybe all benchmarks
deserve to be there?).

>full, rigorous suites such as SPEC.

If one benchmark is not adequate, and two aren't enough,
when is enough enough?  42?  700?  I don't think the answer lies solely
in fixed benchmarks.

>I'd try to relate various sets of criteria with the different tasks
>benchmarks are used for.  There's no "one size fits all" set of
>criteria.

I will agree with this.

--e.n. miya, NASA Ames Research Center, eugene@eos.arc.nasa.gov
  {uunet,mailrus,most gateways}!ames!eugene
  AMERICA: CHANGE IT OR LOSE IT.