Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!mark
From: mark@mips.com (Mark G. Johnson)
Newsgroups: comp.benchmarks
Subject: Which benchmarks are useless?
Message-ID: <2502@spim.mips.COM>
Date: 20 Apr 91 14:26:20 GMT
References: <1991Apr20.083301.28886@ux1.cso.uiuc.edu>
Sender: news@mips.COM
Lines: 41
Nntp-Posting-Host: hal.mips.com

In article <1991Apr20.083301.28886@ux1.cso.uiuc.edu> andreess@mrlaxa.mrl.uiuc.edu (Marc Andreessen) writes:
>I've been reading this newsgroup since its formation. It spends
>about 95% of its time spewing out a plethora of meaningless numbers
>on meaningless and trivial little pseudo-hacks which don't deserve
>to be called benchmarks by any stretch of the imagination.
>
>These numbers are useless.
>

Useless? Meaningless?

I would suggest that the words "useless" and/or "meaningless" be
reserved for benchmarks that produce results that are uncorrelated
(absolute value of correlation coefficient < 0.2) with "correct
benchmark results". Whatever "correct benchmark results" are.

Since Andreessen is at the University of Illinois, let's pretend,
temporarily, that the Illinois Perfect Club is the definition of a
correct benchmark. Then a useless benchmark is one which, after being
run on a large subset of the same machines as have run the Illinois
Perfect Club, produces a correlation coefficient r in the range
(-0.2 < r < 0.2) -- that is, the candidate benchmark's results are
uncorrelated with the Illinois Perfect Club results.

We could go further and define a "misleading" benchmark as one which
produces a correlation coefficient that is large and negative, i.e.
ranks machines in the wrong order (compared to the Illinois Perfect
Club, our temporary definition of a "correct benchmark").

Unfortunately, the "Dates Per Second" benchmark is attempting to
investigate OS behavior, something the Illinois Perfect Club doesn't
measure directly.
So before declaring the DatesPerSecond benchmark to be "useless", we
need to define what the correct results are (measuring OS behavior),
and then we can correlate the DatesPerSecond results to the correct
results. Lacking such reference data, it is premature and perhaps a
bit unscientific to assert the DatesPerSecond results are "useless".
--
 -- Mark Johnson
    MIPS Computer Systems, 930 E. Arques M/S 2-02, Sunnyvale, CA 94088-3650
    (408) 524-8308    mark@mips.com  {or ...!decwrl!mips!mark}
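
The correlation test proposed above could be sketched roughly as
follows, in Python. Only the 0.2 band comes from the post. The -0.5
cutoff for "large and negative", the function names, and the label
"neither" are assumptions made for illustration, and both score lists
are assumed to be oriented the same way (larger means faster) and to
cover the same machines in the same order.

```python
import math

USELESS_BAND = 0.2       # from the post: |r| < 0.2 counts as uncorrelated
MISLEADING_BELOW = -0.5  # assumed cutoff; the post only says "large and negative"


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists covering at least 2 machines")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary across machines")
    return cov / math.sqrt(var_x * var_y)


def classify(candidate, reference):
    """Label a candidate benchmark against a reference ("correct") benchmark."""
    r = pearson_r(candidate, reference)
    if abs(r) < USELESS_BAND:
        return "useless"      # uncorrelated with the reference
    if r <= MISLEADING_BELOW:
        return "misleading"   # ranks machines in roughly the wrong order
    return "neither"          # not condemned by this test
```

Calling classify(candidate_scores, reference_scores) with one score
per machine returns one of the three labels. The test is only as good
as the reference: without reference data for OS behavior there is
nothing to pass as the second argument, which is the point made above.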