Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!bionet!agate!eos!eugene
From: eugene@eos.UUCP (Eugene Miya)
Newsgroups: comp.arch
Subject: Re: benchmarking
Message-ID: <6336@eos.UUCP>
Date: 28 Feb 90 07:43:08 GMT
References: <7393@pdn.paradyne.com> <3300102@m.cs.uiuc.edu> <36438@mips.mips.COM> <132232@sun.Eng.Sun.COM>
Reply-To: eugene@eos.UUCP (Eugene Miya)
Organization: NASA Ames Research Center, Calif.
Lines: 65

In article <132232@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>>In article <3300102@m.cs.uiuc.edu> gillies@m.cs.uiuc.edu writes:
>> [doesn't like SPEC]
>
>In article <36438@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>>I'm sad to hear that what we've done so far is "no better than Dhrystone",
>>because if that's true, a whole bunch of us have wasted, in toto, at
>>least several million $ to try to do something better....
>
>I, for one, think SPEC is great.

Oh well.  Too bad.

>On the other hand, SPEC is not the end all to beat all.  No benchmark
>is.  If I could design the ideal benchmark, I'd design something that
>had a bunch of knobs that I could turn, like an I/O knob, a CPU knob, a
>memory knob, etc.  I don't have this, so I run several different
>benchmarks that measure these sorts of things.  SPEC is one, Musbus is
>another, and we have several internal/proprietary benchmarks as well.
>Some people don't like you to quote one figure from one benchmark - I
>like to see all the figures from all the benchmarks.  The more data you
>have the easier it is to weed out the spikes.

Sorry, John, I tend to suspect SPEC spent a lot of money.
Larry is not talking about a single program.  This is something
I am working on in parts, when I get tiny bits of time.  And like most
research, 90% of it is failure.

I do not believe the future lies in simply having more numbers.
More numbers can just be more confusing.  You want a number?  Try 42.
Douglas Adams published that.

The fundamental idea which separates people is whether or not you
believe the whole of a benchmark equals or exceeds the sum of its parts.
If you believe in "magic", i.e. known optimizations, features, etc. that
make wholes greater than their parts, then you aren't being scientific
about the problem.  You won't get anywhere, and you can posit little
green men who only come on Tuesdays to explain why your code runs fast.
I am not saying timings of parts should sum to a whole code, but as you
work on higher and higher conceptual ideas of programs, you can factor
these optimizations, etc. into performance.  Users simply concerned with
pure speed will inevitably be disappointed.

I can point to analogies of performance in other areas: the idea of
placing a VAX under a bell jar, gold-plating a code, etc.  That's all
covered in an article entitled "Foundations of Metrology" in an NBS
journal, which I read after visiting the NBS.  There are ways of doing
this, but just like the platinum bar, there are limits of usefulness:
hence we use other measuring tools, refine atomic clocks, etc.
Until we are willing to do that with computers, benchmarking won't get
far.  I don't get any warm fuzzy feeling from the Nelson, the Loops,
Dongarra, etc.  Sure, there's a bit of truth in them, but you have to be
willing to consider surrogates.

We want to run (with benchmarks), but we have to crawl before walking
and playing.  We are going to need a progression of research.
But most of you don't have the time or inclination to listen, so I
will go back to my hacking.

Another gross generalization from

--eugene miya, NASA Ames Research Center, eugene@aurora.arc.nasa.gov
  resident cynic at the Rock of Ages Home for Retired Hackers:
  "You trust the `reply' command with all those different mailers out there?"
  "If my mail does not reach you, please accept my apology."
  {ncar,decwrl,hplabs,uunet}!ames!eugene
  Do you expect anything BUT generalizations on the net?
  [If it ain't source, it ain't software -- D. Tweten]