Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!lll-winken!uunet!convex!rosenkra
From: rosenkra@convex.com (William Rosencranz)
Newsgroups: comp.benchmarks
Subject: Re: benchmarks (SPECmarks)
Summary: why all the fuss?
Keywords: validity meaning
Message-ID: <108988@convex.convex.com>
Date: 20 Nov 90 00:43:04 GMT
References: <7581@eos.arc.nasa.gov> <1146@dg.dg.com> <7589@eos.arc.nasa.gov> <1148@dg.dg.com>
Sender: news@convex.com
Organization: Convex Computer Corporation; Richardson, TX
Lines: 99

i dunno, maybe i am just daft, so ignore this if you beg to differ. it is not meant to offend, so if you read something into it, pls reread. it is also my opinion, not that of my employer...

i have been reading this newsgroup for a week or so, and SPECmark is the current hot topic. i am a bit confused over some of the issues raised, so maybe i'll raise some of my own.

first off: what are SPEC ratings (or any standard bm ratings for that matter) meant to do? answer this question in your mind first before proceeding...

i really see no point whatsoever in relating an execution time on one machine to that of another "standard" machine, no matter how standard (except possibly the old "that's the way we've ALWAYS done it before", e.g. "MIPS"), just to come up with some single "standard" unit of performance.

if I were buying (instead of selling :-), i'd want to see wallclock and cpu times, because i, as a human being, can relate to time far more easily than to "SPECs" or whatever. if something runs in 10 seconds, compared to 100 seconds, i know i can sit and wait, call it "interactive". if something runs in 10 min vs 1 hour, i know i can go out to lunch in the latter case. a SPEC of 1.345 vs a SPEC of 4.345 means nothing until i translate it to time anyway. time is easier to "heft", as it were.

further, i'd want to see how the "standard" bm results scale with problem size, especially on cache-based memory systems.
because a buy decision based on a single number could come back to haunt me. i'd also want to know what sort of performance enhancements i could expect if i wanted to put 1 hour, 1 day, and 1 week's effort into the optimization of any particular code, if possible. i'd also want to compare a vendor's peak performance with how well it did on standard bm's or on my own. finally, i'd want to see what sort of support i can expect from the vendor. granted, pre-sales and post-sales activities can vary greatly, but i think i can shake out a vendor during the sales cycle, as most savvy buyers can.

why the need for complication, other than perhaps marketing fog? and believe me, if i see 2 or 3 systems with uni-number ratings within say 5% of each other, i sure as heck would not say "these machines are identical, so let's buy the cheaper one because it has better price/SPECperformance". i'd want to look at the raw data anyway, and probably run my sort of workload on them to really get an idea of what i can expect. similarly, if i see two machines that differ by a lot in some particular individual tests, i'd want to know why.

in fact, unless i expect to buy a machine to do just one job (or one job at a time), i would more than likely ignore these uni-job ratings altogether, since, from my experience, in "real life", multi-job thruput is where productivity gains are made, and is where strengths and weaknesses in architectures (e.g. cache vs widely interleaved memory) are really determined anyway (in many, if not most cases). probably without exception, the SPEC'd machines are general purpose systems, especially workstations, which would get lots of different tasks from text processing to dbms to finite element analysis to ...

the basic problem i see with these uni-number ratings is that people can make up their minds, even subconsciously, based on a first impression. this is human nature. you always have that in the back of your mind.
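
a minimal sketch of the single-number problem, assuming each rating is a reference machine's time divided by the measured time, and the headline figure is the geometric mean of those ratios (which is how SPEC publishes it). the benchmark names, the two machines, and every timing below are made up for illustration; none of it is measured data.

```python
from math import prod

# hypothetical reference-machine times in seconds (not real SPEC data)
REFERENCE = {"compile": 1000.0, "fp_kernel": 2000.0}

# hypothetical measured times in seconds on two made-up machines
MACHINE_A = {"compile": 500.0, "fp_kernel": 1000.0}
MACHINE_B = {"compile": 250.0, "fp_kernel": 2000.0}


def ratios(reference, measured):
    """per-benchmark rating: reference time / measured time (bigger is faster)."""
    return {name: reference[name] / measured[name] for name in reference}


def single_number(reference, measured):
    """geometric mean of the per-benchmark ratios."""
    values = list(ratios(reference, measured).values())
    return prod(values) ** (1.0 / len(values))


if __name__ == "__main__":
    for label, machine in (("A", MACHINE_A), ("B", MACHINE_B)):
        print(label, "rating:", single_number(REFERENCE, machine),
              "times (s):", machine)
```

both made-up machines come out with the same rating of 2.0, yet fp_kernel takes 1000 seconds on A and 2000 seconds on B. that difference only shows up in the raw times.
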
and it is easy to just say "2 > 1.5" rather than "based on some real workload, and on problem size, and on vendor support, and on application availability, and on whatever, 2 is not necessarily > 1.5". distilling machine performance down to one number tends to make it easy to abuse it, to misrepresent it.

if in fact these sorts of performance quotients are (good faith?) attempts to enlighten, then why not enlighten thru education rather than simplification? surely we can give more credit to the intellect of people making buy decisions than that? why not a "SPECparagraph" that sheds more light? consider this my entry in the standard bm sweepstakes :-).

please don't argue the merits of standards. i am well aware of the risks and benefits therein. i also know that shopping for supercomputers is different than shopping for workstations, though in my mind buying 100 w/s at $20k a pop is still spending $2M and it might be better to buy 100 w/s at $10k and a central system at $1M with my $2M. the SPEC numbers in no way help me here, i think.

having spent the last 15 years dealing with supercomputers, and only 5 or 6 with workstations and pc's, i am somewhat biased, i suppose, though i like to at least think i have an open mind about these sorts of issues.

personally, i think i'll wait for the SPECthroughput bm...

-bill rosenkranz
rosenkra@convex.com

--
Bill Rosenkranz       |UUCP: {uunet,texsun}!convex!c1yankee!rosenkra
Convex Computer Corp. |ARPA: rosenkra%c1yankee@convex.com