Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!mcnc!ece-csc!ncrcae!ncr-sd!bigbang!celerity!ps
From: ps@celerity.UUCP (Pat Shanahan)
Newsgroups: comp.arch
Subject: Re: Benchmarking
Message-ID: <203@celerity.UUCP>
Date: Thu, 28-May-87 17:54:04 EDT
Article-I.D.: celerity.203
Posted: Thu May 28 17:54:04 1987
Date-Received: Sat, 30-May-87 09:32:24 EDT
References: <415@winchester.UUCP> <642@percival.UUCP> <426@winchester.UUCP> <2100@husc6.UUCP>
Reply-To: ps@celerity.UUCP (Pat Shanahan)
Organization: Celerity Computing, San Diego, Ca.
Lines: 57

In article <2100@husc6.UUCP> reiter@endor.UUCP (Ehud Reiter) writes:
>...
>The point is, there is a great demand out there for simple, single figure
>performance numbers which are in the public domain. No matter how much we
>complain that single figures are meaningless, people out there in the real
>world are going to continue using them. There's a reason why MIPS and
>Dhrystones are so often quoted.

This is very unfortunate, if true. People who believe simple, single figure
performance numbers are doomed to be surprised by reality.

>
>And, we can do better than Dhrystone! We all know what the problems with
>Dhrystone are - can't be globally optimized, too much string handling,
>too small, etc. We can certainly write a benchmark which, although still
>"bad", will be much better than Dhrystone.

I agree. I don't know of any real C program that does as much structure
assignment as the C Dhrystone. I think that C performance is important
enough to justify a benchmark that reflects how the language is actually
used.

>
>I think we can even get away with replacing single-number benchmarks by
>two number benchmarks, which would give a high and low performance figure
>instead of just a single performance figure (that is, the benchmark would
>consist of lots of programs. The performance numbers would be normalized
>against some standard (good old 4.2BSD VAX-11/780?), and the summary
>statistics would be the highest and lowest of the normalized numbers).

I think a better approach would be the one taken in the Livermore loops
benchmark. The report includes the performance for the individual loops, as
well as summary information such as the harmonic mean.

I am not sure if high and low would really help much, except in convincing
people that single numbers are meaningless. The extreme outliers can be due
to architectural choices that are good for most programs but bad for certain
exceptional programs. For example, pipelining may be good for real programs,
but bad for an artificial test of jump performance.

If you are going to report high and low it is very important to make all the
benchmark programs reasonably mixed. If you are going to report individual
results this is less critical.

>
>In summary, we can't write a perfect benchmark, but we can write a better
>benchmark.
>
>					Ehud Reiter
>					reiter@harvard	(ARPA,BITNET,UUCP)
>					reiter@harvard.harvard.EDU  (new ARPA)

It should certainly be possible to write a better benchmark of C performance
than the Dhrystone.
-- 
	ps
	(Pat Shanahan)
	uucp : {decvax!ucbvax || ihnp4 || philabs}!sdcsvax!celerity!ps
	arpa : sdcsvax!celerity!ps@nosc
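
To make the summary statistics discussed above concrete, here is a minimal
C sketch of a harmonic mean next to a high/low pair, both computed over
normalized per-program results. The function names harmonic_mean and
low_high are hypothetical, chosen only for illustration; nothing here is
taken from the Livermore loops report or from any real measurement.

```c
#include <stddef.h>

/* Hypothetical helper: harmonic mean of n normalized rates.
 * Returns 0.0 if n is 0 or if any rate is not positive, since the
 * harmonic mean is only meaningful for positive values. */
double harmonic_mean(const double *rates, size_t n)
{
    double sum_inv = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++) {
        if (rates[i] <= 0.0)
            return 0.0;
        sum_inv += 1.0 / rates[i];   /* slow programs dominate this sum */
    }
    return (double)n / sum_inv;
}

/* Hypothetical helper: lowest and highest of n rates.
 * Assumes n >= 1; the caller must guarantee this. */
void low_high(const double *rates, size_t n, double *low, double *high)
{
    size_t i;

    *low = *high = rates[0];
    for (i = 1; i < n; i++) {
        if (rates[i] < *low)
            *low = rates[i];
        if (rates[i] > *high)
            *high = rates[i];
    }
}
```

With made-up normalized rates of 1.0, 2.0, 4.0 and 4.0 (reference machine =
1.0), harmonic_mean gives 2.0 while low_high gives 1.0 and 4.0. The high and
low are each set by a single program, which is why one unrepresentative
program can dominate a two-number summary.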