Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!olivea!samsung!usc!zaphod.mps.ohio-state.edu!mips!mash
From: mash@mips.com (John Mashey)
Newsgroups: comp.arch
Subject: Re: Novice question: measuring speed
Message-ID: <1060@spim.mips.COM>
Date: 15 Mar 91 21:43:50 GMT
References: <645@ssdc?> <3516:Mar1319:50:3291@kramden.acf.nyu.edu>
Sender: news@mips.COM
Organization: MIPS Computer Systems, Inc.
Lines: 51
Nntp-Posting-Host: winchester.mips.com

In article <3516:Mar1319:50:3291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In contrast, MFLOPS measure some (supposedly) real amount of work
>getting done. The number of floating-point operations in a typical
>computation is relatively independent of the machine at hand. Of course,
>MFLOPS don't tell you whether floating-point divisions are ridiculously
>slow, and they don't tell you how non-floating-point computations will
>run, but they're at least a bit more solid than MIPS.

No they're not .... Read Hennessy & Patterson, pages 43-44.
The only reason MFLOPS might sometimes mean something is that, if
they're (fully-qualified, i.e., FORTRAN, 64-bit) LINPACK MFLOPS, then
you are actually talking about performance as measured on a specific
benchmark, which is rather different from talking about MFLOPs in

As has been discussed numerous times in the past:
	- vendor-published mips-ratings are essentially meaningless.
	- no single number captures the performance differences among
	  machines.
	- Dhrystone-vax-mips almost always over-predict the performance
	  of modern machines relative to a VAX-11/780, compared to
	  their performance on realistic programs.

So far, the single mips-related measure that is available for a wide
number of machines, and has some consistency and predictive value,
in my opinion, is the SPEC-integer subset, because:
	- the 4 programs are enough larger than Dhrystone and such
	  to avoid silly cache effects. (They're still not quite big
	  enough, perhaps).
	- they're real programs, and hard to compiler-gimmick.
	- they're fairly consistent, i.e., the VAX-relative variance
	  is fairly low.
(The above are reasonably verifiable. In addition, they correlate
reasonably well with some larger, more stressful, but proprietary
benchmarks that lots of us use inside computer companies.)

The SPEC FP subset is also useful, although the benchmark-to-benchmark
variance is much higher, and hence you need to do more of your own
benchmarking to figure out which of yours calibrate with them.
(This is the nature of FP programs, in general.)

In general, Chapter 2 of Hennessy & Patterson is good to read.
-- 
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086
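
To make "VAX-relative variance" concrete, here is a minimal sketch of
how per-program ratios against a reference machine might be computed
and summarized. The program names and run times are invented for
illustration (they are not SPEC results), and the helper names are
hypothetical. The summary uses a geometric mean, which is how SPEC
combines its ratios; the max/min spread is just one simple way to see
how much the programs disagree with one another.

```python
import math

# Hypothetical run times in seconds, NOT measured data.
# reference_times stands in for the VAX-11/780, machine_times for
# the machine being evaluated.
reference_times = {"prog_a": 1000.0, "prog_b": 2000.0,
                   "prog_c": 500.0, "prog_d": 4000.0}
machine_times = {"prog_a": 100.0, "prog_b": 160.0,
                 "prog_c": 62.5, "prog_d": 400.0}


def relative_ratios(ref, test):
    """Per-program speedup over the reference machine."""
    return {name: ref[name] / test[name] for name in ref}


def geometric_mean(values):
    """Geometric mean, computed via logs to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def spread(values):
    """Ratio of the best to the worst per-program result."""
    values = list(values)
    return max(values) / min(values)


ratios = relative_ratios(reference_times, machine_times)
summary = geometric_mean(ratios.values())
variation = spread(ratios.values())

for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}x reference")
print(f"geometric mean: {summary:.2f}")
print(f"max/min spread: {variation:.2f}")
```

A small spread means the single summary number predicts any one
program fairly well; a large spread, as is typical of FP programs,
means you need to find which of the benchmarks track your own
workload.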