Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.benchmarks
Subject: Re: SPEC vs. Dhrystone
Message-ID: <44485@mips.mips.COM>
Date: 3 Jan 91 22:52:11 GMT
References: <44342@mips.mips.COM> <15379@ogicse.ogi.edu> <44353@mips.mips.COM> <1685@marlin.NOSC.MIL> <15546@ogicse.ogi.edu> <44465@mips.mips.COM> <44466@mips.mips.COM> <15577@ogicse.ogi.edu>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Distribution: comp.benchmarks
Organization: MIPS Computer Systems, Inc.
Lines: 23

In article <15577@ogicse.ogi.edu> borasky@ogicse.ogi.edu (M. Edward Borasky) writes:
>In article <44466@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>>To my knowledge, no one has so far found anything quite like these things
>>in the SPEC benchmarks. The closest case is matrix300, where most of the
>>time is spent inside a small loop. This benchmark was included,
>>rightly or wrongly, as it at least thrashed the data cache around
>Correct me if I'm wrong, but the matrix300 is Jack Dongarra's
>300-equation LINPACK benchmark, isn't it? If so, the inner loop is matrix
>multiply, unrolled 1, 2, 4, 8 and 16 ways. At least one compiler (FPS
>Model 500) recognizes the 1-way unrolled case as a matrix-vector
>multiply and calls in a hand-optimized matrix multiply routine. In
>this case, cache is not relevant because the hand-optimized routine
>knows about the memory hierarchy.

Well, like I said, that's why matrix300 was on the edge.
However, I don't even mind this too much, if the compiler recognizes
the case generically (as opposed to special-case recognition), given
that certain kinds of code do really spend a lot of time in such loops.
-- 
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086