Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: RISC vs CISC simple load benchmark; amazing ! [Not really]
Message-ID: <39397@mips.mips.COM>
Date: 14 Jun 90 23:01:04 GMT
References: <8019@mirsa.inria.fr> <39319@mips.mips.COM> <675@sibyl.eleceng.ua.OZ>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Organization: Your Organization Goes Here
Lines: 44

In article <675@sibyl.eleceng.ua.OZ> ian@sibyl.OZ (Ian Dall) writes:
>I can't help thinking that average speed (over an instruction mix) is,
>(like most statistics) an inadequate measure. The trouble is, if you have
>a multiply intensive application, it is a pain if it runs dramatically
>slower than you would expect for a machine of that class. In a sense, one
>would like to know the worst case "speed" as well as the "average" speed
>of a machine (lots of hand waving here).

Really, what you want is enough data points to know not only some
measure of centrality but also some measure of variation, and even those
are not enough. You always really want enough benchmarks to see the
patterns of difference: this is why SPEC has always insisted on making
ALL of the benchmark numbers available, because quite different patterns
can be found.

The worst case performance is not all that interesting: for two cached
machines with different cache organization, you can usually "prove"
different ratios of relative performance by careful selection of the
most relevant cache-busting code. For instance, the "compress" program
is often a good example of something that will drag most machines down
to DRAM speed. Another good one, on a direct-mapped, virtual cache
machine, is to copy, 1 byte at a time, between two areas that collide
in the cache.
Such a copy causes every single byte read to:
	write back the (dirty) cache line to memory
	read the new cache line
and then each byte write to:
	flush the (clean) cache line
	read the new cache line
	write the byte into that cache line
(i.e., if you want to artificially show off a SPARC 490 at its worst,
you can probably prove it's slower than a 68020 with such a benchmark).
Of course, any given machine can be made to look bad in this way.

What is useful is to have some cases that show performance in different
usage patterns: the mean & std deviation, or mean and min alone, just
don't tell you much about what's happening.
-- 
-john mashey	DISCLAIMER:
UUCP: 	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086