Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: RISC vs CISC simple load benchmark; amazing ! [Not really]
Message-ID: <39397@mips.mips.COM>
Date: 14 Jun 90 23:01:04 GMT
References: <8019@mirsa.inria.fr> <39319@mips.mips.COM> <675@sibyl.eleceng.ua.OZ>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Organization: Your Organization Goes Here
Lines: 44

In article <675@sibyl.eleceng.ua.OZ> ian@sibyl.OZ (Ian Dall) writes:

>I can't help thinking that average speed (over an instruction mix) is,
>(like most statistics) an inadequate measure. The trouble is, if you have
>a multiply intensive application, it is a pain if it runs dramatically
>slower than you would expect for a machine of that class. In a sense, one
>would like to know the worst case "speed" as well as the "average" speed
>of a machine (lots of hand waving here).

Really, what you want is enough data points to know not only some
measure of centrality but also some measure of variation, and even
those are not enough.  You really want enough benchmarks to see the
patterns of difference: this is why SPEC has always insisted on
making ALL of the benchmark numbers available, because quite
different patterns can be found.

The worst case performance is not all that interesting: for two cached
machines with different cache organization, you can usually "prove"
different ratios of relative performance by careful selection of the
most relevant cache-busting code.

For instance, the "compress" program is often a good example of something
that will drag most machines down to DRAM speed.
Another good one, on a direct-mapped, virtual-cache machine, is
to copy, 1 byte at a time, between two areas that collide in the cache.
This causes every single byte read to:
	write back the (dirty) cache line to memory
	read the new cache line
and then each byte write:
	flushes the (clean) cache line
	reads the new cache line
	writes the byte into that cache line
(i.e., if you want to artificially show off a SPARC 490 at its worst,
you can probably prove it's slower than a 68020 with such a benchmark).
Of course, any given machine can be made to look bad in this way.
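
To make the read/write sequence above concrete, here is a minimal sketch
in Python of a direct-mapped, write-back, write-allocate cache that does
nothing but count misses and writebacks.  The 64 KB cache, 32-byte line,
and 4 KB copy length are assumptions picked for illustration, not the
parameters of any real machine, and the class and function names are
hypothetical.  Addresses are taken to be whatever the cache is indexed by
(virtual addresses, on a virtual-cache machine).

```python
class DirectMappedCache:
    """Toy direct-mapped, write-back, write-allocate cache (counts only)."""

    def __init__(self, size_bytes, line_bytes):
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines    # block number held in each slot
        self.dirty = [False] * self.num_lines
        self.misses = 0
        self.writebacks = 0

    def access(self, addr, is_write):
        block = addr // self.line_bytes
        index = block % self.num_lines         # direct-mapped: one possible slot
        if self.tags[index] != block:
            self.misses += 1
            if self.tags[index] is not None and self.dirty[index]:
                self.writebacks += 1           # evicted line was dirty
            self.tags[index] = block           # fill the new line
            self.dirty[index] = False
        if is_write:
            self.dirty[index] = True


def byte_copy(cache, src, dst, nbytes):
    """Copy one byte at a time; return (misses, writebacks) during the copy."""
    for i in range(nbytes):
        cache.access(src + i, is_write=False)
        cache.access(dst + i, is_write=True)
    return cache.misses, cache.writebacks


# Assumed parameters, for illustration only.
CACHE, LINE, N = 64 * 1024, 32, 4096

# Source and destination exactly one cache size apart: every line collides.
colliding = byte_copy(DirectMappedCache(CACHE, LINE), 0, CACHE, N)
# Destination offset by half the cache: no collisions.
friendly = byte_copy(DirectMappedCache(CACHE, LINE), 0, CACHE // 2, N)

print("colliding (misses, writebacks):", colliding)
print("friendly  (misses, writebacks):", friendly)
```

With these assumed parameters the colliding copy takes two misses for
every byte copied, while the non-colliding copy takes two misses per
32-byte line.  The code is the same in both cases; only the placement of
the two areas differs.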

What is useful is to have some cases that show performance in different
usage patterns: the mean & std deviation, or mean and min alone, just
don't tell you much about what's happening.
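
As a small illustration of that point, the sketch below uses made-up
speed ratios for two hypothetical machines (these are not measurements
of anything, and the benchmark labels are placeholders).  The two
machines have the same mean, standard deviation, and minimum, yet they
are slow on different workloads, which only the per-benchmark numbers
reveal.

```python
from statistics import mean, pstdev

# Hypothetical speed ratios relative to some reference machine.
# All values are invented for illustration.
benchmarks = ["integer", "floating-point", "multiply-heavy", "cache-busting"]
machine_a = [6.0, 4.0, 2.0, 4.0]
machine_b = [2.0, 4.0, 6.0, 4.0]


def summary(ratios):
    """Return (mean, population std deviation, minimum) of the ratios."""
    return mean(ratios), pstdev(ratios), min(ratios)


# The summary statistics cannot tell the two machines apart...
print("same summary:", summary(machine_a) == summary(machine_b))

# ...but the full table shows where each one is weak.
for name, a, b in zip(benchmarks, machine_a, machine_b):
    print(f"{name:15s} A={a:.1f}  B={b:.1f}")
```

Someone with a multiply-intensive application would care a great deal
which of these two machines they got, and the summary numbers give no
hint.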
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086