Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rochester!crowl
From: crowl@rochester.ARPA (Lawrence Crowl)
Newsgroups: net.arch
Subject: Re: Benchmarks in August IEEE Micro
Message-ID: <21344@rochester.ARPA>
Date: Mon, 6-Oct-86 12:31:23 EDT
Article-I.D.: rocheste.21344
Posted: Mon Oct  6 12:31:23 1986
Date-Received: Tue, 7-Oct-86 23:27:43 EDT
References: <322@oblio.UUCP> <3600003@hplabsb.UUCP>
Reply-To: crowl@rochtest.UUCP (Lawrence Crowl)
Organization: U of Rochester, CS Dept, Rochester, NY
Lines: 39

In article <3600003@hplabsb.UUCP> wiemann@hplabsb.UUCP (Alan Wiemann) writes:
>Benchmark comparisons give valid results only if the same program is presented
>to each machine.  The compiler is considered part of the "machine" and its
>performance contributes to the overall performance of the machine.  This study
>did not present the same program to each machine.  Instead "[they] had the
>same person modify or write all the tests so [they] could be sure that the same
>algorithms would be used for all the processors" (page 56 of the IEEE article).
>Thus the benchmark results reflect not only the individual processors' ability
>to execute instructions but also the cleverness of this programmer in using
>each microprocessor's instruction set and architecture.  The results reported
>should not be considered true measures of the relative performance of these
>microprocessors.

How else do you compare assembly language performance between two machines with
different architectures?  Often critical sections of code will be coded in
assembler to increase speed.  The capability of the architecture to support
fast hand-coded assembler can have a significant effect on the performance of
the program.  So we need to do assembly language benchmarks.  I submit it is a
valid comparison to code the same algorithm into assembly on each machine.  
However, this coding must be done by an individual with equivalent experience
on each machine, spending the same amount of time programming.  That is, the
programmer is not allowed to bias the results by spending unfair amounts of
time optimizing his favorite processor.  The bottom line is that we must trust
the benchmarks, correlate them with other benchmarks, or do them ourselves.

By the same token, including the compiler is often an invalid comparison
because the compiler can have a significant effect on the resulting
performance.  Suppose I take a student-built, unoptimizing compiler for machine
A and a highly tuned optimizing compiler for machine B.  Now, if the two
machines are anywhere close in performance, machine B will win.  Here again,
the bottom line is that we must trust the benchmarks, correlate them with other
benchmarks, or do them ourselves.

Of course, we could have a competitive benchmark between interested parties.
Any takers?
-- 
  Lawrence Crowl		716-275-5766	University of Rochester
			crowl@rochester.arpa	Computer Science Department
 ...!{allegra,decvax,seismo}!rochester!crowl	Rochester, New York,  14627