Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!think!ames!amdahl!amdcad!nucleus!tim
From: tim@nucleus.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Integer/Multiply/Divide on Sparc
Message-ID: <28594@amdcad.AMD.COM>
Date: 3 Jan 90 16:35:03 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV> <788@stat.fsu.edu> <42737@lll-winken.LLNL.GOV> <KHB.90Jan2121328@chiba.kbierman@sun.com> <5842@ncar.ucar.edu> <34058@mips.mips.COM>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 63

In article <34058@mips.mips.COM> mash@mips.COM (John Mashey) writes:
| In article <5842@ncar.ucar.edu> thor@stout.UCAR.EDU (Rich Neitzel) writes:
| >
| >With all the talk about this subject I do not recall seeing any benchmarking
| >of a sparc or any other system for that matter. The following table lists times
| >generated by the Plum-Hall benchmark routines. (They were posted a while back
| >to comp.misc.sources). There are three things that really stand out to my
| 	Could somebody post the critical parts of this again so we can
| 	look at it?  Although I have high respect for Plum-Hall in general,
| 	I'm always nervous about micro-level benchmarks.  Now, I hate to have
| 	to defend SPARC :-), but I must: realistic integer benchmarks
| 	that I know [like the SPEC ones] simply don't correlate with
| 	the results claimed below, at least not very much.
| 	The RISC machines are noticeably faster on actual integer programs....

The benchmarks over-emphasize integer modulus.  For example, the
benchmark that reportedly tests register-integer variables looks like:

/* benchreg - benchmark for  register  integers 
 * Thomas Plum, Plum Hall Inc, 609-927-3770
 * If machine traps overflow, use an  unsigned  type 
 * Let  T  be the execution time in milliseconds
 * Then  average time per operator  =  T/major  usec
 * (Because the inner loop has exactly 1000 operations)
 */
#define STOR_CL register
#define TYPE int
#include <stdio.h>
main(ac, av)
        int ac;
        char *av[];
        {
        STOR_CL TYPE a, b, c;
        long d, major, atol();
        static TYPE m[10] = {0};

        major = atol(av[1]);
        printf("executing %ld iterations\n", major);
        a = b = (av[1][0] - '0');
        for (d = 1; d <= major; ++d)
                {
                /* inner loop executes 1000 selected operations */
                for (c = 1; c <= 40; ++c)
                        {
                        a = a + b + c;
                        b = a >> 1;
                        a = b % 10;
                        m[a] = a;
                        b = m[a] - b - c;
                        a = b == c;
                        b = a | c;
                        a = !b;
                        b = a + c;
                        a = b > c;
                        }
                }
        printf("a=%d\n", a);
        }

and spends roughly 75% of its time performing the "%" operation.
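
To put a rough number on that figure, here is a back-of-the-envelope
sketch, not a measurement.  The benchmark's own comment counts 1000
operations per 40 passes of the inner loop, i.e. 25 operators per pass,
and exactly one of them is the "%".  Assuming, purely for illustration,
that every other operator costs the same fixed amount of time, the
function below gives the share of each pass spent in "%" for a given
relative cost.  The function name, the macros, and the uniform-cost
model are assumptions made here; real machines overlap and combine
operations, so treat the result as indicative only.

```c
#include <assert.h>

/* Operators per inner-loop pass, from the benchmark's own comment:
 * 1000 operations / 40 passes = 25, exactly one of which is "%". */
#define OPS_PER_PASS     25
#define MOD_OPS_PER_PASS 1

/* Hypothetical model: every non-"%" operator costs other_cost time
 * units and the "%" costs mod_cost units.  Returns the fraction of
 * each pass that is spent in "%". */
double mod_time_fraction(double mod_cost, double other_cost)
{
        double mod_time, other_time;

        assert(mod_cost > 0.0 && other_cost >= 0.0);
        mod_time = MOD_OPS_PER_PASS * mod_cost;
        other_time = (OPS_PER_PASS - MOD_OPS_PER_PASS) * other_cost;
        return mod_time / (mod_time + other_time);
}
```

Under this model, mod_time_fraction(72.0, 1.0) is 72/(72+24) = 0.75, so
the "%" would have to cost about 72 times as much as an average other
operator to account for 75% of the run time.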
	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)