Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!think!ames!amdahl!amdcad!nucleus!tim
From: tim@nucleus.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Integer/Multiply/Divide on Sparc
Message-ID: <28594@amdcad.AMD.COM>
Date: 3 Jan 90 16:35:03 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV> <788@stat.fsu.edu> <42737@lll-winken.LLNL.GOV> <5842@ncar.ucar.edu> <34058@mips.mips.COM>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 63

In article <34058@mips.mips.COM> mash@mips.COM (John Mashey) writes:
| In article <5842@ncar.ucar.edu> thor@stout.UCAR.EDU (Rich Neitzel) writes:
| >
| >With all the talk about this subject I do not recall seeing any benchmarking
| >of a sparc or any other system for that matter. The following table lists times
| >generated by the Plum-Hall benchmark routines. (They were posted a while back
| >to comp.misc.sources). There are three things that really stand out to my
|
| Could somebody post the critical parts of this again so we can
| look at it? Although I have high respect for Plum-Hall in general,
| I'm always nervous about micro-level benchmarks. Now, I hate to have
| to defend SPARC :-), but I must: realistic integer benchmarks
| that I know [like the SPEC ones] simply don't correlate with
| the results claimed below, at least not very much.
| The RISC machines are noticeably faster on actual integer programs....

The benchmarks over-emphasize integer modulus.
For example, the benchmark that reportedly tests register-integer
variables looks like:

/* benchreg - benchmark for register integers
 * Thomas Plum, Plum Hall Inc, 609-927-3770
 * If machine traps overflow, use an unsigned type
 * Let T be the execution time in milliseconds
 * Then average time per operator = T/major usec
 * (Because the inner loop has exactly 1000 operations)
 */
#define STOR_CL register
#define TYPE int
#include <stdio.h>
main(ac, av)
	int ac;
	char *av[];
	{
	STOR_CL TYPE a, b, c;
	long d, major, atol();
	static TYPE m[10] = {0};

	major = atol(av[1]);
	printf("executing %ld iterations\n", major);
	a = b = (av[1][0] - '0');
	for (d = 1; d <= major; ++d)
		{
		/* inner loop executes 1000 selected operations */
		for (c = 1; c <= 40; ++c)
			{
			a = a + b + c;
			b = a >> 1;
			a = b % 10;
			m[a] = a;
			b = m[a] - b - c;
			a = b == c;
			b = a | c;
			a = !b;
			b = a + c;
			a = b > c;
			}
		}
	printf("a=%d\n", a);
	}

and spends roughly 75% of its time performing the "%" operation.

-- 
Tim Olson
Advanced Micro Devices
(tim@amd.com)