Path: utzoo!utgpu!watserv1!watmath!att!linac!pacific.mps.ohio-state.edu!zaphod.mps.ohio-state.edu!think.com!paperboy!meissner
From: meissner@osf.org (Michael Meissner)
Newsgroups: comp.benchmarks
Subject: Re: More issues of benchmarking
Message-ID:
Date: 29 Nov 90 20:52:19 GMT
References: <19040001@orac.HP.COM> <4344@awdprime.UUCP> <9516@darkstar.ucsc.edu>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 51
In-reply-to: zeeff@b-tech.ann-arbor.mi.us's message of 29 Nov 90 14:56:22 GMT

In article zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:

| Path: paperboy!snorkelwacker.mit.edu!apple!usc!samsung!b-tech!zeeff
| From: zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff)
| Newsgroups: comp.benchmarks
| Date: 29 Nov 90 14:56:22 GMT
| References: <19040001@orac.HP.COM> <4344@awdprime.UUCP> <9516@darkstar.ucsc.edu>
| Organization: Branch Technology
| Lines: 11
|
| >Wasting more bandwidth on a trivial benchmark.
| >Running my Sparcstation SLC on
| >	echo 2^5000/2^5000 | /bin/time bc
| >yielded
| >	44.5 real 43.4 user 0.1 sys
|
| That's about (45.9) what it yields on a 3/50.  I suspect that this
| isn't a very good benchmark :-).

On a DECstation 2100, you get:

	13.4 real 12.6 user 0.2 sys

I suspect the reason Sparcstations get such lousy times compared to other equivalent systems is that the hardware does not do integer multiply and divide, which makes a Sun3 competitive in those cases. Checking the 4.3-tahoe sources shows that there is no floating point code in bc, and everything is done via integers.

Most integer multiplies in real world code arise from array references, and are amenable to shift/add combinations if the multiplier is constant, or to strength reduction if not. However, as I noted in comp.lang.misc, some compilers (like GCC) lose sight of the fact that a multiplication is taking place once they have transformed the multiply internally into a function call, so strength reduction does not get invoked.
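
As a rough illustration of the two transformations mentioned above, here is a minimal C sketch. It is not taken from GCC or bc; the function names and the row-major array layout are assumptions made for the example. The first function rewrites a multiply by the constant 10 as shifts and adds. The other two show an array-index multiply inside a loop and a hand strength-reduced version of the same loop.

```c
#include <stddef.h>

/* Constant multiply as shift/add: 10 * x == 8 * x + 2 * x. */
unsigned mul10_shift_add(unsigned x)
{
    return (x << 3) + (x << 1);
}

/* Sum one column of a row-major rows x cols array.
 * The index i * cols + col needs a multiply on every iteration. */
long sum_column_multiply(const int *a, size_t rows, size_t cols, size_t col)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        sum += a[i * cols + col];
    return sum;
}

/* Same loop after strength reduction, done by hand: the multiply is
 * replaced by a running index that is bumped by cols each iteration.
 * Note that idx is one more value kept live across the loop. */
long sum_column_reduced(const int *a, size_t rows, size_t cols, size_t col)
{
    long sum = 0;
    size_t idx = col;
    for (size_t i = 0; i < rows; i++) {
        sum += a[idx];
        idx += cols;
    }
    return sum;
}
```

The reduced loop trades the per-iteration multiply for an add, but it carries the extra variable `idx` through the loop, so whether it wins depends on how expensive the multiply is and how many registers are free.
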
While I'm on the subject of strength reduction: when I worked on GCC for the 88k, I found that strength reduction was rarely a win. This is because the 88k has a reasonably fast integer multiply instruction (4 clocks), and it has functional units that operate in parallel. Furthermore, strength reduction tends to require one additional register, which means that if you are close to running out of registers in the inner loop, something that previously lived in a register now has to live on the stack. Dhrystone 2.1, in fact, showed up this problem (I guess it's useful after all :-).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?