Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!pasteur!ames!vsi1!wyse!mips!cprice
From: cprice@mips.COM (Charlie Price)
Newsgroups: comp.arch
Subject: Re: i860 Dhrystones (was Re: i860 (N10) Floating Point Times)
Summary: dhrystone needs context
Keywords: i860 N10 Floating Point Dhrystones
Message-ID: <15332@winchester.mips.COM>
Date: 15 Mar 89 23:40:50 GMT
References: <654@cimcor.mn.org> <93088@sun.uucp>
Reply-To: cprice@mips.COM (Charlie Price)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 95

In article <93088@sun.uucp> garner@sun.UUCP (Robert Garner) writes:
>Question: What's going on with the i860 Dhrystone/MHz ratio?
>
>The Intel "i860 Processor Performance" brief--Release 1.0, March 89--shows
>82,900 Dhrystones/sec for version 1.1 for a scaled 40-MHz i860.
>With compiler improvements and elimination of "errata on the current
>stepping of the i860 processor", it says they expect to push the value to 90K.
>
>According to the paper, 69K Dhrystones/sec were measured on a Compaq 386/20
>add-in card with a 33-MHz i860 and 8MB of SCRAM (0-wait cycles for hits,
>5-W for read miss, and 2-W for write misses).
>
>As a sanity check to a similar micro-architecture implementation (split
>i&d caches, 1-cycle load/store), the 25-MHz R3000 value is 42,300
>Dhrystones/sec (w/ MIPS -O3, i.e., interprocedural register allocation).
>
>Since the i860 integer/cache micro-architecture is so similar to
>the R3000 integer/cache micro-architecture, and assuming that Intel's
>compiler technology is not significantly better than MIPSCo's,
>shouldn't an i860 value equal a scaled R3000 value?
>
>Scaling the 25-MHz R3000 value up to 40 MHz gives 67,680 Dhrystones/sec.
>So where did Intel get the extra 22% ?

The 25 MHz R3000 can actually reach 51,800 dhrystones per second on
version 1.1. Scaled linearly to 40 MHz, that is 82,880, within 100
of Intel's figure.
The 51.8K number, however, is beyond the spirit of the benchmark.

What do dhrystone numbers mean?
MIPS has maintained for a long time that they are not especially
meaningful numbers.
Our existing results show that you need a lot of context to have
any idea what a number is telling you.

MIPS has just released Performance Brief 3.6 (March 89 -- I assume
that Mashey will post it sometime) and it has an interesting
set of numbers for dhrystone.
There are numbers for two different compiler releases, plus one
number for the newer compilers with assembly-language versions of
strcpy() and strcmp().

dhrystone 1.1 -- M/2000-8 (25 MHz R3000)    (numbers are Kilo-loops)

                          Default opt    -O      -O3     -O4
1.31 compiler              32.4 (K)      39.7    42.3    45.3
2.00 compiler              32.6 (K)      39.7    43.1    46.7
2.00 with new str rtns                           47.4
2.00 with new str rtns                                   51.8
  (my measurement, not in Brief)

dhrystone 2.1 -- M/2000-8 (25 MHz R3000)    (numbers are Kilo-loops)

                          Default opt    -O      -O3     -O4
1.31 compiler              33.0 (K)      36.7    38.8    42.8
2.00 compiler              32.4 (K)      36.7    39.4    43.2

>(1) More aggressive compiler optimizations?
>MIPSCo's -O4 value, which according to MIPSCo's Performance Brief
>"is beyond the spirit of the benchmark", is 45,300 Dhrystones/sec.
>Scaled to 40 MHz, this still falls short at 72,480 Dhrystones/sec.

MIPS believes (and says in the Performance Brief) that -O4 is
beyond the spirit of the benchmark.
We include the -O4 results to show what is possible, and to
illustrate what a dhrystone figure without context can mean,
since not all quoted figures are

For dhrystone 1.1, the "-O4" optimizations give 7.1% over -O3 for
the 1.31 compiler, 8.4% for the 2.00 compiler with string routines
in C, and 9.3% for the 2.00 compiler with strcmp() and strcpy() in
assembler.
If you don't know exactly what the optimizations are,
it is hard to say what a dhrystone number might mean.

>(2) Faster string copy/compare with graphics instructions?

This can be quite important.
For dhrystone 1.1 on an M/2000, assembly-language routines
give a 9.3% increase at -O3 (within the spirit of the benchmark)
and an 11.6% increase at -O4 (too much optimization).
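
The clock scaling and the -O4-over--O3 percentages quoted above can
be re-derived from the dhrystone 1.1 table. What follows is only a
minimal sketch, not anything from the Performance Brief: the names
RESULTS_K, scale_to_clock and percent_gain are made up for
illustration, the scaling assumes performance is exactly
proportional to clock rate, and it reads 47.4 as the -O3 figure and
51.8 as the -O4 figure for the new string routines, as the
percentages in the text imply.

```python
# Dhrystone 1.1 results in thousands of loops per second, taken from
# the table for the 25 MHz R3000 (M/2000-8). Only the -O3 and -O4
# columns are needed here. Assumption: 47.4 is the -O3 figure and
# 51.8 the -O4 figure for the assembly-language string routines.
RESULTS_K = {
    "1.31 compiler": {"-O3": 42.3, "-O4": 45.3},
    "2.00 compiler": {"-O3": 43.1, "-O4": 46.7},
    "2.00 with new str rtns": {"-O3": 47.4, "-O4": 51.8},
}


def scale_to_clock(kloops, from_mhz=25.0, to_mhz=40.0):
    """Scale a result linearly with clock rate.

    This assumes the memory system keeps up with the faster clock,
    which is the same assumption the quoted article makes.
    """
    return kloops * to_mhz / from_mhz


def percent_gain(base, improved):
    """Percentage increase of `improved` over `base`."""
    return (improved - base) / base * 100.0


if __name__ == "__main__":
    # Gain from the -O4 optimizations over -O3, per compiler setup.
    for name, row in RESULTS_K.items():
        gain = percent_gain(row["-O3"], row["-O4"])
        print(f"{name:24s} -O4 over -O3: {gain:4.1f}%")

    # The three 25 MHz figures that the thread scales to 40 MHz.
    for kloops in (42.3, 45.3, 51.8):
        scaled = scale_to_clock(kloops)
        print(f"{kloops} K at 25 MHz -> {scaled:.2f} K at 40 MHz")
```

Run as a script, this gives the 7.1%, 8.4% and 9.3% gains for -O4,
and scaled values of 67.68K, 72.48K and 82.88K for the 42.3K, 45.3K
and 51.8K results.
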
I believe that these library routines are just assembler, not
especially tuned for the dhrystone string lengths.
If you wanted faster dhrystone numbers, you could get them with
string routines that work in general, but work especially well
for the specific string lengths that dhrystone operates on.
Again, if you don't know much about the libraries,
you can't determine what the dhrystone number is telling you.

The range of dhrystone figures for a single MIPS machine
might be interesting because it tells you something about
the compilers, but a single "dhrystones for this machine"
figure just doesn't mean much.
-- 
Charlie Price    cprice@mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave. / Sunnyvale, CA 94086