Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ulowell!masscomp!hanko
From: hanko@masscomp.UUCP (Jim Hanko)
Newsgroups: comp.arch
Subject: Re: i860 Dhrystones
Keywords: i860 N10 Floating Point Dhrystones
Message-ID: <955@masscomp.UUCP>
Date: 16 Mar 89 16:56:11 GMT
References: <654@cimcor.mn.org> <93088@sun.uucp> <701@pcrat.UUCP> <93452@sun.uucp> <15074@winchester.mips.COM> <210@intelca.intel.com> <15226@winchester.mips.COM>
Reply-To: hanko@masscomp.UUCP (Jim Hanko)
Organization: Concurrent Computer Corp. - Westford, Ma
Lines: 47

In article <15226@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>In article <210@intelca.intel.com> clif@intelca.intel.com (Ken Shoemaker) writes:
>...
>>The i860 CPU benchmark report had a TYPO the Dhrystone benchmark used
>>the Greenhill C compiler not FORTRAN.
>>My speculation (note the word speculation) as to why the the Dhrystone
>>numbers are so good is: ...
>>
>>	128-bit loads for string instructions
>
>
>2) OK, I give up. There must be something unbelievably clever going on
>to use 128-bit loads for C-language string operations. ...
>... For a fair test, you MUST
                      ^^^^^^^^^
>use str* that only assume byte alignment of operands, and
>you can't inline the str*. ...
>
>3) Anyway, various people at various companies still can't figure
>out why the number can reasonably be this high, under the
>normal rules, UNLESS there's some really slick trick for
>getting strcpy and strcmp down around 2 cycles/byte.

A couple of years ago I investigated the output of the Green Hills C
compiler on the Dhrystone benchmark (for a different architecture).
I remember being somewhat surprised to see that the compiler had inlined
the strcpy calls. It could do this since most of the calls were of
the form:

	strcpy(x, "a constant string");

I believe that it did not actually copy the bytes from memory but
loaded long immediate values and stored them.

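To make the suspected transformation concrete, here is a minimal sketch
in C. It is not actual Green Hills output. The string "ABCDEFG", the
32-bit word size, and the names pack4, copy_by_call, copy_by_immediates
and check_copies_match are all made up for illustration. The second
routine mimics what the inlined code might amount to: the constant is
split into word-sized values that are known at compile time, and each
word is stored directly, with no byte loop and no subroutine call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the 32-bit value whose in-memory bytes are
 * a, b, c, d in that order, whatever the host byte order is. With
 * constant arguments a compiler can reduce this to a single immediate. */
static uint32_t pack4(char a, char b, char c, char d)
{
    unsigned char bytes[4] = {
        (unsigned char)a, (unsigned char)b,
        (unsigned char)c, (unsigned char)d
    };
    uint32_t word;
    memcpy(&word, bytes, sizeof word);
    return word;
}

/* The source-level form: an ordinary call with a constant string. */
void copy_by_call(char *dst)
{
    strcpy(dst, "ABCDEFG");          /* 7 characters + NUL = 8 bytes */
}

/* Assumed shape of the inlined version: two word "immediates", two stores.
 * memcpy is used for the stores so the sketch makes no alignment
 * assumption about dst. */
void copy_by_immediates(char *dst)
{
    const uint32_t w0 = pack4('A', 'B', 'C', 'D');
    const uint32_t w1 = pack4('E', 'F', 'G', '\0');
    memcpy(dst,     &w0, sizeof w0);
    memcpy(dst + 4, &w1, sizeof w1);
}

/* Check that both routines leave identical bytes in the destination. */
void check_copies_match(void)
{
    char a[8], b[8];
    memset(a, 0x55, sizeof a);       /* different fill patterns, so a  */
    memset(b, 0x2A, sizeof b);       /* match must come from the copy  */
    copy_by_call(a);
    copy_by_immediates(b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Both routines leave the same 8 bytes in the destination, but the second
does no per-byte work at run time, so timing it against a real strcpy
subroutine would not be a like-for-like comparison.
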
Although strcpy is called extensively with string constants in Dhrystone,
this is relatively rare in real programs. Therefore, such a compiler
feature seems to be targeted specifically at Dhrystone.

I can't say that the Intel version of the compiler has this
"optimization" (or, if it does, that Intel knew about it), but this may
explain the high numbers. Can anyone with access to the compiler check
this?

I think it would clearly be unfair to compare Dhrystone numbers where
this trick was used to those where a strcpy subroutine was called.

-
#include
Jim Hanko	{uunet|decvax|harvard|mit-eddie}!masscomp!hanko