Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!mcsun!ukc!dcl-cs!aber-cs!odin!pcg
From: pcg@odin.cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Black magic, IBM RIOS.
Message-ID:
Date: 14 Apr 90 12:39:02 GMT
References: <1990Apr4.140713.8996@specialix.co.uk>
Sender: pcg@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 27
In-reply-to: flee@shire.cs.psu.edu's message of 10 Apr 90 19:33:48 GMT

In article flee@shire.cs.psu.edu (Felix Lee) writes:

   Piercarlo Grandi wrote:
   > register timings for the 3240 (and the MIPS inner loop is 4
   > instructions) imply 16M*4 instructions in 1 second, that is 64 MIPS. I
   > cannot believe that the 25 MHz R3000 is superscalar as well.

   Be careful when you examine the machine code.  On a MIPS RC3240, the
   inner loop is unrolled, so it only gets executed 4M times, not 16M,
   and it contains 6 instructions, not 4.  This is 4M*6 instructions in
   1.0s, which is 24 native MIPS, a reasonable number.

Precisely my point. Actually the difference was that I was using the
1.31 release of the MIPS compiler, which does not do loop unrolling,
while Pettit was probably using the 2.1 version, which does.

As I have had to repeat many times, the purpose of my benchmark is to
see how the CPU and memory subsystem perform, not the compiler, so you
want to look at the generated code. If you drag the compiler into the
act, and look only at the runtime, you are assuming that this is a
language level benchmark, which is a very silly (or very disingenuous --
like the IBM/DEC posting of Dhrystone 1.1 data) assumption, and one that
I have carefully, loudly disclaimed many times.
-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
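
The two MIPS figures quoted in the exchange above follow from the same one-second timing read against two different instruction counts. A minimal sketch of that arithmetic, assuming "M" means 10**6 and that the elapsed time is 1.0 s in both readings; the helper name `native_mips` is made up for illustration and is not part of the benchmark:

```python
def native_mips(iterations, instructions_per_iteration, seconds):
    """Millions of native instructions executed per second for a timed loop."""
    return iterations * instructions_per_iteration / seconds / 1e6


M = 1_000_000  # assumption: "M" in the post means 10**6, not 2**20

# Timing read against the non-unrolled loop: 16M trips of 4 instructions.
not_unrolled = native_mips(16 * M, 4, 1.0)

# Timing read against the unrolled loop: 4M trips of 6 instructions.
unrolled = native_mips(4 * M, 6, 1.0)

print(not_unrolled, unrolled)  # prints "64.0 24.0"
```

The runtime is identical in both cases; only the count of instructions actually executed differs, which is why the generated code has to be inspected before a timing is turned into a MIPS rating.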