Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!apple!oliveb!mipos3!omepd!mipon2.intel.com!mcg
From: mcg@mipon2.intel.com (Steven McGeady)
Newsgroups: comp.arch
Subject: Re: 55 MIPS & 66 MIPS
Message-ID: <5277@omepd.UUCP>
Date: 28 Nov 89 03:13:20 GMT
References: <28107@amdcad.AMD.COM> <1358@bnr-rsc.UUCP> <31329@winchester.mips.COM> <22303@gryphon.COM>
Sender: news@omepd.UUCP
Reply-To: mcg@mipon2.intel.com (Steven McGeady)
Lines: 103

In article <28107@amdcad.AMD.COM>, tim@electron.amd.com (Tim Olson) writes:
>
> In article <22303@gryphon.COM> scarter@gryphon.COM (Scott Carter) writes:
> | The 960 CA can issue three instructions per
> | cycle to the chosen three of four execute units. I believe Intel has figures
> | showing that on the average they could infact issue two instructions per clock
> | _average_ [over what program set?], hence the 960CA can legitimately be called
> | 66 Native MIPS average with 99 Native MIPS peak.
>
> The i960CA decoder can dispatch up to 3 instructions per cycle.
> However, the decoder looks at 4 instructions at a time, and it appears
> that the decoder cannot be loaded with the next set of 4 instructions
> until the current set of instructions have all been dispatched.

This is not correct. The instruction decoder contains a rolling
quad-word window into which instructions are loaded (potentially) every
cycle. The reason that we do not claim 99 MIPS (none of our advertising
claims this number, to the best of my knowledge - those who have heard
me speak hear me say jokingly that we run at 99 MIPS for "one whole
cycle") - is that for three instructions to be dispatched, one must be
a branch. A branch requires that a non-next line of instructions from
the i-cache be loaded, and this is not accomplished at the full rate.

> Intel compared its i960CA board running this benchmark suite with a
> 68030 (20MHz), an i960KA(20MHz), and an Am29000(16MHz) board.
> However, the board they used to benchmark the Am29000 was not designed
> for performance; rather, it was designed to test the functionality of
> ADAPT (Advanced Development and Prototyping Tool) hardware debuggers.

This is an interesting piece of history re-invention. Step Engineering,
the current manufacturer of the STEB board, received the design of the
board from AMD (the board has an AMD copyright on it). Apparently, the
board was designed this way because it is impossible to build a 29K
system using normal DRAMs and achieve better performance. We attempted
to put faster RAMs in the STEB board, and to increase the clock speed
to 20MHz, and neither worked. We chose the STEB board not because it
was slow (even we didn't expect it to be so slow) but because it is the
only available board with a prototyping area on which we could add an
SBX connector to interface the graphics cards on which we displayed the
benchmark results.

> To provide a more fair comparison, I requested the benchmark sources
> from Intel, to run on a 30MHz Am29000 board (manufactured by YARC
> Systems). This board uses 2-way interleaved, 100ns DRAM memory for
> instructions and 35ns SRAM for data.

This board contains separate Instruction and Data memory (using the
29k's Harvard bus), each of which is interleaved (according to the
published data I've been able to find on the board). The 30MHz 29k's
are apparently hand-sorted - we know of no volume shipments of these
parts. This board is in no way comparable in cost, parts count,
interface complexity, or usability to the 960CA board that was used.

> I received sources for the non-proprietary benchmarks, compiled them
> with the current version of the MetaWare HighC29k compiler, and ran
> them on the YARC card. Here are the final results:
>
> [tables showing the 29k approximately at par with 960CA]

We supplied Mr. Olson with the sources to these benchmarks, as an
effort to bring an end to the warring that has been going on over
benchmarking.
In exchange for freely supplying these, Mr. Olson agreed that we would
be given the resulting source code back, along with a copy of the
compiler that produced it, prior to publication of the results.
Mr. Olson has chosen to ignore those commitments and publish numbers
without noting what compiler was used, and without providing us (or
anyone else - we also supplied the benchmarks to Michael Sleator of
Microprocessor Report) with the ability to check their validity.

It should be noted that the 960CA benchmarks were compiled with the
current GNU GCC compiler, which does *no* instruction scheduling, and
thus fails to take advantage of the multiple-instruction issue
capability of the 960CA. We have been working on an
instruction-scheduling compiler, but it is not available for release
at this time.

The lesson that this has served to teach me, who argued with our
marketing department that we should release these benchmarks to AMD
under the noted restrictions, is that we were foolish to trust AMD's
word regarding feedback of the results from the benchmarks. Thus, I
place no trust in these numbers as representing any kind of objective
reality. Furthermore, I have learned my lesson with regard to
cooperating. The benchmark wars will now most certainly be taken out of
the hands of technologists and be placed back in the hands of marketing
departments.

I will reiterate here my advice to customers attempting to determine
the relative speed of the two processors: run your own benchmarks on a
board with a memory system relevant to the design you plan to build.
The Yarc board's memory design is an example of the most expensive
memory system design that one can attach to the 29k - it bears no
resemblance to what can be expected with a combined I&D DRAM memory
system, which is where the only true comparison lies.

In short, don't believe AMD's benchmark numbers, and don't believe
ours. Don't believe simulators, because AMD's is well known for
overstating performance.
Believe your own benchmarks. And note that the STEB board is much
closer to most embedded designs than the Yarc board, and that the 960
is much more usable in the average design than the 29k.

S. McGeady
Intel Corp.