Path: utzoo!attcan!uunet!husc6!bbn!rochester!pt.cs.cmu.edu!k.gp.cs.cmu.edu!lindsay
From: lindsay@k.gp.cs.cmu.edu (Donald Lindsay)
Newsgroups: comp.arch
Subject: m88000 benchmarks
Keywords: FFT, m88000, benchmark, VLSI System Design
Message-ID: <1941@pt.cs.cmu.edu>
Date: 14 Jun 88 16:18:05 GMT
Sender: netnews@pt.cs.cmu.edu
Organization: Carnegie-Mellon University, CS/RI
Lines: 20

We don't have much benchmarking info yet about the Motorola 88000.
However, the May issue of "VLSI Systems Design" contains a pipeline
timing chart for an FFT inner loop. The (compiler-generated) code does
4 loads, 4 stores, 10 single-precision float calculations, and 4 other
things, in 27 clocks. At 20 MHz, that's 7.4 MFLOPS.

A 16KB CMMU can hold 4K floats, but they all have to be faulted in.
A recent post suggested counting 10 clocks per 16-byte fault. That's
2.5 clocks per float, but since a large FFT visits each data point
several times (say, 11), the fault cost amortizes to roughly 0.23
clocks per float touched. With 4 floats loaded per inner loop, that
is about 1 clock per inner loop.

So, "about 7 MFLOPS on an FFT benchmark" seems fair.
--
Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science
"Imitation is not the sincerest form of flattery. Payments are."
- a British artist who died penniless before copyright law.
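
For anyone who wants to check the arithmetic in the post above, here is
a rough sketch in Python. Every input is a number quoted in the post;
the variable names are made up for illustration, and the step from
"clocks per float" to "clocks per inner loop" assumes each loop touches
4 distinct floats (the 4 loads), which the post implies but does not
state outright.

```python
# Back-of-envelope check of the figures quoted in the post.
# Inputs come from the post; names are hypothetical, chosen for this sketch.

CLOCK_HZ = 20e6          # 20 MHz clock
LOOP_CLOCKS = 27         # clocks per FFT inner loop (from the timing chart)
FLOPS_PER_LOOP = 10      # single-precision float calculations per loop
LOADS_PER_LOOP = 4       # floats loaded per loop
CACHE_BYTES = 16 * 1024  # one 16KB CMMU
FLOAT_BYTES = 4          # size of a single-precision float
LINE_BYTES = 16          # bytes brought in per fault
FAULT_CLOCKS = 10        # suggested cost of one 16-byte fault
VISITS = 11              # assumed number of times each data point is visited


def mflops(clocks_per_loop):
    """MFLOPS for an inner loop taking the given number of clocks."""
    return CLOCK_HZ / clocks_per_loop * FLOPS_PER_LOOP / 1e6


floats_in_cache = CACHE_BYTES // FLOAT_BYTES
floats_per_fault = LINE_BYTES // FLOAT_BYTES
fault_clocks_per_float = FAULT_CLOCKS / floats_per_fault

# Spread the one-time fault cost over every visit to a data point.
amortized_per_float = fault_clocks_per_float / VISITS

# Assumption: each inner loop touches 4 distinct floats (its 4 loads).
amortized_per_loop = amortized_per_float * LOADS_PER_LOOP

peak = mflops(LOOP_CLOCKS)
with_faults = mflops(LOOP_CLOCKS + amortized_per_loop)

print(f"floats that fit in cache : {floats_in_cache}")
print(f"fault clocks per float   : {fault_clocks_per_float:.2f}")
print(f"amortized clocks per loop: {amortized_per_loop:.2f}")
print(f"MFLOPS, no faults        : {peak:.2f}")
print(f"MFLOPS, faults amortized : {with_faults:.2f}")
```

Under these assumptions the amortized fault cost comes out a little
under 1 clock per loop, so the loop goes from 27 to roughly 28 clocks
and the estimate stays close to 7 MFLOPS. Changing VISITS shows how
sensitive the result is to the FFT size.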