Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cwjcc!gatech!gitpyr!loligo!mccalpin
From: mccalpin@loligo.uucp (John McCalpin)
Newsgroups: comp.arch
Subject: Re: Don't look back
Message-ID: <7330@pyr.gatech.EDU>
Date: 21 Feb 89 14:58:10 GMT
References: <13582@winchester.mips.COM> <20667@lll-winken.LLNL.GOV>
Sender: news@pyr.gatech.EDU
Reply-To: mccalpin@loligo.cc.fsu.edu (John McCalpin)
Organization: Supercomputer Computations Research Institute
Lines: 38

In article <20667@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene Brooks) writes:
>In article <13582@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>A long, and well founded, analysis of why superminis are being squeezed out
>of their performance niche from the rear by VLSI based machines.
>
>This article is conservative at best, there are a whole lot of users of Cray
>time buying the latest VLSI based machines as a more cost effective alternative
>With the latest microprocessors these machines are within 1/5th of the
>performance of a Cray supercomputer for all but the most highly vectorized
>codes. For scalar codes the performance of these microprocessors can be
>as high as 1/2 of a Cray-1S.

I have had a great deal of trouble believing the poor performance of
"supercomputers" on scalar code lately. I just ran the LINPACK 100x100
test on the FSU ETA-10 (10.5 ns = 95 MHz) and got a result of 3.8 64-bit
MFLOPS for fully optimized (but not vectorized) code. I used the
version of the code with unrolled loops.

This performance is EXACTLY the same as the MIPS R-3000/3010 pair
running at 25 MHz. I understand that there must be tradeoffs, but
considering the difference in cost, this is a bit surprising....

Of course, the vectorized version runs at 60 MFLOPS on the ETA-10 now
(90 MFLOPS with the 7 ns CPUs), and gets rapidly faster for larger
systems.

I don't mean to pick on CDC/ETA; even the fastest Crays are going
to get caught by the highest performance RISC chips pretty soon.
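The comparison above comes down to cycles spent per floating-point operation. As a rough check on that arithmetic, here is a minimal sketch that uses only the figures quoted in this post. The helper names clock_mhz and cycles_per_flop are made up for illustration and are not part of the LINPACK code.

```python
# Rough arithmetic check using only figures quoted in the post.
# The helper names below are hypothetical, chosen for illustration.

def clock_mhz(cycle_ns):
    """Convert a cycle time in nanoseconds to a clock rate in MHz."""
    return 1000.0 / cycle_ns

def cycles_per_flop(mhz, mflops):
    """Average number of clock cycles spent per floating-point operation."""
    return mhz / mflops

eta10_mhz = clock_mhz(10.5)                      # ETA-10 with the 10.5 ns clock
eta10_scalar = cycles_per_flop(eta10_mhz, 3.8)   # scalar LINPACK 100x100
r3000_scalar = cycles_per_flop(25.0, 3.8)        # R-3000/3010 at 25 MHz
eta10_vector = cycles_per_flop(eta10_mhz, 60.0)  # vectorized LINPACK 100x100

print(f"ETA-10 clock:            {eta10_mhz:.1f} MHz")
print(f"ETA-10 scalar:           {eta10_scalar:.1f} cycles/flop")
print(f"R-3000 scalar:           {r3000_scalar:.1f} cycles/flop")
print(f"ETA-10 vectorized:       {eta10_vector:.2f} cycles/flop")
print(f"scalar cycles, ETA/MIPS: {eta10_scalar / r3000_scalar:.1f}x")
```

On these numbers the ETA-10 spends roughly 25 cycles per flop on scalar code against about 6.6 for the R-3000, so its clock is nearly four times faster but it needs nearly four times as many cycles for the same work. The 60 and 90 MFLOPS vector figures give the same cycles per flop at 10.5 ns and 7 ns, which is consistent with pure clock scaling.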
I haven't seen any MC88000 results yet, but it looks to be able to
put out results in the same performance range. Does anyone know if
the memory bandwidth of the 88000 is going to be able to keep the
floating-point pipeline filled? This could push the performance of
the 88000 up closer to 10 MFLOPS....

----------------------   John D. McCalpin   ------------------------
Dept of Oceanography & Supercomputer Computations Research Institute
mccalpin@masig1.ocean.fsu.edu           mccalpin@nu.cs.fsu.edu
--------------------------------------------------------------------