Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!emory!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.benchmarks
Subject: Re: Price/Performance figures for Number-Crunching
Message-ID:
Date: 19 Mar 91 17:49:38 GMT
References:
Sender: usenet@ee.udel.edu
Followup-To: comp.benchmarks
Organization: College of Marine Studies, U. Del.
Lines: 31
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: mccalpin@perelandra.cms.udel.edu's message of 18 Mar 91 21:59:12 GMT

> On 18 Mar 91 21:59:12 GMT, (mccalpin@perelandra.cms.udel.edu) I wrote:

Me> Here is a table with some info I have derived from Jack Dongarra's
Me> latest LINPACK report.

Me>                MFLOPS  MFLOPS  MFLOPS  MFLOPS  Price   MFLOPS/Million$
Me> System         Peak    Max     Lnpk    Stream  $10**6  Max     Stream
Me> -----------------------------------------------------------------------
Me> Cray Y/MP-1    333     324     25      150     ~3.0    108     50
                                   ^^^^
                                   ^^^^
Aaarggh! The "MFLOPS Lnpk" entry should be 90 MFLOPS, not 25 MFLOPS!
I would have caught this if I had used it for one of the
price/performance calculations....

By the way, the "MFLOPS Stream" column is not derived from the LINPACK
report, but from lots of other sources. It is close to the maximum
sustainable memory bandwidth in MBytes/sec divided by 24 MB/sec (the
bandwidth required to sustain 1 MFLOPS of long 64-bit vector dyads).

On some machines I use my own observations of the "maximum sustainable
memory bandwidth" rather than the manufacturer's specified "memory
bandwidth". This results in slightly lower, but generally more
accurate, numbers. For example, on the Cray the theoretical peak
streaming speed is 166 MFLOPS/cpu, but in real FORTRAN code it is
difficult to exceed 150 MFLOPS/cpu for long vector dyads.
--
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
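
The "MFLOPS Stream" arithmetic described in the post can be sketched as
follows. This is only an illustration, not the author's own code. It
assumes that the 24 MB/sec figure comes from a dyad such as
a(i) = b(i) + c(i) moving two 8-byte loads and one 8-byte store per
floating-point operation. The function names are hypothetical, and the
3600 MB/sec bandwidth is back-computed from the 150 MFLOPS quoted for
the Cray, not a measured value.

```python
# Sketch of the "MFLOPS Stream" and price/performance arithmetic.
# Assumption: one 64-bit dyad result needs two 8-byte loads and one
# 8-byte store, i.e. 24 bytes of memory traffic per flop.
BYTES_PER_FLOP = 3 * 8


def stream_mflops(bandwidth_mb_per_sec):
    """MFLOPS sustainable on long vector dyads at the given bandwidth."""
    return bandwidth_mb_per_sec / BYTES_PER_FLOP


def mflops_per_million_dollars(mflops, price_millions):
    """Price/performance figure as used in the table's last columns."""
    return mflops / price_millions


if __name__ == "__main__":
    # Hypothetical input: 3600 MB/sec is 150 MFLOPS * 24 bytes/flop,
    # back-computed from the post rather than measured.
    sustained_mb_per_sec = 3600.0
    price_millions = 3.0  # the "~3.0" price entry from the table

    stream = stream_mflops(sustained_mb_per_sec)
    print(stream)                                              # 150.0
    print(mflops_per_million_dollars(stream, price_millions))  # 50.0
    print(mflops_per_million_dollars(324, price_millions))     # 108.0
```

The last two values match the "MFLOPS/Million$" entries in the quoted
table (50 for Stream and 108 for Max), which is a quick consistency
check on the row.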