Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!ucsd!swrinde!zaphod.mps.ohio-state.edu!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.unix.aix
Subject: RS/6000 Model 320 FP Performance
Message-ID:
Date: 31 Oct 90 15:35:32 GMT
Sender: usenet@ee.udel.edu
Distribution: usa
Organization: College of Marine Studies, U. Del.
Lines: 16
Nntp-Posting-Host: perelandra.cms.udel.edu

I recently typed in the high-performance matrix multiply routine from
the technical report by Ron Bell. In the report he states that he
gets 43 MFLOPS on a model 530 using this code. In the absence of
cache misses (which should be minimal in a blocked code like this one)
the model 320 should run at 80% of that speed, or about 34 MFLOPS.

My own tests (with block sizes in the range of 16 to 32) show a very
consistent 12 MFLOPS performance.

Has anyone else run this code? Even with lots of cache misses, the
320 should be no more than a factor of about 2.5 slower than a 530,
and here I am seeing a ratio of 3.6....
--
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@vax1.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
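
For anyone who wants to check the arithmetic in the post above, here is a minimal sketch in Python. The three input figures (43 MFLOPS, the 80% ratio, 12 MFLOPS) are taken from the post; the variable names and the `mflops` helper are illustrative assumptions, and the helper assumes the usual convention of counting 2*n**3 floating-point operations for an n-by-n matrix multiply, which the post does not state.

```python
# Figures quoted in the post (not independently measured here).
MFLOPS_530_REPORTED = 43.0   # rate reported in the technical report for a model 530
RATIO_320_TO_530 = 0.80      # the post's expected 320/530 speed ratio without cache misses
MFLOPS_320_MEASURED = 12.0   # rate the poster measured on a model 320

# Expected model 320 rate if cache misses are negligible.
expected_320 = MFLOPS_530_REPORTED * RATIO_320_TO_530

# Slowdown actually observed relative to the reported 530 figure.
observed_slowdown = MFLOPS_530_REPORTED / MFLOPS_320_MEASURED


def mflops(n, seconds):
    """Rate for an n x n matrix multiply, assuming 2*n**3 flops are counted."""
    return 2.0 * n ** 3 / seconds / 1.0e6


if __name__ == "__main__":
    print("expected 320 rate: %.1f MFLOPS" % expected_320)
    print("observed slowdown: %.1f x" % observed_slowdown)
```

The expected rate comes out a little above 34 MFLOPS and the observed slowdown rounds to 3.6, which matches the figures in the post. The `mflops` helper is only there for turning your own timings into a comparable number.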