Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!lll-winken!maddog!brooks
From: brooks@maddog.llnl.gov (Eugene Brooks)
Newsgroups: comp.arch
Subject: Re: The Killer Micro From Hell
Keywords: cpu starvation, memory bandwidth
Message-ID: <42737@lll-winken.LLNL.GOV>
Date: 30 Dec 89 21:06:48 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV> <788@stat.fsu.edu>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: brooks@maddog.llnl.gov (Eugene Brooks)
Organization: Lawrence Livermore National Laboratory
Lines: 36

In article <788@stat.fsu.edu> mccalpin@stat.fsu.edu (John Mccalpin) writes:
>So applying some scaling suggests that a 4-cpu Cray Y/MP at 6 ns will
>be about 290 times as fast as the R-3000 box. Then scale the MIPS cpu

So to really compare one processor to one processor, as any reasonable
person would do, we divide the 290 by 4 to get a ratio of 72 for the
6 ns Y to the R3000. This is the kind of single cpu speed ratio that we
see here, and expect at this point, for codes running near 100%
vectorization levels. If you take the manufacturer's hint of a speed
ratio of 2.5 between the R3000 and the R6000 you get a factor of 29 for
the Y/MP vs the R6000. Now the ONE data point I have indicates that the
ratio between the R3000 and the R6000 can be as good as 2.7, so I am
inclined to believe the manufacturer's estimate, which is lower.

I do not know what kind of a deal you fellows got on a Y, but an
8 processor Y with 32 megawords (that's 32 megabytes per cpu) costs
(system cost, disk drives included) around 3 million per processor.
Yes, we are looking at increasing the size of the memory of the one
here, at a cost I don't care to mention in an open forum. The single
cpu R6000 is going to be between 100K and 200K, depending on whether
you go for more memory per cpu and many gigabytes of disk.

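The arithmetic above can be checked with a short sketch. The inputs are
the figures quoted in this post; the function and variable names are
purely illustrative, the prices are treated as plain numbers in whatever
currency unit was meant, and the 8 bytes per word assumes the 64-bit
Cray word.

```python
# Back-of-the-envelope check of the ratios quoted in the post.
# All names are illustrative; the inputs are the post's own figures.

def per_cpu_ratio(machine_ratio, cpus):
    """Speed ratio for one cpu, given the ratio for the whole machine."""
    return machine_ratio / cpus

def rescale(ratio, speedup):
    """Ratio against a faster micro, given its speedup over the slower one."""
    return ratio / speedup

# Speed: 4-cpu Y/MP is quoted as 290x the R3000 box.
ymp_vs_r3000 = per_cpu_ratio(290, 4)            # quoted in the post as 72
ymp_vs_r6000 = rescale(ymp_vs_r3000, 2.5)       # manufacturer's 2.5 estimate
ymp_vs_r6000_alt = rescale(ymp_vs_r3000, 2.7)   # the single 2.7 data point

# Cost: about 3 million per Y cpu vs 100K to 200K for a single-cpu R6000.
ymp_cost_per_cpu = 3_000_000
r6000_cost_low, r6000_cost_high = 100_000, 200_000
cost_ratio_high = ymp_cost_per_cpu / r6000_cost_low
cost_ratio_low = ymp_cost_per_cpu / r6000_cost_high

# Memory: 32 megawords, 8 bytes per word (assumed 64-bit), shared by 8 cpus.
mem_mb_per_cpu = 32 * 8 / 8

print("speed, Y/MP cpu vs R3000:", ymp_vs_r3000)
print("speed, Y/MP cpu vs R6000:", ymp_vs_r6000, "or", round(ymp_vs_r6000_alt, 1))
print("cost ratio range:", cost_ratio_low, "to", cost_ratio_high)
print("megabytes per cpu:", mem_mb_per_cpu)
```

With these inputs the speed ratio against the R6000 lands in the high
20s either way, and the cost ratio falls between 15 and 30 depending on
which R6000 configuration is priced.
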
The bottom line: roughly 30 times the speed for 30 times the cost, for
code which is fully vectorized on the Y. There is an absolute
performance advantage but no cost-performance advantage. If your code
is not 99% vectorized, however, you are very foolish to run it on a
traditional supercomputer cpu, as you correctly point out.

>This is all just as excuse to remind Eugene :-) that some users will
>still be able to make effective use of vector supercomputers. In

I pointed out in my posting that Killer Micros have overrun traditional
supercomputers in scalar performance, and I qualified this very
explicitly there. The notion that I need to be reminded that
traditional supercomputers are still hanging in there for codes which
are nearly 100% vectorized is silly.

brooks@maddog.llnl.gov, brooks@maddog.uucp