Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!lll-winken!maddog!brooks
From: brooks@maddog.llnl.gov (Eugene Brooks)
Newsgroups: comp.arch
Subject: Re: The Killer Micro From Hell
Keywords: cpu starvation, memory bandwidth
Message-ID: <42737@lll-winken.LLNL.GOV>
Date: 30 Dec 89 21:06:48 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV> <788@stat.fsu.edu>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: brooks@maddog.llnl.gov (Eugene Brooks)
Organization: Lawrence Livermore National Laboratory
Lines: 36

In article <788@stat.fsu.edu> mccalpin@stat.fsu.edu (John Mccalpin) writes:
>So applying some scaling suggests that a 4-cpu Cray Y/MP at 6 ns will
>be about 290 times as fast as the R-3000 box.  Then scale the MIPS cpu
So to really compare one processor to one processor, as any reasonable person
would do, we divide the 290 by 4 to get a ratio of about 72 for the 6 ns Y to the
R3000.  This is the kind of single cpu speed ratio that we see here,
and expect at this point, for codes running near 100% vectorization levels.
If you take the manufacturer's hint of a speed ratio of 2.5 between
the R3000 and the R6000 you get a factor of 29 for the YMP vs the R6000.
Now the ONE data point I have indicates that the ratio between the R3000 and
the R6000 can be as good as 2.7, so I am inclined to believe the manufacturer's
estimate, which is lower.
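
For anyone who wants to check the scaling, here is a minimal sketch of the
arithmetic.  It uses only the figures quoted in this thread (290, 4 cpus,
2.5 and 2.7); the function and variable names are made up for illustration
and nothing here is a measurement.

```python
# Back-of-envelope scaling, using only numbers quoted in the thread.

def per_cpu_ratio(system_ratio, cpus):
    """Turn a whole-machine speed ratio into a one-cpu-to-one-cpu ratio."""
    return system_ratio / cpus

def rescale(ratio, micro_speedup):
    """Re-express a ratio against a micro that is micro_speedup times faster."""
    return ratio / micro_speedup

# 4-cpu Y-MP vs R3000 box is quoted as about 290x.
ymp_vs_r3000 = per_cpu_ratio(290, 4)             # 72.5, "about 72"

# Manufacturer's hint: the R6000 is about 2.5x the R3000.
ymp_vs_r6000_hint = rescale(ymp_vs_r3000, 2.5)   # 29.0

# The single data point of 2.7x gives a slightly smaller ratio.
ymp_vs_r6000_best = rescale(ymp_vs_r3000, 2.7)   # a little under 27

print(ymp_vs_r3000, ymp_vs_r6000_hint, round(ymp_vs_r6000_best, 1))
```

The two R3000-to-R6000 estimates move the answer by only about two units,
so the conclusion does not hinge on which one you believe.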

I do not know what kind of a deal you fellows got on a Y, but
an 8 processor Y with 32 megawords (that's 32 megabytes per cpu)
cost (system cost, disk drives included) around 3 million per processor.
Yes, we are looking at increasing the size of the memory on the one
here, at a cost I don't care to mention in an open forum.
The single cpu R6000 is going to be between 100K and 200K, depending
on whether you go for more memory per cpu and many gigabytes
of disk.  The bottom line: roughly 30 times the speed for 30 times
the cost for code which is fully vectorized on the Y.  There is an absolute
performance advantage but no cost-performance advantage.  If your
code is not 99% vectorized, however, you are very foolish to run
it on a traditional supercomputer cpu, as you correctly point out.
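
The cost side can be sketched the same way.  The prices below are the rough
ones quoted above.  The Amdahl-style estimate for partly vectorized code is
an illustration only: it ASSUMES the scalar part runs at the same speed on
both machines (scalar_ratio = 1.0), which is a placeholder and not a
measured number, and all names are hypothetical.

```python
# Rough cost-performance check, using the prices quoted in the post.
YMP_COST_PER_CPU = 3_000_000    # "around 3 million per processor"
R6000_COST_LOW = 100_000        # low end of the quoted 100K-200K range
R6000_COST_HIGH = 200_000       # high end of the quoted range
VECTOR_SPEED_RATIO = 29.0       # Y-MP cpu vs R6000 on fully vectorized code

def cost_ratio(micro_cost):
    """How many times more one Y-MP cpu costs than one micro system."""
    return YMP_COST_PER_CPU / micro_cost

def effective_speed_ratio(vector_fraction,
                          vector_ratio=VECTOR_SPEED_RATIO,
                          scalar_ratio=1.0):
    """Amdahl-style speed ratio for code that is only partly vectorized.

    vector_fraction is the share of run time (measured on the micro) that
    vectorizes.  scalar_ratio=1.0 is an ASSUMPTION for illustration: the
    scalar remainder is taken to run equally fast on both machines.
    """
    relative_time = ((1.0 - vector_fraction) / scalar_ratio
                     + vector_fraction / vector_ratio)
    return 1.0 / relative_time

print(cost_ratio(R6000_COST_LOW), cost_ratio(R6000_COST_HIGH))
for f in (1.0, 0.99, 0.90):
    print(f, round(effective_speed_ratio(f), 1))
```

Under that assumption the speed ratio drops from 29 to roughly 23 at 99%
vectorization and to under 8 at 90%, while the price ratio stays where it
is, which is the point about code that is not almost entirely vectorized.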


>This is all just as excuse to remind Eugene :-) that some users will
>still be able to make effective use of vector supercomputers.  In
I pointed out in my posting that Killer Micros have overrun traditional
supercomputers in scalar performance, and I qualified this very
explicitly.  The notion that I need to be reminded that traditional
supercomputers are still hanging in there for codes which are nearly 100%
vectorized is silly.

brooks@maddog.llnl.gov, brooks@maddog.uucp