Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!stat!stat.fsu.edu!mccalpin
From: mccalpin@masig3.ocean.fsu.edu (John D. McCalpin)
Newsgroups: comp.arch
Subject: Re: ATTACK OF KILLER MICROS
Message-ID:
Date: 16 Oct 89 18:16:56 GMT
References: <35825@lll-winken.LLNL.GOV> <1081@m3.mfci.UUCP> <35896@lll-winken.LLNL.GOV>
Sender: news@stat.fsu.edu
Organization: Supercomputer Computations Research Institute
Lines: 54
In-reply-to: brooks@vette.llnl.gov's message of 15 Oct 89 18:20:48 GMT

In article <35896@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene Brooks) writes:
>Microprocessor development is not ignoring vectorizable workloads. The
>latest have fully pipeline floating point and are capable of pipelining
>several memory accesses. As I noted, interleaving directly on the memory
>chip is trivial and memory chip makers will do it soon. [ ... more
> stuff deleted ... ]
> They will do this with their commodity parts.

It is not at all clear to me that the memory bandwidth required for
running vector codes is going to be developed in commodity parts. To
be specific, a single 64-bit vector pipe requires a sustained
bandwidth of 24 bytes per clock cycle. Is an ordinary, garden-variety
commodity microprocessor going to be able to use 6 32-bit
words-per-cycle of memory bandwidth on non-vectorized code? If not,
then there is a strong financial incentive not to include that excess
bandwidth in commodity products....

In addition, the engineering/cost trade-off between memory bandwidth
and memory latency will continue to exist for the "KILLER MICROS" as
it does for the current generation of supercomputers. Some users will
be willing to sacrifice latency for bandwidth, and others will be
willing to do the opposite. Economies of scale will not eliminate
this trade-off, except perhaps by eliminating the companies that take
the less profitable position (e.g. ETA).

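The 24 bytes-per-clock figure above can be checked with a little arithmetic. What follows is only a rough sketch in Python. It assumes the pipe streams two 64-bit operands in and one 64-bit result out every clock, as in a[i] = b[i] + c[i]. That breakdown is an assumption, not something stated in the post, though it does reproduce both the 24-byte and the six-word figures. The 40 MHz clock is purely a hypothetical number for illustration.

```python
# Assumed breakdown (not stated in the post): per clock, one vector pipe
# loads two 64-bit operands and stores one 64-bit result.
WORD_BYTES_64 = 8
WORD_BYTES_32 = 4


def bytes_per_cycle(loads=2, stores=1, word_bytes=WORD_BYTES_64):
    """Memory traffic per clock for one vector pipe, in bytes."""
    return (loads + stores) * word_bytes


def sustained_bandwidth(clock_hz, loads=2, stores=1, word_bytes=WORD_BYTES_64):
    """Sustained bandwidth in bytes per second at the given clock rate."""
    return bytes_per_cycle(loads, stores, word_bytes) * clock_hz


per_cycle = bytes_per_cycle()
words32 = per_cycle // WORD_BYTES_32
print(per_cycle, "bytes/cycle =", words32, "32-bit words/cycle")

clock_hz = 40_000_000  # hypothetical clock rate, for illustration only
print(sustained_bandwidth(clock_hz) / 1e6, "MB/s")
```

Under those assumptions the pipe needs 24 bytes, or six 32-bit words, per clock, which is the bandwidth a commodity part would have to carry whether or not scalar code can use it.
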
>Supercomputers of the future will be scalable multiprocessors made of
>many hundreds to thousands of commodity microprocessors. They will
>be commodity parts because these parts will be the fastest around and
>they will be cheap.

It seems to me that the experience in the industry is that
general-purpose processors are not usually very effective in
parallel-processing applications. There is certainly no guarantee
that the uniprocessors which are successful in the market will be
well-suited to the parallel supercomputer market -- which is not
likely to be a big enough market segment to have any control over what
processors are built....

The larger chip vendors are paying more attention to parallelism now,
but it appears to be in the context of 2-4 processor parallelism. It
is not likely to be possible to make these chips work together in
configurations of 1000's with the application of "glue" chips....

This is not to mention the fact that software technology for these
parallel supercomputers is depressingly immature. I think traditional
moderately parallel machines (e.g. Cray Y/MP-8) will be able to handle
existing scientific workloads better than 1000-processor parallel
machines for quite some time....
--
John D. McCalpin - mccalpin@masig1.ocean.fsu.edu
                   mccalpin@scri1.scri.fsu.edu
                   mccalpin@delocn.udel.edu