Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!pp!yoda!tomlic
From: tomlic@yoda.ACA.MCC.COM (Chris Tomlinson)
Newsgroups: comp.arch
Subject: Re: parallel systems
Message-ID: <358@yoda.ACA.MCC.COM>
Date: 19 Oct 89 13:57:32 GMT
References: <20416@princeton.Princeton.EDU>
Organization: MCC, Austin, TX
Lines: 65

From article <20416@princeton.Princeton.EDU>, by mg@notecnirp.Princeton.EDU (Michael Golan):
> In article <7651@bunny.GTE.COM> hhd0@GTE.COM (Horace Dediu) writes:
>>
>>Consider the 8k processor NCUBE 2--"The World's Fastest Computer."
>>(yes, one of those). According to their literature:
>>"8,192 64 bit processors each equivalent to one VAX 780. It delivers
>>60 billion instructions per second, 27 billion scalar FLOPS, exceeding the
>
> This imply a VAX 780 is a 7 mips machine ?

The architecture of the processor is similar to the VAX ISA; the
performance is not.

>
>>performance of any other currently available or recently announced
>>supercomputer." It's distributed memory .5MB per processor, runs UNIX,
>                                         ^^^^^^^^^^^^^^^^^^^^^^
>>and is a hypercube.
>
> .5MB ? And this is faster than a Cray? How many problems you can't even

I understand that NCUBE makes provisions for up to 64MB per node on those
systems using the 64-bit processors. They also apparently have incorporated
a through-routing capability in the processors similar to that found on the
Symult mesh-connected machines.

> solve on this? And for how many, a 32Mb single VAX 780 will beat ?!
> One of the well known problems with Hypercubes is that if you look at a job
> that uses the whole memory (in this case 4Gb = Big Cray), a single machine
> with the same performance of one processor (and all memory) will be almost
> as good and sometimes even better.

The current trend in distributed-memory MIMD machines is towards very low
communication latencies by comparison with the first-generation machines,
which used software routing and slow communication hardware.
This tends to push the machines towards shared-memory-like access times.
Physical limitations still mean that DM-MIMD machines approximate shared
memory worse and worse as the machine gets larger, but at least the machine
can get larger.

>
> My original point was that MIMD, unless it has shared memory, is very hard
> to make use of with typical software/algorithms. Some problems can be solved
> nicely on a Hypercube, but most of them can not! And the state of the art

The state of the art in parallel algorithm development is advancing rapidly
as machines become available to experiment on. It is more an issue of
algorithm design than of parallelizing sequential codes. There are quite a
number of problems that are tackled on Crays because of superior scalar
performance and that do not make significant use of the SIMD vector
capabilities. I would point to the development of BLAS-2 and -3 as
indications that even on current supercomputers compiler technology just
doesn't carry the day by itself.

> in compilers, while having some luck with vectorized code, and less luck
> with shared memory code, has almost no luck with message-passing machines.
>
>
> Michael Golan
> mg@princeton.edu
> My opinions are my own. You are welcome not to like them.

Chris Tomlinson
tomlic@MCC.COM
--opinions....
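
A quick check of the arithmetic behind the figures quoted in this thread, as a
minimal Python sketch. The constants come from the NCUBE literature as quoted
above. The function and variable names are illustrative, binary units
(1 GB = 1024 MB) are an assumption, and the hop count assumes a full binary
hypercube with one processor per node.

```python
# Figures as quoted in the thread (NCUBE literature cited by Horace Dediu).
PROCESSORS = 8192
TOTAL_INSTRUCTIONS_PER_SEC = 60e9  # "60 billion instructions per second"
TOTAL_SCALAR_FLOPS = 27e9          # "27 billion scalar FLOPS"
MEMORY_PER_NODE_MB = 0.5           # ".5MB per processor"
MAX_MEMORY_PER_NODE_MB = 64        # "up to 64MB per node"


def per_node(total, nodes=PROCESSORS):
    """Divide an aggregate rate evenly across all nodes."""
    return total / nodes


def aggregate_memory_gb(mb_per_node, nodes=PROCESSORS):
    """Total memory in GB, assuming 1 GB = 1024 MB (an assumption)."""
    return mb_per_node * nodes / 1024


mips_per_node = per_node(TOTAL_INSTRUCTIONS_PER_SEC) / 1e6
mflops_per_node = per_node(TOTAL_SCALAR_FLOPS) / 1e6

# Assumption: 8192 = 2**13 nodes form a full binary hypercube, so the
# dimension (and the worst-case hop count between two nodes) is 13.
hypercube_dimension = PROCESSORS.bit_length() - 1

print(f"MIPS per node:      {mips_per_node:.2f}")    # 7.32
print(f"MFLOPS per node:    {mflops_per_node:.2f}")  # 3.30
print(f"Total memory (GB):  {aggregate_memory_gb(MEMORY_PER_NODE_MB):.0f}")      # 4
print(f"Max total mem (GB): {aggregate_memory_gb(MAX_MEMORY_PER_NODE_MB):.0f}")  # 512
print(f"Hypercube dimension: {hypercube_dimension}")  # 13
```

The per-node figure of roughly 7.3 MIPS is presumably where the "7 mips"
question comes from, and 8,192 nodes at .5MB each give the 4Gb total mentioned
in the quoted text.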