Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!uakari.primate.wisc.edu!xanth!lll-winken!vette!brooks
From: brooks@vette.llnl.gov (Eugene Brooks)
Newsgroups: comp.arch
Subject: Re: ATTACK OF KILLER MICROS
Message-ID: <36249@lll-winken.LLNL.GOV>
Date: 19 Oct 89 21:51:00 GMT
References: <35825@lll-winken.LLNL.GOV> <1081@m3.mfci.UUCP> <35896@lll-winken.LLNL.GOV> <33798@ames.arc.nasa.gov> <35977@lll-winken.LLNL.GOV> <220@dg.dg.com>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: brooks@maddog.llnl.gov (Eugene Brooks)
Organization: Lawrence Livermore National Laboratory
Lines: 31

In article <220@dg.dg.com> chris@dg.dg.com (Chris Moriondo) writes:
>The only really scalable interconnect schemes of which I am aware are
>multistage interconnects which grow (N log N) as you linearly increase the
>numbers of processors and memories. So in the limit the machine is essentially
>ALL INTERCONNECT NETWORK, which obviously costs more than the processors and
>memories. (Maybe this is what SUN means when they say "The Network IS the
>computer"? :-) How do you build a shared-memory multi where the cost of the
>interconnect scales linearly? Obviously I am discounting busses, which don't
>scale well past very small numbers of processors.

The cost of the interconnect can't be made to scale linearly. The best you
can get is log N scaling per processor. The key is the base of the log and
not having N too large, i.e., using a KILLER MICRO and not a pipsqueak.
Eight by eight switch nodes are practical at this point, with four by four
being absolutely easy. Pin count is the main problem, not silicon area.
Assuming 8x8 nodes, a 512 node system takes three stages (8^3 = 512) and a
4096 node system takes four stages (8^4 = 4096). Are 4 switch chips cheaper
than, or equivalent in cost to, a killer micro and 32 meg of memory?

Sun's "The network is the computer" is meant for Ethernet types of things,
but it really does apply to multiprocessors.
If you don't have really good communication capability between the computing
nodes, what you can do with the machine is limited.

Could anyone handle a KILLER MICRO powered system with 4096 nodes? Just
think: 4096 times the power of a YMP for codes that are scalar but MIMD
parallel, and ~400 times the power of a YMP CPU for codes that are
vectorized and MIMD parallel. It boggles the mind.

brooks@maddog.llnl.gov, brooks@maddog.uucp
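
For anyone who wants to check the stage counts quoted above (three stages for 512 nodes, four for 4096 with 8x8 switch nodes), here is a rough Python sketch. It assumes a plain multistage network in which every stage holds N/k switch nodes of size k x k, so the number of stages is the smallest s with k^s >= N. The function names `stages_needed` and `switch_count` are made up for this illustration, and the totals count switch nodes, not chips; how many chips one node takes depends on the pin-count limits mentioned in the post.

```python
def stages_needed(nodes: int, radix: int) -> int:
    """Smallest number of stages s such that radix**s >= nodes."""
    if nodes < 1 or radix < 2:
        raise ValueError("need nodes >= 1 and radix >= 2")
    stages = 0
    reach = 1  # number of endpoints reachable through `stages` stages
    while reach < nodes:
        reach *= radix
        stages += 1
    return stages


def switch_count(nodes: int, radix: int) -> int:
    """Total switch nodes, ASSUMING ceil(nodes / radix) switches per stage."""
    per_stage = -(-nodes // radix)  # ceiling division
    return stages_needed(nodes, radix) * per_stage


if __name__ == "__main__":
    for radix in (4, 8):
        for nodes in (512, 4096):
            s = stages_needed(nodes, radix)
            total = switch_count(nodes, radix)
            print(f"{radix}x{radix} switches, {nodes} nodes: "
                  f"{s} stages, {total} switch nodes, "
                  f"{total / nodes:.2f} per processor")
```

Under these assumptions the per-processor switch count grows with the number of stages, i.e. with log N in base k, which is why a larger switch radix and a smaller N (fewer, faster processors) both help.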