Path: utzoo!attcan!uunet!ncrlnk!ncr-sd!hp-sdd!hplabs!ucbvax!pasteur!ames!mailrus!ukma!gatech!hubcap!Donald.Lindsay From: Donald.Lindsay@K.GP.CS.CMU.EDU Newsgroups: comp.parallel Subject: Re: communication delays (was analysis of parallel algorithms) Message-ID: <3967@hubcap.UUCP> Date: 23 Dec 88 18:42:43 GMT Sender: fpst@hubcap.UUCP Lines: 38 Approved: parallel@hubcap.clemson.edu In article <3882@hubcap.UUCP> andy@cayuga.stanford.edu (Andy Freeman) writes: >If one assumes that device size is a constant independent of the total >number of devices and that the speed of light really is a limit on >communications speed, one can prove that communication latency must be >Omega(cube-root(n)). (Omega is the lower bound, Big-O is the upper bound.) >If one also assumes standard thermodynamics and that the power- >used/device/unit-time is a constant, the latency grows to >Omega(square-root(n)) because an arbitrarily large cube of heaters >can't be cooled ... This applies to single processors. A MIMD machine, of course, can grow without any reduction in the speed of each node. What suffers is the communications latency between (physically) nonadjacent nodes. Some programs are insensitive to this, and can be scaled to a World Machine (as factoring has). Another limit is component reliability. If we generously assume complete fault tolerance, then Mean Time Between Failure must equal (Mean Time To Repair)/(number of repairmen). If MTBF were smaller than MTTR/Nr, then the repair crew would be falling further and further behind. The shrinking amount of hardware would have an increasing MTBF, until we reach equilibrium at MTBF = MTTR/Nr. Today a system is "large" if it has 10^4 or 10^5 chips. I guess the Cray-3 design at 10^5 chips and over 10^6 pinouts (16 CPUs in one cubic foot). I don't believe it can be fixed while running, so the MTBF is an interesting question. Kung has argued that MIMD designs must scale communications linearly with computation (and memory either linearly or as the square, depending on the algorithms of interest). So, as clock rates rise, the bandwidth between cabinets must go up. Another limit for homogenous MIMD designs is the wiring pattern. The TF-1 design (32K CPUs) has a cylinder of cables, 40 feet across and 63 layers (6 feet) deep. This pattern has essentially been designed to its limit. Don lindsay@k.gp.cs.cmu.edu CMU Computer Science