Path: utzoo!attcan!uunet!ncrlnk!ncr-sd!hp-sdd!hplabs!ucbvax!pasteur!ames!mailrus!ukma!gatech!hubcap!Donald.Lindsay
From: Donald.Lindsay@K.GP.CS.CMU.EDU
Newsgroups: comp.parallel
Subject: Re: communication delays (was analysis of parallel algorithms)
Message-ID: <3967@hubcap.UUCP>
Date: 23 Dec 88 18:42:43 GMT
Sender: fpst@hubcap.UUCP
Lines: 38
Approved: parallel@hubcap.clemson.edu

In article <3882@hubcap.UUCP> andy@cayuga.stanford.edu (Andy Freeman) writes:
>If one assumes that device size is a constant independent of the total
>number of devices and that the speed of light really is a limit on
>communications speed, one can prove that communication latency must be
>Omega(cube-root(n)).  (Omega is the lower bound, Big-O is the upper bound.)
>If one also assumes standard thermodynamics and that the power-
>used/device/unit-time is a constant, the latency grows to
>Omega(square-root(n)) because an arbitrarily large cube of heaters
>can't be cooled ...

This applies to single processors.  A MIMD machine, of course, can grow
without any reduction in the speed of each node. What suffers is the
communications latency between (physically) nonadjacent nodes.  Some
programs are insensitive to this, and can be scaled to a World Machine
(as factoring has).

Another limit is component reliability.  If we generously assume complete
fault tolerance, then Mean Time Between Failure must equal (Mean Time To
Repair)/(number of repairmen). If MTBF were smaller than MTTR/Nr, then
the repair crew would be falling further and further behind. The
shrinking amount of hardware would have an increasing MTBF, until we
reach equilibrium at MTBF = MTTR/Nr.

Today a system is "large" if it has 10^4 or 10^5 chips. I guess the
Cray-3 design at 10^5 chips and over 10^6 pinouts (16 CPUs in one cubic
foot). I don't believe it can be fixed while running, so the MTBF is an
interesting question.

Kung has argued that MIMD designs must scale communications linearly with
computation (and memory either linearly or as the square, depending on
the algorithms of interest).  So, as clock rates rise, the bandwidth
between cabinets must go up.

Another limit for homogenous MIMD designs is the wiring pattern. The TF-1
design (32K CPUs) has a cylinder of cables, 40 feet across and 63 layers
(6 feet) deep. This pattern has essentially been designed to its limit.

Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science