Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!seismo!columbia!rutgers!husc6!endor!reiter From: reiter@endor.harvard.edu (Ehud Reiter) Newsgroups: comp.arch Subject: Connection Machine Argument Message-ID: <745@husc6.UUCP> Date: Wed, 3-Dec-86 11:45:22 EST Article-I.D.: husc6.745 Posted: Wed Dec 3 11:45:22 1986 Date-Received: Wed, 3-Dec-86 21:50:17 EST Sender: news@husc6.UUCP Reply-To: reiter@harvard.UUCP (Ehud Reiter) Distribution: na Organization: Aiken Computation Lab, Harvard University Lines: 61 The idea behind the Connection Machine architecture, of having an expensive hypercube interconnect with extremely simple 1-bit SIMD processors at the nodes, has bothered me quite a bit. The below tries to rigorously argue that such an architecture is inappropriate. I would very much welcome any comments on this, either to me personally or to the net. According to J. Ullman in COMPUTATIONAL ASPECTS OF VLSI, chapter 6, the cost of a butterfly network in VLSI is O(n*n). That is, if the entire network is on a chip, the size of the network is a quadratic function of the number of nodes connected. This is a lower bound (ie any butterfly network MUST take this amount of space). A similar result applies to a hypercube, and a shuffle- exchange network is only slightly cheaper at O((n/log(n))**2). This has two implications on the Connection Machine architecture of many 1-bit processors connected in a hypercube, if the goal is to eventually be able to realize such an architecture on a single chip/wafer. First, the dense hypercube interconnect should only be used when necessary, and a mesh interconnect should be used when possible, since it will be much cheaper, in hardware terms. Since most of the applications that it is claimed can run on a C.M. (vision, database, graphics, etc) can be run on a grid system, in the long term they will not be cost-effective to run on a Connection Machine (indeed, they are not cost-effective today, compared to alternative parallel architectures). Second, for applications which absolutely require the dense interconnect, the processors should be made as large as possible and should NOT be simple 1- bit machines. Since the size of the interconnect grows as n squared, it will be by far the largest component of a large system, and if we can replace each m simple processors by 1 complex processor, we will reduce the size of the inter- connect by m*m, and thus the size of the system. For example, suppose I wish to build a complete system on a single wafer, and I have 10,000,000,000 units of silicon. Suppose a simple processor with 1 unit of processing power has an area of 1000, while a complex processor with 10 units of processing power has an area of 100,000. Suppose the hypercube inter- connect requires n*n area. Then, if I use simple processors, the interconnect area limits me to about 100,000 processors, while if I use complex processors, I am limited to about 60,000 processors. In other words, even though each individual simple processor is 10 times more efficient (1 processing unit per 1000 area units) than a complex processor (10 processing units per 100,000 area units), the system built out of complex processors is more powerful (600,000 processing units) than a system built of simple processors (100,000 processing units). This result is not dependent on the numbers in the example, but is general. Given any two processors, one of size SA and power PA, and one of size SB and power PB, for sufficiently large systems I am always better off using the pro- cessor with more power, no matter what the actual size and power ratios are. In summary, if we assume that the continuing progress of microelectronic technology and/or a shift to wafer-scale technology requires complete systems to be fabricated as a single VLSI unit, then the Connection Machine is not a viable architecture. The sometimes made analogy to the brain is misleading, because the brain is a 3-D structure, where the interconnect is much cheaper (n**1.5), and VLSI systems are 2-D structures. Ehud Reiter reiter@harvard (ARPA,UUCP)