Xref: utzoo comp.arch:21864 comp.parallel:2387
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!emory!hubcap!fpst
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.arch,comp.parallel
Subject: Networking for Distributed Computing
Message-ID: <1991Apr5.182853.20728@hubcap.clemson.edu>
Date: 5 Apr 91 14:39:33 GMT
Sender: usenet@ee.udel.edu
Followup-To: comp.arch
Organization: College of Marine Studies, U. Del.
Lines: 53
Approved: parallel@hubcap.clemson.edu
Nntp-Posting-Host: perelandra.cms.udel.edu
Source-Info: From (or Sender) name not authenticated.

The recent introduction of the HP "Snakes" computer systems underscores
a critical characteristic of modern scientific computing: the rules of
the game change *very* quickly.

It is easy to convince oneself that the most cost-effective computing
environment over the long term (which these days means "a few years")
is a heterogeneous distributed network, with low-cost hardware that is
updated incrementally. It is vital in this case to use open systems
and off-the-shelf technology.

Unfortunately, the most commonly available networking option (ethernet)
uses a broadcast approach, which is definitely sub-optimal for the
communications needs of many "natural" parallel distributed algorithms.

When I talked to my IBM salescritter about this topic, he suggested
using an additional SCSI interface as the networking option. I think
that this is a potentially important idea. It uses off-the-shelf
hardware, and only requires the writing of some (fairly simple?) SCSI
device drivers.

The system I envision would consist of some moderate number of
high-performance cheap workstations (IBM RS/6000-320 or HP/9000-720);
let's take 32 as an example.
An additional SCSI interface on each unit would provide 7 high-speed,
point-to-point interfaces, which could be reconfigured (via a patch
panel) to provide a large number of different topologies, including a
line, a 2-D mesh, a 3-D mesh, a ring, etc.

I believe that it would be easy to modify a Linda-like software system
to operate efficiently on such a system. Simply map particular named
tuple-spaces into particular device drivers so that point-to-point
communications may be executed directly. A "default" or un-named tuple
space could still be handled in a global fashion using the broadcast
network. It might be used for things like global accumulations and
synchronization messages.

A suitably designed code (for example a 3-D spectral element code for
fluid dynamics using explicit time marching techniques) should be
capable of 1 GFLOPS performance on a network of 32 IBM RS/6000-320's.
With current university discounts and third-party memory, such a system
(with 32 MB RAM per node) could be assembled for less than $500,000.
Apparently, the HP/9000-720 would be capable of slightly higher
performance at a similar cost (though I don't know about 3rd-party
memory for the HP's).

So what is wrong with this idea? Am I misunderstanding what can be
done with a SCSI interface or how hard the implementation of a buffered
FIFO would be?
--
John D. McCalpin                      mccalpin@perelandra.cms.udel.edu
Assistant Professor                   mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.    J.MCCALPIN/OMNET

--
=========================== MODERATOR ==============================
Steve Stevenson                            {steve,fpst}@hubcap.clemson.edu
Department of Computer Science,            comp.parallel
Clemson University, Clemson, SC 29634-1906 (803)656-5880.mabell
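
A rough way to sanity-check the topology claim in the post is to count
how many point-to-point links the busiest node needs in each layout and
compare that with the 7 links per node quoted there. The sketch below
does only that counting; it says nothing about whether SCSI can actually
be driven this way. The shapes chosen for 32 nodes (4x8 and 2x4x4) and
the names mesh_degrees, max_links and TOPOLOGIES are assumptions made
here for illustration.

```python
from itertools import product

LINKS_PER_NODE = 7   # figure quoted in the post for one extra SCSI interface
NODES = 32           # example cluster size used in the post


def mesh_degrees(dims, wrap=False):
    """Return {coordinate: neighbour count} for a mesh of the given shape.

    With wrap=True the edges wrap around, so a wrapped 1-D mesh is a ring.
    """
    degrees = {}
    for coord in product(*(range(d) for d in dims)):
        neighbours = set()
        for axis, size in enumerate(dims):
            for step in (-1, 1):
                pos = coord[axis] + step
                if wrap:
                    pos %= size
                elif not 0 <= pos < size:
                    continue  # off the edge of a non-wrapping mesh
                other = coord[:axis] + (pos,) + coord[axis + 1:]
                if other != coord:
                    neighbours.add(other)  # the set removes duplicates
        degrees[coord] = len(neighbours)
    return degrees


def max_links(dims, wrap=False):
    """Largest number of point-to-point links any single node needs."""
    return max(mesh_degrees(dims, wrap).values())


# Hypothetical 32-node shapes for the topologies named in the post.
TOPOLOGIES = {
    "line": ((32,), False),
    "ring": ((32,), True),
    "2-D mesh 4x8": ((4, 8), False),
    "3-D mesh 2x4x4": ((2, 4, 4), False),
}

if __name__ == "__main__":
    for name, (dims, wrap) in TOPOLOGIES.items():
        need = max_links(dims, wrap)
        verdict = "fits" if need <= LINKS_PER_NODE else "does not fit"
        print(f"{name}: up to {need} links per node, {verdict}")
```

Under these assumed shapes every layout stays within the 7-link budget.
For scale, the figures in the post work out to about 31 MFLOPS
sustained per node (1 GFLOPS / 32) and a little under $16,000 per node
($500,000 / 32).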