Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!purdue!gatech!hubcap!baden
From: baden@lbl-csam.arpa (Scott Baden [CSR/Math])
Newsgroups: comp.parallel
Subject: Re: Multiprocessing on Suns
Message-ID: <6670@hubcap.clemson.edu>
Date: 4 Oct 89 13:01:40 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 317
Approved: parallel@hubcap.clemson.edu

I collected quite a few responses to my query on how to connect
an ensemble of Suns into a multiprocessing team.  Here is a summary
of what I received.  (It took some time for the mail to percolate
overseas and back, which is the reason for the delay in my reply.)
Some entries are undoubtedly incomplete.  Corrections and additions
are appreciated.  Thanks to all those who contributed!

Scott B. Baden
Lawrence Berkeley Laboratory
Berkeley, California
baden@csam.lbl.gov
...!ucbvax!csam.lbl.gov!baden

I heard about 9 different projects:

    1. ISIS (Cornell)
    2. Cosmic Environment (Caltech)
    3. DOMINO (U. Maryland, College Park)
    4. DPUP (U. of Colorado)
    5. TORiS (Toronto)
    6. LINDA (Yale, Scientific Computing Associates)
    7. SR (U. Arizona)
    8. MAITRD (U.C. Berkeley/U. Wash)
    9. PARMACS (Argonne)


1. ISIS

ISIS is billed as "a toolkit for distributed and fault-tolerant
programming."  It runs on "UNIX on SUN, DEC, GOULD, and HP systems,
although ports to other UNIX-like systems are planned for the future."
The manual is 300 pages long.

If you want ISIS, send mail to "croft@gvax.cs.cornell.edu,"
subject "I want ISIS".


2. COSMIC ENVIRONMENT

"The Cosmic Environment (CE) is a generic message-passing
multicomputer control environment ... The goal of CE is to provide
a simple and uniform interface for multicomputers and to allow for
the writing of truly portable application programs.  CE is a
distributed environment directly accessible from any UNIX machine
connected to the same TCP/IP network."
"The CE currently supports the following machines: 1) iPSC/1 2) iPSC/2 3) Symult (formerly Ametek) 2010 4) The Cosmic Cube 5) A set of NFS connected work stations pretending to be a real concurrent machine. Most people use SUN work stations. People around here call such a cube a "ghost cube"." You can obtain a programming guide by sending e-mail to chuck@vlsi.caltech.edu, or postal mail to: Charles L. Seitz CS 256-80, Caltech Pasadena, Ca 91125 3. DOMINO DOMINO is a message passing environment for parallel computation. See the Computer Science Dept. (U. Maryland) tech report # TR-1648 (April, 1986) by D. P. O'Leary, G. W. Stewart, and R. A. van de Geijn. I quote: "DOMINO is a set of C-language routines with a short assembly language interface that allows multiple tasks to communicate and schedule local tasks for execution. These tasks may be on a single processor or spread among multiple processors connected by a message-passing network." You can get a copy of domino from netlib; to get instructions send mail to one of: na.netlib@na-net.stanford.edu netlib netlib@research.att.com with subject "send index for domino" (or you can put this in the message-body.) The members of the DOMINO project can be reached through ARPANET or NANET at the following addresses. oleary@mimsy.umd.edu na.oleary@su-score.arpa stewart@mimsy.umd.edu na.pstewart@su-score.arpa rvdg@mimsy.umd.edu 4. DPUP DPUP stands for Distributed Processing Utilities Package. What follows is an abstract from a technical report written at the Computer Science Dept. at the University of Colorado by T. J. Garner, et. al "DPUP is a library of utilities that support distributed concurrent computing on a local area network of computers. The library is built upon the interprocess communication facilities in Berkeley Unix 4.2BSD." 5. TORiS TORis implements a shared memory communication model. 
Contact Orran Krieger at the University of Toronto for more
information:

    UUCP:   {decvax,ihnp4,linus,utzoo,uw-beaver}!utcsri!eecg!okrieg
    ARPA:   okrieg%eecg.toronto.edu@relay.cs.net
    CSNET:  okrieg@eecg.toronto.edu
    CDNNET: okrieg@eecg.toronto.cdn


6. LINDA

Linda is a parallel programming language for shared memory
implementations.  It is simple and has only six operators.
C-linda has been implemented for a network of SUNs in the internet
domain.

With LAN-LINDA (also called TSnet) you can write parallel or
distributed programs in C and run them on a network of workstations.
TSnet has been tested on Sun and IBM RT workstations.

Contact David Gelernter (project head) or Mauricio Arango at:

    gelernter@cs.yale.edu
    arango@cs.yale.edu

TSnet and other Linda systems are being distributed through
Scientific Computing Associates.  Contact

    Dennis Philbin
    Scientific Computing Associates
    246 Church St., Suite 307
    New Haven, CT 06510
    203-777-7442


7. SR

I quote:

"SR (Synchronizing Resources) is designed for writing distributed
programs.  The main language constructs are resources and
operations.  Resources encapsulate processes and variables they
share; operations provide the primary mechanism for process
interaction.  SR provides a novel integration of the mechanisms for
invoking and servicing operations.  Consequently, all of local and
remote procedure call, rendezvous, message passing, dynamic process
creation, multicast, and semaphores are supported.  An overview of
the language and implementation appeared in the January, 1988,
issue of TOPLAS (ACM Transactions on Programming Languages and
Systems 10,1, 51-86).

SR runs on various machines including (among others):
Vax, Sun, NeXT, and Multimax Encore.

"An SR program runs on one or more networked machines of the same
architecture.

"SR is available by anonymous FTP from Arizona.EDU
(128.196.128.118 or 192.12.69.1).
[Copy over the README file for an explanation.]
You may reach the members of the SR project electronically at:

    uunet!Arizona!sr-project

or by surface mail at:

    SR Project
    Department of Computer Science
    University of Arizona
    Tucson, AZ  85721
    (602) 621-2018


8. MAITRD

"The maitr'd software is [a] remote process server that is designed
to farm out cpu expensive jobs to less loaded machines.  It has a
small amount of built-in intelligence, in that it attempts to send
jobs to the least loaded machine of the set which is accepting
off-site jobs."

`Maitrd' is available via anonymous ftp from
june.cs.washington.edu (128.95.1.4) as ~ftp/pub/Maitrd.tar.Z.
There is also a heterogeneous systems rpc package `hrpc.tar.Z'.
Contact Brian Bershad at U. Washington (brian@june.cs.washington.edu)
for more information.

A paper showed up in a Usenix newsletter in early 1986:
"Load Balancing With Maitrd"
(This is also a U.C. Berkeley C.S. Division Technical report.)


9. PARMACS

David Levine at Argonne National Laboratory tells us about a
"generic package to do send/recv message passing" with
"different versions (c, c++, fortran) [that] work on different
machines."

To get the package index, send email to netlib@mcs.anl.gov
with subject (or body) ``send index from parmacs''.
For more information, contact levine@mcs.anl.gov,
or by uucp: {alliant,sequent,rogue}!anlams!levine.


10. OTHER REFERENCES

========================================================================
Tom Slezak (slezak@lll-lcc.llnl.gov) wrote an article called
"Quick and dirty parallel processing on a network of workstations."
========================================================================
Bart Miller at U. Wisconsin, Madison (bart@cs.wisc.edu) has a package
written "several years ago for connecting processes together to send
messages....  The processes can be on the same machine, or different
machines (it doesn't matter)."
========================================================================
Try the book "Portable Programs for Parallel Processors" by
James Boyle, Ralph Butler, Terrence Disz, Barnett Glickfeld,
Ewing Lusk, Ross Overbeek, James Patterson, and Rick Stevens;
ISBN 0-03-014153-2.

From the preface:

"This book describes a set of tools [written in C] that were
developed at Argonne National Laboratory to enable us to explore
issues of program performance and portability on a fairly broad
range of parallel machines."
...
"The original set of tools was developed only for shared-memory
machines.  Later, we added the tools to support message passing
among machines that do not have shared memory."
========================================================================
At New Mexico State University there is "a TCP Remote Message
Passing Service which runs on [a] network of Suns."  A daemon is
used to control message passing activity.  This software has been
in use for "a couple of years."

Contact:

    Cari Soderlund
    Computing Research Laboratory
    New Mexico State University
    Box 30001
    Las Cruces, NM  88003
    cari@nmsu.edu
========================================================================
R. Kannan in the Concurrent Engineering Research Center at
West Virginia University (kannan@cerc.wvu.wvnet.edu) reports on the
development of a tool to handle message passing activity in a
network transparent fashion.
He also mentions a shared memory system called
"User Level Shared variables," and gives us a contact:

    Don Libes
    Factory Automation Systems Division
    NIST (previously NBS)
    Gaithersburg, MD 20899
========================================================================
DPSK, DISTRIBUTED PROBLEM SOLVER KERNEL

Contact Gregg Podnar (gwp@edrc.cmu.edu) at the
Engineering Design Research Center at CMU.
========================================================================
simon%castle.edinburgh.ac.uk@NSFnet-Relay.AC.UK of
Meiko Scientific Ltd in Edinburgh, Scotland replies:

"Meiko's CS-Tools package handles message passing in a heterogeneous
network of Suns and Meiko transputer boxes.  It's generally used as
a vehicle for accessing transputer power, but can run simply on a
sun network.

Meiko can be contacted in North America at:

    Meiko Scientific
    Reservoir Place, 1601 Trapelo Road
    Waltham MA 02154
    Phone: 617 890 7676
========================================================================
Kevin Hammond at the Univ. of Glasgow
(kh%cs.glasgow.ac.uk@NSFnet-Relay.AC.UK) has some examples of
socket/RPC code that could help novices.  (Kevin reports some
difficulties with the Sun documentation.)
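
For readers new to the socket layer that several of the packages
above sit on (DPUP, for one, is described as built on the 4.2BSD
interprocess communication facilities), here is a rough sketch of
send/recv-style message passing over a stream socket.  It is not
taken from any of the packages listed, and it is in Python rather
than C purely for brevity.  The names send_msg, recv_msg and demo,
and the 4-byte length prefix, are my own assumptions for
illustration.  A stream socket has no message boundaries, so each
message is framed with its length.

```python
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock) -> bytes:
    """Receive one message framed by send_msg."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)


def demo():
    """Pass three small messages across a connected socket pair."""
    # socketpair() stands in for a TCP connection between two hosts;
    # the framing code is the same either way.
    a, b = socket.socketpair()
    try:
        for text in (b"hello", b"", b"world"):
            send_msg(a, text)
        return [recv_msg(b) for _ in range(3)]
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

To run between two machines, one side would replace the socket pair
with a listening TCP socket and the other with a connect call;
send_msg and recv_msg would not change.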