Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!rpi!batcomputer!cornell!ken
From: ken@gvax.cs.cornell.edu (Ken Birman)
Newsgroups: comp.sys.isis
Subject: Re: network machines as compute servers
Message-ID: <35008@cornell.UUCP>
Date: 8 Dec 89 19:53:58 GMT
Sender: nobody@cornell.UUCP
Reply-To: ken@cs.cornell.edu (Ken Birman)
Distribution: comp
Organization: Cornell Univ. CS Dept, Ithaca NY
Lines: 50

This is a followup on my posting responding to larsa@nada.kth.se (Lars Andersson)

>Consider the following: One "master" puts "tasks" in a batch, there to be
>picked up by the "slaves" as each completes its current task. When a slave
>completes a task it puts the solution in an appropriate batch, there to be
>picked up and stored by the master (depending on the problem, one might want
>to write directly to a file ...).

>Within my limited knowledge, the only system that has something like this
>"built in" is ISIS (the NEWS service). However, it's not clear to me that this
>is correct or suited for this kind of application, or if it's the only or best
>(most portable) solution. Any comments on the above would be appreciated.

My prior message ("where lies the future") didn't respond to this narrower
question.

Using a process group, we would normally implement this Linda style of system
directly. In our ISIS manual, this technique is discussed in Chapter 8.
Essentially, we recommend that one cbcast both the requests and the slaves'
solution messages to a shared group; if you want to go further and actually
make sure that everyone knows who is doing what, a slight embellishment
suffices. The latter, with a code example, is included in the chapter on
replicated data in the recent textbook that Addison-Wesley's ACM Press produced
from the Arctic 88/Fingerlakes 89 short course. The book is called
"Distributed Computing" and is now available; the chapter is also available
from us in TR form.
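For readers who want something concrete, here is a minimal sketch of the
scheme described above, written in Python rather than against the ISIS
toolkit. Nothing in it is ISIS code: Group, Master, Worker and run_demo are
hypothetical names, and the toy group delivers every message to every member
in a single global order, which is stronger than the causal ordering cbcast
provides. The "claim" message is only one guess at what the embellishment
might look like (first claim delivered wins, so every member ends up with the
same who-is-doing-what table); it is not taken from the manual or the book.

```python
from collections import deque


class Group:
    """Toy process group: each multicast reaches every member, in one global order."""

    def __init__(self):
        self.members = []
        self.queue = deque()

    def join(self, member):
        self.members.append(member)

    def multicast(self, msg):
        # Queue the message; run() delivers queued messages in FIFO order.
        self.queue.append(msg)

    def run(self):
        while self.queue:
            msg = self.queue.popleft()
            for member in self.members:
                member.deliver(msg)


class Master:
    """Multicasts the tasks to the group and collects multicast results."""

    def __init__(self, group, tasks):
        self.results = {}
        group.join(self)
        for task_id, payload in tasks.items():
            group.multicast(("task", task_id, payload))

    def deliver(self, msg):
        if msg[0] == "result":
            _, task_id, _worker, value = msg
            self.results[task_id] = value


class Worker:
    """Sees all traffic, so it always knows which worker owns which task."""

    def __init__(self, name, group):
        self.name = name
        self.group = group
        self.pending = {}   # task_id -> payload, not yet claimed by anyone
        self.owner = {}     # task_id -> name of the worker that claimed it
        self.busy = False   # True while a claim or a task is outstanding
        group.join(self)

    def deliver(self, msg):
        kind, task_id = msg[0], msg[1]
        if kind == "task":
            self.pending[task_id] = msg[2]
        elif kind == "claim":
            claimant = msg[2]
            if task_id in self.pending:
                # First claim delivered wins (assumed rule; needs a common order).
                payload = self.pending.pop(task_id)
                self.owner[task_id] = claimant
                if claimant == self.name:
                    # Stand-in for real work: square the payload.
                    value = payload * payload
                    self.group.multicast(("result", task_id, self.name, value))
            elif claimant == self.name:
                self.busy = False  # lost the race; free to claim another task
        elif kind == "result" and msg[2] == self.name:
            self.busy = False      # own task finished; free again
        if not self.busy and self.pending:
            self.busy = True
            self.group.multicast(("claim", min(self.pending), self.name))


def run_demo():
    group = Group()
    workers = [Worker("A", group), Worker("B", group)]
    master = Master(group, {1: 1, 2: 2, 3: 3})
    group.run()
    return master.results, [w.owner for w in workers]


if __name__ == "__main__":
    results, tables = run_demo()
    print(results)
    print(tables)
```

Because every worker sees every task, claim and result, the master never has
to track assignments itself, and both workers finish with identical ownership
tables. A real group would also have to deal with members failing or joining
partway through, which this sketch ignores.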
The basic idea is the same: multicast the request and multicast what the
slaves do.

Dave George of Cornell's graphics group has a neat variation on this in a
real application that does scene rendering; he has a single process that
farms out the work, which gives a fancier and more efficient, but not
fault-tolerant, solution. Unfortunately, he doesn't get this newsgroup and
can only be contacted by email (dwg@graphics.cornell.edu).

The NEWS service implements this same mechanism and can be used if you prefer
its higher-level interface. The advantage of NEWS is that it works even if
the processes come and go without all being up at the same time. The
disadvantage is that this imposes some overhead, because the requests get
flushed to a disk file that NEWS uses to maintain its persistent state.

Let me know if you are still unclear on this and I will be happy to post
something more detailed...

Ken

(The TR version of that chapter is called "Exploiting replication" and is
available on request from croft@cs.cornell.edu. It assumes that you know
something about CBCAST and ABCAST.)