Path: utzoo!attcan!uunet!mailrus!cornell!ken From: ken@gvax.cs.cornell.edu (Ken Birman) Newsgroups: comp.sys.isis Subject: Re: How to find rank of server w.r.t. bcast messages Message-ID: <44174@cornell.UUCP> Date: 6 Aug 90 14:03:57 GMT References: <219@newhope.water.ca.gov> Sender: nobody@cornell.UUCP Reply-To: ken@gvax.cs.cornell.edu (Ken Birman) Organization: Cornell Univ. CS Dept, Ithaca NY Lines: 87 In article <219@newhope.water.ca.gov> rfinch@caldwr.water.ca.gov (Ralph Finch) writes: > >We wish to more or less use the parallel computing concept as outlined >in the Isis manual (the fft example), but instead of rank in the group >we wish to use rank according to the order the servers receive the >bcast message. Is this possible in an easy fashion (Isis call)? Robert Cooper points out to me that Ralph probably had something else in mind... the second interpretation is that he wants to program a group of servers pretty much in the style that the Linda system supports: bag of tasks, you add a task by multicasting your request, the first free server removes the task and executes it. This is fairly easy to support. You do the following: 1) to add a task to the "bag" a client does an abcast. All servers receive the task request in the same order and all save it into the bag. 2) when idle, a server abcasts a message "I'm idle" and awaits 1 reply 3) on receipt of an idle message, all server handle it this way: 3.a) if there is a task in the bag, they remove it. In the server that sent the "I'm idle" message a reply() is done, causing the computation to begin. (Other servers don't need to send replies, obviously). 3.b) if the bag is empty, the message is added to a queue 4) When a request is added the bag (1) and the idle-server queue is non-empty (3.b), then algorithm (3.a) is executed for the first "I'm idle" message on the queue. The code is something like this: snd_req: abcast(group, WORK_REQ, ...., 1, "answ", ....) get_req(mp) qu_append(bag, mp) if(qu_head(idle_servers)) { idlemsg = de_queue(idle_servers); server = msg)getsender(idlemsg); if(addr_ismine(server)) reply(idlemsg, "%d", msg_getid(mp)) } snd_idle: abcast(group, IM_IDLE, ...., 1, "%d", &msg_id_to_process); get_idle(mp) if(qu_head(bag)) { reqmp = de_queue(bag); server = msg_getsender(mp); if(addr_ismine(server)) reply(mp, "%d", msg_getid(reqmp)) } else qu_append(idle_servers, mp); This assumes that idle_servers and bag are queues of messages (don't forget to call msg_increfcount when queueing one and msg_delete when done with it!) and that the operations to add a message are qu_append and to dequeue (but not delete) one is de_queue. Notes: 1) This will be a little slower than a Linda implementation because it uses multicasts, but unlike the Linda version the data structures are replicated and hence the scheme is much more fault-tolerant. Need to think about whether you need this. 2) Could substitute cbcast for abcast with a token-based order publication scheme. Save this until you have reason to think your bottleneck is the use of abcast instead of cbcast. 3) Could use a coord-cohort computation to actually process messages. This is slightly tricky to do right. You would need to make a private copy of the routine "coordinator" (see clib/cl_coord.c) and modify it to let you pick the first coord. This will be the server picked in step (3.a) To use your routine, on receipt of a request the servers would a) add it to the bag, as above b) call coord_cohort_l(...., your-coordinator-choser); The choser would block until the "first coordinator" is known to it, using a condition variable; when the coordinator-choice arrives use t_sig to pass the news over to it. Then it would do what coordinator() now does, but starting from this idle server as the "prefered coordinator" If I were just tossing something together I would go with abcast's but seriously consider the coordinator choice issue. The code to do this would be pretty simple and the benefit is that you could use a coordinator-cohort scheme biased to start with the server who is idle first for each request. I can post a sample of code for this if people find it too confusing... Ken