Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!uwm.edu!rutgers!mit-eddie!uw-beaver!cornell!ken
From: ken@CS.Cornell.EDU (Ken Birman)
Newsgroups: comp.sys.isis
Subject: Re: looking for real-life examples of isis-groups
Keywords: overlapping groups, semantics, causality
Message-ID: <1991May13.151207.764@cs.cornell.edu>
Date: 13 May 91 15:12:07 GMT
References: <1991May12.174040.24185@cs.cornell.edu>
Sender: news@cs.cornell.edu (USENET news user)
Distribution: comp
Organization: Cornell Univ. CS Dept, Ithaca NY 14853
Lines: 39
Nntp-Posting-Host: fafnir.cs.cornell.edu

In article <1991May12.174040.24185@cs.cornell.edu> ken@CS.Cornell.EDU (Ken Birman) writes:
>For example, two replicated writes might be seen in inconsistent orders
>by members of a group or a coordinator-cohort application might simply hang.

I got email from two people asking me to "clarify" this point.

The problem is that in ISIS, several parts of the system use asynchronous
cbcast to update replicated data. In fact, our manual recommends that you
do this, too.

Say that group G={a,b} and that process 'a' uses this approach. Now, some
process p does an operation on G and 'a' sends back an answer. p is not
in G. p now does another operation on G and 'b' receives it.

It is easy to imagine that 'a' sent back information that reflected the
update, but if CBCAST's causal ordering is not preserved "outside" group
boundaries, 'b' may not have seen that update -- in fact, 'b' may not have
seen any of the events in this whole sequence.

In general, the reason that ISIS worries about causality "between multiple
groups" is not that problems develop in group G' because of something that
happened in group G, but rather that if group G is very asynchronous and
one interacts with several of its members, they may not have seen critical
"past" events, if causality is not preserved.

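The G={a,b} scenario above can be sketched as a toy simulation. This is not ISIS code and uses no ISIS calls; Member, run, preserve_causality, in_flight and the update id "u1" are all hypothetical names made up for illustration. It assumes a single update and models "preserving causality outside the group" as p carrying along the set of updates that its last answer depended on, so that 'b' holds p's request until those updates have arrived.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One member of group G holding a copy of the replicated value."""
    name: str
    value: int = 0
    applied: set = field(default_factory=set)  # ids of updates applied so far

    def apply(self, update_id, new_value):
        self.value = new_value
        self.applied.add(update_id)

    def can_deliver(self, context):
        # Causal delivery rule (illustrative): every update the sender had
        # observed must already be applied at this member.
        return context <= self.applied


def run(preserve_causality):
    a, b = Member("a"), Member("b")
    in_flight = []  # asynchronous messages from a that have not reached b

    # p's first operation reaches a. a updates its copy and multicasts the
    # update asynchronously, without waiting for b to receive it.
    a.apply("u1", 1)
    in_flight.append(("u1", 1))

    # a answers p. The answer reflects u1, so p now causally depends on u1.
    first_answer = a.value
    p_context = {"u1"} if preserve_causality else set()

    # p's second operation goes to b, racing with the in-flight update.
    # With an empty context b answers at once; otherwise b waits for u1.
    while not b.can_deliver(p_context):
        update_id, new_value = in_flight.pop(0)
        b.apply(update_id, new_value)
    second_answer = b.value
    return first_answer, second_answer


if __name__ == "__main__":
    print("causality not preserved outside G:", run(False))
    print("causality preserved outside G:    ", run(True))
```

In the first run p gets 1 from 'a' and then 0 from 'b', which is the race described above; in the second run 'b' applies the update before answering, so p sees 1 both times.
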
Thus, we need to preserve causality "over group boundaries" to avoid a
problem that arises entirely within a single group -- the group got itself
into trouble by using asynchronous cbcast, and the group will see the
problem, in the form of a race condition.

In my mind, the key question is whether asynchronous CBCAST is such a big
win over a synchronous protocol, like ABCAST. Obviously, I personally
believe the answer is that it is... and I expect our new system to prove
this. Even in the current version of ISIS, asynchronous CBCAST to small
numbers of destinations is much faster than ABCAST.

So, this gets back to our basic claim: if you use asynchronous CBCAST (or
ABCAST, for that matter), problems can arise unless causality is maintained
at all times -- inside or outside of the group where you took this action.

--
Kenneth P. Birman                            E-mail: ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science   TEL: 607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)    FAX: 607 255-4428