Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!van-bc!ubc-cs!uw-beaver!cornell!ken
From: ken@gvax.cs.cornell.edu (Ken Birman)
Newsgroups: comp.sys.isis
Subject: Re: Fast causal multicast
Message-ID: <48418@cornell.UUCP>
Date: 15 Nov 90 15:07:08 GMT
References: <48414@cornell.UUCP>
Sender: nobody@cornell.UUCP
Reply-To: ken@gvax.cs.cornell.edu (Ken Birman)
Distribution: comp
Organization: Cornell Univ. CS Dept, Ithaca NY
Lines: 79

In article <48414@cornell.UUCP> scott@amethyst.omg.ORG (Scott Meyer) writes:
> [...]  To reconsider your database example, an
>implementation allowing a greater degree of synchrony, would have
>the database server send back a ticket associated with the request.
>This ticket would then be passed to other processes that might
>depend on the associated update having taken place.

Well, this approach is feasible, and the MIT group (Ladin/Liskov/Shrira)
advocates exactly what you propose here.  But if you don't realize that
the subsystem you called uses a distributed implementation, you certainly
won't realize that you are supposed to obtain and carry these tickets around.
So I think this "exposes" an aspect of the implementation.

>... the question is how to introduce just enough synchrony to make a
>fundamentally asynchronous system easy to reason about.  Excessive
>synchrony is a fundamental performance problem (von Neumann bottleneck).

I think that here we are pretty close to agreement.

>I think it's fair to say that ISIS users are as a whole considerably
>more knowledgeable about concurrent programming than are, say, PC
>applications developers or MIS programmers (the people who use the
>software that Netwise sells).....

Well, the idea in ISIS is that we provide toolkits (and utility programs,
like the network resource manager that ISIS Distributed Systems will be
selling) that pretty much can be used "stand alone".  So, naive programmers
wouldn't need to learn about how ISIS works to use them.  Currently, my
feeling is that ISIS is fairly useful this way, but that it looks a bit
complex because so many different tools and utilities are all discussed in
the same tutorial.  Split into pieces, I think a system like this could
be quite approachable by, say, a high-school educated hacker.  

But, this definitely implies that users will be combining canned subsystems
and that the subsystem can't trust the user to lug some sort of ticket
around with them.

Don't you see it as contradictory to expect users to be totally naive
but also expect them to correctly manage the ticket sent back by the
database?  In the MIT scheme (and the ISIS one), these tickets are
not such small objects -- and, the best chance for keeping the size down
is to use fairly complex compression algorithms, which is the last thing
a typical user could be asked to do...
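
To make the burden concrete, here is a rough sketch of what "lugging a
ticket around" asks of the caller.  It is only an illustration under
stated assumptions, not the MIT group's code or anything in ISIS: the
names Replica, update, query, gossip_from, merge and dominates are all
hypothetical, the ticket is assumed to be a plain list with one counter
per replica, and the gossip step is deliberately crude.

```python
from typing import List, Optional, Tuple

Ticket = List[int]  # hypothetical: one counter per replica


def merge(a: Ticket, b: Ticket) -> Ticket:
    """Combine two tickets; the caller has to do this for every ticket it holds."""
    return [max(x, y) for x, y in zip(a, b)]


def dominates(a: Ticket, b: Ticket) -> bool:
    """True if state 'a' reflects every update that ticket 'b' depends on."""
    return all(x >= y for x, y in zip(a, b))


class Replica:
    """Hypothetical replica of a one-value database."""

    def __init__(self, index: int, n: int) -> None:
        self.index = index
        self.applied: Ticket = [0] * n   # updates this replica has seen
        self.value: Optional[str] = None

    def update(self, value: str) -> Ticket:
        # Apply locally and hand the caller a ticket naming this update.
        self.applied[self.index] += 1
        self.value = value
        return list(self.applied)

    def query(self, ticket: Ticket) -> Tuple[bool, Optional[str]]:
        # Answer only if this replica has caught up to the caller's ticket.
        if not dominates(self.applied, ticket):
            return (False, None)
        return (True, self.value)

    def gossip_from(self, other: "Replica") -> None:
        # Crude stand-in for background propagation between replicas.
        if dominates(other.applied, self.applied):
            self.value = other.value
        self.applied = merge(self.applied, other.applied)


r0, r1 = Replica(0, 2), Replica(1, 2)
ticket = r0.update("x=1")
not_ready = r1.query(ticket)      # r1 has not seen the update yet
stale = r1.query([0, 0])          # caller dropped its ticket: stale answer
r1.gossip_from(r0)
fresh = r1.query(ticket)          # now r1 can answer
```

The point of the sketch is the middle case: a caller who forgets to pass
the ticket along still gets an answer, just not one that reflects the
update, and nothing in the interface warns them.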

So, as I see it, you end up with system support for maintaining causality
either way -- precisely because you don't expect your users to be
sophisticated enough about system internals to do this for you.


... too bad we don't have comp.sys.isis readers from the MIT project,
because it would be interesting to have their perspective on this.
They do have some papers on their approach, which is the one you are
advocating -- they call it "lazy replication".  The most readily
available is in the proceedings of a conference called PODC 1990 (Principles
of Distributed Computing).  One thing that stands out is that the system
model that this group uses is fundamentally different from the ISIS model.
First, they really are not working with large numbers of groups, while
ISIS is moving more and more towards applications with huge numbers of
groups.  Also, they don't allow clients to broadcast directly to servers;
only a server can initiate broadcasts.  Finally, the handling of failures
is a little different.  On the other hand, if you compare our scheme with
theirs in the case of a single group, the two are very similar.  In fact,
both derive from the same basic idea (vector times, which were developed
a long time ago, perhaps by Jerry Popek and perhaps by Keith Marzullo and
Susan Owicki).  We had toyed with the idea of using this scheme, especially
after the MIT group reported good results with it. But, ISIS uses large
numbers of overlapping groups and this forced us to do a lot of work to come
up with a version that works in ISIS.  
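
For readers who have not seen vector times, the single-group case can be
sketched roughly as follows.  This is a minimal illustration, not the
ISIS implementation and not the MIT one: it assumes one group with fixed
membership and no failures, and the names Member, multicast and receive
are made up for the example.  Each message carries the sender's vector,
and a receiver holds a message back until everything that causally
precedes it has been delivered.

```python
from typing import List, Tuple

Message = Tuple[int, List[int], str]  # (sender rank, vector time, payload)


class Member:
    """Hypothetical member of a single group with fixed membership."""

    def __init__(self, rank: int, group_size: int) -> None:
        self.rank = rank
        # vt[k] counts the multicasts from member k delivered here so far.
        self.vt: List[int] = [0] * group_size
        self.pending: List[Message] = []
        self.delivered: List[str] = []

    def multicast(self, payload: str) -> Message:
        # Count our own send, deliver locally, and stamp the message.
        self.vt[self.rank] += 1
        self.delivered.append(payload)
        return (self.rank, list(self.vt), payload)

    def _deliverable(self, sender: int, stamp: List[int]) -> bool:
        # Next message in sequence from the sender, and we have already
        # delivered everything the sender had delivered when it sent.
        return stamp[sender] == self.vt[sender] + 1 and all(
            stamp[k] <= self.vt[k] for k in range(len(stamp)) if k != sender
        )

    def receive(self, msg: Message) -> None:
        # Queue the message, then deliver whatever has become deliverable.
        self.pending.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.pending):
                sender, stamp, payload = m
                if self._deliverable(sender, stamp):
                    self.vt[sender] = stamp[sender]
                    self.delivered.append(payload)
                    self.pending.remove(m)
                    progress = True


a, b, c = Member(0, 3), Member(1, 3), Member(2, 3)
m1 = a.multicast("update")
b.receive(m1)
m2 = b.multicast("reply")      # causally after "update"
c.receive(m2)                  # arrives first at c, so it is held back
held = list(c.delivered)
c.receive(m1)                  # now both can be delivered, in causal order
```

Note that the user never touches the vectors here; the system stamps and
checks them.  With many overlapping groups a single vector per process
is no longer enough, which is where the extra work mentioned above
comes in.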

The full details of our scheme will be in Pat Stephenson's thesis, which
should be available in TR form by the end of this year (from us).  The MIT
work is reported both in Rivka Ladin's thesis and in the PODC paper.

>... when you implement this (ISIS V2.0)?

Actually, the implementation is out as part of ISIS V2.1 now.  But, it
lacks support for "bypass" communication when a process is a client of a
process group; this extension (an important one for many of our large
users) will be in ISIS V3.0, which is due out "soon".