Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!ken
From: ken@gvax.cs.cornell.edu (Ken Birman)
Newsgroups: comp.sys.isis
Subject: Re: ISIS "homework" problem
Message-ID: <35557@cornell.UUCP>
Date: 29 Dec 89 01:57:30 GMT
References: <1989Dec28.173847.11878@squazmo.solbourne.com>
Sender: nobody@cornell.UUCP
Reply-To: ken@gvax.cs.cornell.edu (Ken Birman)
Organization: Cornell Univ. CS Dept, Ithaca NY
Lines: 115

(As you have probably figured out, Stan and I were discussing the
"technology requirements" for systems built using gigabit lines and
other hot hardware... I basically argued that this push to greater
speed is creating more of a need for ISIS-like function; Stan, as you
will have gathered, is more of a point-to-point person and hence
skeptical of the need for ISIS group-style complexity...)

In article <1989Dec28.173847.11878@squazmo.solbourne.com> stan@squazmo.solbourne.com (Stan Hanks) writes:
>...
>>S And with the real high speed networks coming soon, I expect that we're
>>S going to find ourselves looking for a model which lets us treat all
>>S IPC as memory accesses (sort of like the CMU/IBM MEMNET stuff) but
>>S in a manner that really works. I really expect the point-to-point
>>S data reliability to happen at the hardware level exclusively by sometime
>>S in the early '90s.
>K>
>K>This, I don't buy. Problem is that you are concealing the "fact" of
>K>physical distribution, which many applications need to know about.
>K>For example, your view rules out a large class of control applications
>K>that need to know about "local" (==realtime response) and remote (slow
>K>but knowledgeable).
>
>S>Yeah, but there is also a much broader class of problems that neither
>S>know nor care about locality issues. Like most of the application programs
>S>that people use. And almost all commercial applications.
>S>I've always viewed control, real-time, and mission-critical fault
>S>tolerant applications as basically "special" -- we should consider them
>S>when designing things, but we should design special things to
>S>accommodate them rather than fitting accommodations for them into more
>S>general purpose things.

I guess I buy this for some applications but I think you are arguing
an untenable point: namely, that there really isn't anything in a
distributed system (now or anytime soon) that needs to be "controlled".
If you equate control with, say, factory floor control, sure, there is
a lot of commercial stuff that doesn't need much controlling. But,
there is a larger and larger collection of stand-alone services out
there that need to control themselves and be highly available. E.g.,
your average commercial outpost in Houston selling access to a
proprietary database on Texas geophysics or whatever. This system may
well be spread over many nodes and will want high availability. And,
it needs to control load to avoid thrashing just because a few too
many queries came in at once. I view this as a distributed control
problem, too. And, I think that existing technology hasn't given us
much of a handle on designing these kinds of self-maintaining servers
or systems.

So, I see ISIS as providing the "glue" that holds together a system
that might well offer its clients a very vanilla RPC interface...

>K>Also, this approach is very weak for fault-tolerant applications.
>K>It's easy to recover when an RPC fails; hard to deal with a chunk of
>K>memory suddenly getting unmapped.
>
>S>True, but we manage to handle page faults today -- I view this as sort
>S>of a "network page fault".
>
>K>>My feeling is that the interesting applications would rather have
>K>>powerful but visible tools...
>
>K>Depends on where you draw the "interesting" line -- my primary interest
>K>has usually been trying to gain maximal use of network resources for
>K>"traditional" computing.
>K>If we start looking specifically at real-time
>K>and the like, then yes, you're right.

I don't buy the "fault tolerance is just a page fault problem" line;
I see little evidence that anyone has come up with systems able to
reconfigure this gracefully. Page faults are really easy to deal
with -- just fetch the page. Failures are more of a mess: you may need
to clean up, restart programs, reattach programs to the new servers,
etc... This is why we tend to favor services that have 2 or 3
processes cooperating and where you expect a reply from "anyone" and
not some specific process...

>>S> And you're right: scalability is a growing concern....

Well, glad we agree on something!

>S>I get asked about [multiprocessors] all the time. We have folks who
>S>really want to put MACH or V up on our box...
>S>I'll be interested to hear what sort of responses people have to all this.
>S>And, of course, real interested to see how ISIS works on one of our
>S>multiprocessors.
>
>S>BTW, do you have ISIS for MACH yet? For what I'm looking at, it would
>S>give finer granularity than using OS/MP (our regular multiprocessor
>S>version of SunOS).

ISIS seems fine on MACH. I'm planning to test it under the forthcoming
Mt. Xinu MACH release next week, so it should be up and solid on their
Beta tape. This will be ISIS V1.3.1, but V2.0 will also get checked
out on their system and will be available both from Cornell and,
later, on Mt. Xinu's 2.1 release when that occurs.

Since MACH and OSF have lately become engaged, a few people asked what
came of the ISIS submission under the OSF DE RFT. (How's that for
acronyms?) Basically, OSF has ended up focusing on a lower level of
the environment -- things like clock and name servers, RPC data
encoding, and the file system. OSF seems to have decided to defer a
decision on how (or whether) to include ISIS in their world until
after these urgent short-term questions are settled.
They did this by putting ISIS into a technology category for
submissions of possible interest to them (so they won't say "no") but
inappropriate for the DE part of OSF/2 (so they won't say "yes").
However, if OSF/2 is really MACH based, ISIS should be able to run on
it. And, I don't expect that OSF/2 will offer some competing
technology -- I know enough about the RFT technology submissions to
say that ISIS is aimed in a very different direction. For example, at
least half a dozen submissions were concerned with linking UNIX to
PC's running OS/2....
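
A footnote on the earlier point about favoring services where 2 or 3
processes cooperate and the client accepts a reply from "anyone": the
idea can be sketched roughly as below. This is not ISIS code and does
not use any ISIS interface; it is a minimal Python illustration in
which each replica is just a callable, and the names first_reply,
crashed, slow and fast are all hypothetical.

```python
import concurrent.futures as cf
import time


def first_reply(replicas, request, timeout=1.0):
    """Send `request` to every replica and return the first successful reply.

    `replicas` is a list of callables standing in for cooperating server
    processes (an assumption made for illustration only). A replica that
    raises is treated as failed and simply ignored.
    """
    if not replicas:
        raise RuntimeError("no replicas to ask")
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    try:
        futures = [pool.submit(replica, request) for replica in replicas]
        try:
            # Futures are yielded as they finish, in completion order.
            for fut in cf.as_completed(futures, timeout=timeout):
                if fut.exception() is None:
                    return fut.result()
        except cf.TimeoutError:
            pass  # nobody answered in time; fall through to the error below
        raise RuntimeError("no replica replied")
    finally:
        # Do not wait for stragglers; their replies are no longer needed.
        pool.shutdown(wait=False)


# Hypothetical replicas: one has failed, one is slow, one is healthy.
def crashed(request):
    raise ConnectionError("replica down")


def slow(request):
    time.sleep(0.2)
    return "slow:" + request


def fast(request):
    return "fast:" + request


if __name__ == "__main__":
    # The crashed replica is skipped and the surviving one answers.
    print(first_reply([crashed, slow], "query"))  # prints "slow:query"
```

The caller never names a specific server, so one failed process costs
nothing beyond an ignored exception. Compare that with the cleanup,
restart and reattach work needed when a client is bound to a single
server that dies.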