Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!uunet!stan!squazmo!stan
From: stan@squazmo.solbourne.com (Stan Hanks)
Newsgroups: comp.sys.isis
Subject: Re: ISIS "homework" problem
Message-ID: <1989Dec28.173847.11878@squazmo.solbourne.com>
Date: 28 Dec 89 17:38:47 GMT
Reply-To: stan@squazmo.solbourne.com (Stan Hanks)
Organization: Solbourne Computer Inc., Houston NSE Outpost and Sales-Slug Haven
Lines: 141

>> Hmmm. Interesting. My perception is that we have two basic hurdles to 
>> overcome in the '90s: effective use of high speed networks, and the
>> fact that those of us (me included) who thought the future looked like
>> lots of machines coupled by message passing networks were partly 
>> wrong. It's starting to look like (to me, anyway -- and you can probably
>> find others who agree) we're going to see a partitioning of the
>> classes of computer available Real Soon. On the one hand, we're gonna
>> have small cheap desktops a la the SPARCstation but cheaper (we'll see
>> 20 MIP systems with 16 MB memory and 300 MB disk under $7k by the end
>> of 1990 -- who knows what '91 looks like!). On the other hand, we're
>> gonna have real workstations, which will be shared memory multiprocessor
>> boxes, likely to be running some MACH, SunOS, or SystemV.4 variant.
>
>Yes, I'll buy this...  although I would add "massive servers" to the
>picture (lots of them).

Oh yeah.... Gotta have those. And the things coming down the pike
from all the major vendors are looking better and better.  Plus 
stuff like the Legato NFS accelerator, and some of the RAID technology....

>> And with the real high speed networks coming soon, I expect that we're
>> going to find ourselves looking for a model which lets us treat all
>> IPC as memory accesses (sort of like the CMU/IBM MEMNET stuff) but
>> in a manner that really works. I really expect the point-to-point
>> data reliability to happen at the hardware level exclusively by sometime
>> in the early '90s.
>
>This, I don't buy.  Problem is that you are concealing the "fact" of
>physical distribution, which many applications need to know about.
>For example, your view rules out a large class of control applications
>that need to know about "local" (==realtime response) and remote (slow
>but knowledgeable).  

Yeah, but there is also a much broader class of problems that neither
knows nor cares about locality issues. Like most of the application programs
that people use. And almost all commercial applications. I've always
viewed control, real-time, and mission-critical fault tolerant applications
as basically "special" -- we should consider them when designing things, but
we should design special things to accommodate them rather than fitting
accommodations for them into more general purpose things.

>Also, this approach is very weak for fault-tolerant applications.
>It's easy to recover when an RPC fails; hard to deal with a chunk of
>memory suddenly getting unmapped.

True, but we manage to handle page faults today -- I view this as sort
of a "network page fault".
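
As a rough illustration of the "network page fault" idea, here is a minimal
sketch in Python. Everything in it is an assumption made for illustration:
the names RemoteMemory, NetworkPageFault and fetch_page are hypothetical, and
a real system would do this in the VM layer, not in application code. The
point is only that a miss on a remote page looks like an ordinary fault until
the owning node cannot be reached, at which point the fault surfaces as an
error the program has to handle.

```python
class NetworkPageFault(Exception):
    """Raised when the node holding a page cannot be reached (hypothetical)."""


class RemoteMemory:
    """Toy address space whose pages live on other nodes until first touched."""

    PAGE_SIZE = 4  # words per page; tiny on purpose

    def __init__(self, fetch_page):
        # fetch_page is an assumed callback: page number -> list of words.
        # It is expected to raise ConnectionError if the owner is unreachable.
        self.fetch_page = fetch_page
        self.resident = {}  # pages already pulled across the network

    def read(self, address):
        page_no, offset = divmod(address, self.PAGE_SIZE)
        if page_no not in self.resident:
            # The "network page fault": pull the page in on first access.
            try:
                self.resident[page_no] = self.fetch_page(page_no)
            except ConnectionError as err:
                # The failure case: the memory is effectively unmapped.
                raise NetworkPageFault(page_no) from err
        return self.resident[page_no][offset]
```

A caller that cares about failures wraps read() in a try/except for
NetworkPageFault; a caller that does not care simply never sees the
difference between local and remote pages.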

>My feeling is that the interesting applications would rather have
>powerful but visible tools...

Depends on where you draw the "interesting" line -- my primary interest
has usually been trying to gain maximal use of network resources for
"traditional" computing. If we start looking specifically at real-time
and the like, then yes, you're right.

>(You might want to post this whole mail, plus your response; could
>make quite an interesting comp.sys.isis discussion if anyone follows
>up on it!)

Challenge accepted!

>> And you're right: scalability is a growing concern. As is operations over
>> distant and slow networks. The ISIS view of the world as computational
>> nodes connected by networks does real well for small numbers of nodes
>> connected by local networks; maybe some sort of paradigm that lets you
>> lump nodes into a meta-node (i.e. site? lab? etc.) connected by slower
>> networks would work to get you over that hurdle? Hmmmm. Note also that
>> if you take this sort of view, you can accommodate multiprocessors as
>> sort of a micro-meta-node where it has computational units connected
>> by very high speed network (shared memory). Not having thought about it
>> more than just to write this stuff down, it looks pretty elegant. I guess
>> I need to go off and push some chalk around the room for a while and
>> think about it some more....
>
>ISIS is moving towards hierarchical structures for just this reason.
>ISIS services would tend to have 2-3 processes per "active subgroup",
>perhaps a big envelope around the whole bunch per LAN, and inter-LAN
>tools for building WAN services.  We are close to having this now; the
>commercial ISIS (mid 1990) will include such a structuring facility.
>
>And, the interesting thing is that it stays pretty simple to program;
>structure doesn't always imply complexity.  

Hey, that's great. I wish more people would realize that the simplest
viable solution is oftentimes the most desirable.
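
The meta-node idea quoted above can be sketched as a simple tree. This is an
illustration only, not how ISIS structures its groups: the Node class, the
link labels and the connecting_link helper are all hypothetical. Leaves are
computational units, each meta-node records the kind of link that joins its
children, and the link two units use is the one at their nearest common
meta-node.

```python
class Node:
    """A computational unit (leaf) or a meta-node grouping other nodes."""

    def __init__(self, name, link=None, children=()):
        self.name = name
        self.link = link  # how this node's children talk to each other
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return the node followed by each enclosing meta-node, innermost first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def connecting_link(a, b):
    """Return the link of the nearest meta-node that contains both a and b."""
    seen = {id(n) for n in ancestors(a)}
    for n in ancestors(b):
        if id(n) in seen:
            return n.link
    raise ValueError("nodes are not in the same hierarchy")


# A multiprocessor is a "micro-meta-node": CPUs joined by shared memory.
cpu0, cpu1 = Node("cpu0"), Node("cpu1")
multi = Node("multiprocessor", "shared memory", [cpu0, cpu1])
desktop = Node("desktop")
lab = Node("lab", "LAN", [multi, desktop])
remote = Node("remote-desktop")
other_lab = Node("other-lab", "LAN", [remote])
world = Node("world", "WAN", [lab, other_lab])
```

With this layout, connecting_link(cpu0, cpu1) gives the shared-memory link,
connecting_link(cpu0, desktop) gives the LAN, and connecting_link(cpu0,
remote) gives the WAN, so the same lookup covers all three scales.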

>I haven't looked much at multiprocessors but we are starting to think
>we should.  I somewhat doubt that you would want to use process groups
>internally on such machines, but who knows...

I get asked about that all the time. We have folks who really want to 
put MACH or V up on our box (not the same folks, BTW) to play with stuff
like that. From what I've looked at, it seems that what you'd get is
(maybe) *real* concurrency for the various processes in the "active
subgroup" (or thread or team or...) plus the advantages of shared memory
between the components, which would let you address a whole host of 
interesting problems that you can't address today.

>> BTW, if you're interested in fault tolerance, you need to snarf David B.
>> Johnson's dissertation from Rice. He did some excellent work on fault 
>> tolerance in message passing environments, even to the point of coming 
>> up with sort of a calculus for reasoning about tolerance requirements.
>> It should be available real soon -- he defended in October, but just 
>> recently got the official copy over to the dean's office. His address
>> is "dbj@rice.edu" in case you need it.
>
>As I mentioned, I've read several drafts of the paper on this.  Not
>bad stuff, but there has been a lot of similar work (Borg's Auragen
>system, Toueg&Koo checkpointing mechanism) and this stuff has many
>limitations (determinism, no lightweight threads; only tolerates a
>single failure), plus it seems to deadlock under some conditions.
>An old copy of ISIS did something called "retained results" with
>similar limitations; we don't do this anymore because it seems to
>have been a so-so idea...  (But, for what it's worth, I do think
>the Johnson/Zwaenepoel paper is better than any other paper on this
>type of message logging, mostly because of the performance figures)
>
>I haven't seen the calculus, though.  I'll ask for a copy of the
>thesis.  My comments relate to "sender based message logging".

Right. Same stuff. He added a whole lot of work to prove that, for the 
cases he was considering, his solutions were necessary and sufficient
to guarantee recovery. But good old "Mr. Meta-Problem" Dave went off
and developed what seems to be an excellent basis for reasoning about
fault tolerance in any distributed environment in order to accomplish this.
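
For readers who have not seen sender-based message logging, the basic idea
can be sketched in a few lines of Python. This is a toy under loud
assumptions and not the protocol from the paper or the dissertation: the
Process class and recover function are hypothetical, delivery is a direct
method call, there are no checkpoints, and it relies on exactly the
limitations mentioned above (a deterministic receiver and a single failure
at a time). Each sender keeps the messages it sent, tagged with the receive
sequence number (RSN) the receiver assigned, so a failed receiver can be
rebuilt by replaying the survivors' logs in RSN order.

```python
class Process:
    """Toy process that logs outgoing messages in its own memory."""

    def __init__(self, name):
        self.name = name
        self.send_log = []  # (receiver name, rsn, payload), kept by the SENDER
        self.next_rsn = 0   # next receive sequence number to hand out
        self.state = []     # application state: payloads in delivery order

    def send(self, receiver, payload):
        # The receiver returns the RSN it assigned; the sender logs it.
        rsn = receiver.deliver(payload)
        self.send_log.append((receiver.name, rsn, payload))

    def deliver(self, payload):
        rsn = self.next_rsn
        self.next_rsn += 1
        self.state.append(payload)  # deterministic: state depends only on order
        return rsn


def recover(failed_name, survivors):
    """Rebuild a failed process by replaying the survivors' logs in RSN order."""
    replacement = Process(failed_name)
    logged = [(rsn, payload)
              for proc in survivors
              for (dest, rsn, payload) in proc.send_log
              if dest == failed_name]
    for _, payload in sorted(logged, key=lambda entry: entry[0]):
        replacement.deliver(payload)
    return replacement
```

If two processes fail together, the log entries held by one of them are
gone, which is where the single-failure restriction comes from in this toy.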

I'll be interested to hear what sort of responses people have to all this.
And, of course, real interested to see how ISIS works on one of our 
multiprocessors.

BTW, do you have ISIS for MACH yet? For what I'm looking at, it would 
give finer granularity than using OS/MP (our regular multiprocessor
version of SunOS).

Regards,

-- 
Stanley P. Hanks   Science Advisor                    Solbourne Computer, Inc.
Phone:             Corporate: (303) 772-3400           Houston: (713) 964-6705
E-mail:            ...!{boulder,sun,uunet}!stan!stan        stan@solbourne.com