Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!cbatt!ucbvax!sdcsvax!darrell
From: darrell@sdcsvax.UUCP
Newsgroups: mod.os
Subject: Submission for mod-os
Message-ID: <2699@sdcsvax.UCSD.EDU>
Date: Wed, 11-Feb-87 13:39:32 EST
Article-I.D.: sdcsvax.2699
Posted: Wed Feb 11 13:39:32 1987
Date-Received: Thu, 12-Feb-87 19:31:47 EST
Sender: darrell@sdcsvax.UCSD.EDU
Organization: V.U. Informatica, Amsterdam
Lines: 163
Approved: mod-os@sdcsvax.uucp

When is	an application ``distributed?''
=======================================

To answer this question, we will try to	find the borders of the
domain	of  distributed	applications.  It seems	to us that this
is better than describing an arbitrary point within the	 domain
by  summing  up	 some  properties that distributed applications
usually	have.  We will start by	naming four  extreme  examples,
each  running  on  two	processors.  We	take it	as self-evident
that there must	be at least two	processors involved in	a  dis-
tributed application.

Example	1.  One	processor calculates the odd-numbered  decimals
     of	 pi;  another the even-numbered	in parallel.  This is a
     distributed application.  Note that there is no communication
     involved, although there might be some at the end to
     merge the results.

Example	2.  We have a client process on	 one  processor	 and  a
     server process on another,	communicating using RPC.  There
     is	no parallelism,	 since	the  client  blocks  while  the
     server runs, but this is still a distributed application.

Example	3.  One	processor calculates decimals of pi; another is
     doing  a  compilation.  This is not a distributed applica-
     tion, although the	processors are running in parallel.

Example	4.  Two	unrelated processes communicate	while  contend-
     ing  for a	resource, such as a shared disk.  This is not a
     distributed application either, although there is communi-
     cation between the	processes.

The results can	be summarized as follows:

       example | distributed | parallel | communication
       --------|-------------|----------|--------------
          1    |     yes     |   yes    |      no
          2    |     yes     |   no     |      yes
          3    |     no      |   yes    |      no
          4    |     no      |  maybe   |      yes

Examples 1 and 4 show that communication is neither a necessary
nor  a	sufficient property of distributed applications.  Exam-
ples 2 and 3 show that distribution  and  parallelism  are  not
directly related either.
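
The table above can also be checked mechanically.  What follows is
a minimal Python sketch of that check, not part of the original
argument.  The record type `Example` and the helpers `is_necessary`
and `is_sufficient` are names we made up for illustration, and the
"maybe" for example 4 is encoded as None, which the helpers treat as
"not definitely present".

```python
from typing import NamedTuple, Optional


class Example(NamedTuple):
    number: int
    distributed: bool
    parallel: Optional[bool]  # None encodes the "maybe" of example 4
    communication: bool


# One row per example, copied from the table in the text.
TABLE = [
    Example(1, True, True, False),
    Example(2, True, False, True),
    Example(3, False, True, False),
    Example(4, False, None, True),
]


def is_necessary(prop: str) -> bool:
    """A property is necessary if every distributed example has it."""
    return all(getattr(e, prop) is True for e in TABLE if e.distributed)


def is_sufficient(prop: str) -> bool:
    """A property is sufficient if every example having it is distributed."""
    return all(e.distributed for e in TABLE if getattr(e, prop) is True)


for prop in ("communication", "parallel"):
    print(prop,
          "necessary:", is_necessary(prop),
          "sufficient:", is_sufficient(prop))
```

Under this encoding all four checks come out false: example 1 is
distributed without communication, example 4 communicates without
being distributed, and examples 2 and 3 do the same for parallelism.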

    Definition.  A distributed application is an application
		 carried out by two or more processors.

This leaves the terms ``application'' and ``processor'' to be
defined.  For example, a data-flow machine is a distributed
system on a low level, although on the user level it is neither
an application nor distributed.  Another example is a
uniprocessor that is simulating a distributed system.  A
distributed application running on this simulator is still a
distributed application, although in reality there is only one
processor and no communication involved at all.  We believe that
everybody has an intuitive idea of what an application is and
what a processor is, and will not try to obscure these ideas
with a formal definition.

What is	a distributed system?
=============================

We have	defined	a distributed application, but as we have  seen
in  the	 example  of  a	 distributed  application  running on a
uniprocessor, a	distributed application	does not imply	a  dis-
tributed  system.   What is a distributed system, then?	 Again,
we try to define the borders of	the  domain.   We  assume  that
there must be two or more processors.

Example	1.  There are two computers in the same	room which  are
     not  physically  connected	 in  any way.  Whether they are
     both computing parts of the same application or not,  they
     do	not make up a distributed system.

Example	2.  Two	computers are connected	by an  Ethernet.   They
     are  not  communicating  over the network,	and never have;
     still, the	possibility for	communication exists.  They are
     a distributed system.

Example 3.  Two processing units, each with its own local
     memory,  have  access  to	a  global bus and common shared
     memory.  They compute independently, but  can  communicate
     through the shared	memory.	 They are a distributed	system.

Example	4.  A computer is made up, among other components, of a
     CPU  and  a  disk drive.  The disk	drive is a slave to the
     CPU, but once it has received a request from the  process-
     ing  unit,	 it  does some processing independent of and in
     parallel with the CPU.  On the level of a systems programmer,
     this is a distributed system.  There is no real difference
     between driving a disk controller and driving a network
     device, especially if the network is reliable.  On the user's
     level, the system is not distributed.

Example	5.  A processor	is a dedicated file server.   It  is  a
     slave  to the requests of other processors	connected to it
     through a network,	but after receiving a request, it  does
     some  processing  independent  of and in parallel with the
     requesting	processor.  On the level of a systems  program-
     mer,  this	 is a distributed system.  On the user's level,
     it	is not.

               | distributed |    shared     |
       example |   system    |    memory     | communication
       --------|-------------|---------------|--------------
          1    |     no      |      no       |      no
          2    |     yes     |      no       |   possible
          3    |     yes     |      yes      |      yes
          4    |     yes     | yes (sort of) |      yes
          5    |     yes     |      no       |      yes

The  examples  suggest	the  following.	  The  possibility  for
communication  is  necessary  for  a  distributed  system.  The
method of communication	is irrelevant;  whether	the  processors
communicate  via  a network, shared memory, or device registers
(in the	case of	the disk drive), they still make up  a	distri-
buted  system.	A given	system can be distributed on one level,
and not distributed on another, higher level.  Most systems, if
not all, are distributed on some level, but hardly ever on the
user's level.

    Definition.  A distributed system consists of two or more
		 processors which have the ability to commun-
		 icate with one another.
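
The definition can be written down as a small predicate.  The
Python sketch below is only an illustration under our own
assumptions: the function name `is_distributed_system` is
hypothetical, a link is a triple (a, b, medium) naming two listed
processors, and "ability to communicate with one another" is read
as "every processor can reach every other, possibly through
intermediate ones".  The medium is carried along but deliberately
ignored, since the method of communication is irrelevant.

```python
from collections import deque


def is_distributed_system(processors, links):
    """True if there are two or more processors and all can reach each other.

    processors: iterable of processor names.
    links: iterable of (a, b, medium) triples; a and b must be listed
           in processors.  The medium is not inspected.
    """
    nodes = set(processors)
    if len(nodes) < 2:
        return False

    # Build an undirected adjacency map from the links.
    neighbours = {p: set() for p in nodes}
    for a, b, _medium in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first search from an arbitrary processor.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for q in neighbours[queue.popleft()] - seen:
            seen.add(q)
            queue.append(q)
    return seen == nodes


# Example 1: two computers in one room, not connected.
print(is_distributed_system(["A", "B"], []))
# Example 2: two computers on an Ethernet that they never use.
print(is_distributed_system(["A", "B"], [("A", "B", "ethernet")]))
# Example 4: a CPU and a disk drive talking through device registers.
print(is_distributed_system(["cpu", "disk"],
                            [("cpu", "disk", "device registers")]))
```

Note that the predicate only asks whether a link exists, not
whether it has ever been used, which matches example 2.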

What is	a distributed operating	system?
=======================================

The set	of distributed systems includes	 not  only  distributed
operating  systems, but	also distributed database systems, dis-
tributed airline reservation systems, and systems with multiple
operating systems, such as the UNIX 4.3BSD operating system.
This last system, however, is not a distributed	operating  sys-
tem,   since  it  does	not  support  distributed  applications
directly.  So what features should an operating	system for mul-
tiple  computers  have	before	it deserves the	title ``distri-
buted?''

This is	where we get into the fuzzy areas.  ``When is a	 system
an  operating  system?''  is a similar question.  A system that
simplifies disk	access in the form of  files  is  an  operating
system.	  A  system that supports multi-tasking	is an operating
system.	 A distributed operating system	must include  at  least
these  features	 in combination	with the ability to communicate
between	machines.

But this is not	enough.	 A distributed application  needs  sup-
port  for  starting  processes on different processors,	support
for communication between the processes	 independent  of  where
they  run,  load  balancing and	fault tolerance	mechanisms, and
control	 for  distributed  processes  (signaling,  etc.).    An
operating   system  that  supports  some  of  these  mechanisms
directly can be	called	distributed.   So,  although  the  UNIX
4.3BSD system could be made into a distributed operating system
by adding these mechanisms to it, as it currently exists it is
not one.

			Jennifer Steiner (jennifer@cwi.nl)
			Robbert van Renesse (cogito@cs.vu.nl)
			The Amoeba Project
			Amsterdam