Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!emory!hubcap!fpst
From: pratt@cs.stanford.edu
Newsgroups: comp.parallel
Subject: Re: New i860 parallel machine
Message-ID: <1991Apr22.123037.2375@hubcap.clemson.edu>
Date: 21 Apr 91 17:47:06 GMT
Sender: fpst@hubcap.clemson.edu (Steve Stevenson)
Organization: Clemson University
Lines: 128
Approved: parallel@hubcap.clemson.edu
In-Reply-To: Your message of Fri, 19 Apr 91 12:44:09 -0400.
	     <9104191644.AA17015@hubcap.clemson.edu>
To: parallel@coraki.Stanford.EDU

In article <9104191644.AA17015@hubcap.clemson.edu>
hugo@griggs.dartmouth.edu (Peter Su) writes:

>One of the assumptions behind this statement is that concurrency as it
>is defined by computer science (i.e. The interaction of independent
>processes executing programs asynchronously, posets, partial orders
>among events in time, temporal logic, whatever) is relavent to the
>construction of parallel machines and algorithms for computational
>problems.

>As has been noted before (see IEEE Software, Sept. 1990), this
>definition of concurrency, and the formal models surrounding it are
>all motivated in some sense by the design and implementation of
>operating systems.  One could argue that they are largely irrelevant
>to the design and implementation of parallel algorithms.

Denying a connection between operating systems and parallel algorithms
is like denying a connection between planetary motion and gyroscope
behavior.  If your account of planetary motion is in terms of Ptolemaic
epicycles then there is no connection.  But if your account is in terms
of Newton's laws then there is a very beautiful connection.

The "epicycle theory" of operating systems is the view of
asynchronously communicating sequential processes mediated by buffered
channels.  This view sheds little light on parallel algorithms other
than those that clearly incorporate such channels.  In contrast
"Newton's laws" for operating systems view their activity in terms of
the dual notions of event occurrences and state trajectories, which is
as much the basis of parallel algorithms as of operating systems and
can shed much light on both.

>For example, in the theoretical computer science community, most
>parallel algorithms are designed not with concurrent process algebras,
>pomsets, or any of that.  Rather, the model of computation is SPMD
>shared memory.  In other words, the emphasis is on operating on
>parallel data structures, not on coordinating concurrent processes.

These are only distinguishable if you insist on too narrow a definition
of "coordinating concurrent processes."  Presumably you are thinking in
terms of coordination by channels, semaphores, monitors, etc., which
certainly do not feature prominently among the coordination primitives
used by many parallel algorithms.

A major shortcoming of parallel programming today is a lack of good
primitives to serve as a basis for structured parallel programming.  A
decent process algebra will supply such primitives.  These will be of
at least as much help in writing, understanding, and verifying SIMD
programs as if-then-else's and while-do's are for sequential programs.
Then you will find people starting to design their parallel algorithms
with concurrent process algebras (for which pomsets are just one of
several plausible models) instead of with sequential control
structures.  Although a good deal of progress in that direction has
been made during the 1980's, we still have a long way to go before that
vision is fully realized.

>The reason for this is simple, the amount of parallelism that you can
>obtain scales naturally with the size of the problem.  So, the most
>natural way to get a lot of parallelism is to write one program, and
>replicate it.

That approach constitutes the assembly language of parallel
programming: Turing powerful but devoid of structure.  In mathematics
you could argue with equal logic that the natural way to build a vector
space is start with one point and replicate it.  That gets you as far
as the underlying set of the space, and indeed to many people a
geometric space *is* the set of its points.  But that doesn't do
justice to vector spaces.  Set theory is only the assembly language of
mathematics.  A vector space adds structure to that set,
differentiating it from other set-based structures such as lattices and
graphs and grammars.  Types are more than just sets, and parallel
programs are more than just replicated programs: they have useful
structure.  If you don't see that structure you don't understand your
program efficiently.

Structured programming was a hard sell in 1970, and it doesn't seem to
have gotten much easier 20 years later.  When it's available and
working smoothly many people buy it and use it, yet do not acknowledge
it.  And when it's not available they insist they're doing just fine
without it.  Some acknowledge it, but it is depressing that so many do
not.

In between are those that acknowledge the past but not the future:
they like what's happened so far but refuse to believe that anything
more can happen.  They believe that all the good inventions in
structured programming have already been invented and it is now a
closed subject.

There is some structure available now for structured parallel
programming, but nowhere near enough in my opinion.  A whole lot more
work is needed before the tools are in place to create really
insightful representations of concurrent programs.  Tools based on the
view that a concurrent program is a collection of communicating
sequential processes, or that it is the sum of its execution sequences,
are not universal enough to be useful to people working in parallel
programming.  We need much better views in order to understand our
parallel programs insightfully.  Better views of concurrent computation
are among the most important goals of modern concurrency theory
research.

>Is it really realistic to expect that people could
>program a 2K MIMD machine susing two thousand separate programs?  I
>don't think so.  

This has nothing to do with concurrency theory, which neither limits
itself to MIMD nor insists that the n programs running on an
n-processor MIMD machine all be distinct.

>Thus, the main problem in parallel computing from my point of view
>(programmer and algorithm designer) is how to manage the placement and
>movement of *data* in the machine.  So, I wonder what all the theories
>about concurrency have to do with this.

They have everything to do with it.  They tell you how to think about
it happening in parallel.  If you have a better theory of how data can
move and interact concurrently than the present theories of concurrency
then you're a contributor to this field whether you admit it or not.
If you don't have a better theory then one must infer from your
objections (to the work of those who do) that you don't believe the
parallel programming world needs a theory of data movement and
interaction.

	Vaughan Pratt

-- 
=========================== MODERATOR ==============================
Steve Stevenson                            {steve,fpst}@hubcap.clemson.edu
Department of Computer Science,            comp.parallel
Clemson University, Clemson, SC 29634-1906 (803)656-5880.mabell