Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!purdue!tut.cis.ohio-state.edu!cs.utexas.edu!rice!uw-beaver!Teknowledge.COM!unix!hplabs!hp-sdd!ncr-sd!ncrcae!hubcap!chkim%pollux.usc.edu
From: chkim%pollux.usc.edu@usc.edu (Chinhyun Kim)
Newsgroups: comp.parallel
Subject: Dynamic parallelism
Message-ID: <7021@hubcap.clemson.edu>
Date: 12 Nov 89 18:36:16 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 46
Approved: parallel@hubcap.clemson.edu


In numerical analysis, arrays being indexed by other arrays as shown below
are supposed to be very common in programs involving sparse matrices,
fluid flow simulations, etc.  If one wanted to parallelize this loop during
compilation, he/she would have to know the values of B(i) and C(i).  If they
are not known at compile time, it will be impossible to parallelize the
operations in the loop and thus must be executed in sequential manner.

There are, however, ways to parallelize the operations in the loop by checking
data dependency during run time and letting operations which are free of data
dependency to execute in parallel. This kind of parallelism which can be
determined only during run time is called, "dynamic parallelism".

A(i) = .....		; A(i)'s are initialized.
B(i) = ....		; B(i)'s are determined.
C(i) = ....		; C(i)'s are determined.

DO i = 1..N
    A(B(i)) = .....	        ; Array A is updated.
    T(i) = A(C(i)) + ..	        ; Array A is consumed for some operation.
ENDDO

I am currently working on ways to exploit dynamic parallelism as opposed to
static parallelism which the compiler assesses parallelism during compilation.
The model of computation is data-flow and the specific type of data-flow
computer model is tagged token data-flow computer.  I am using a mechanism
known as "token relabeling" to do this.

The results of my simulation so far is good.  However, I have some fundamental
questions to ask.

Q1. In exactly what type of numerical simulations do such loops appear and why
    are they necessary?

Q2. Are there other ways to do it without having the subscripts unknown at
    compile time, i.e. is exploitation of dynamic parallelism avoidable while
    still achieving parallel execution?

Q3. Is it possible for me to get the fragment of the simulation program which
    has such loop? i.e. assuming answer to the first part of question 1 is
    true.

Answers to these questions will be appreciated very much.

Chinhyun Kim
chkim@pollux.usc.edu