Xref: utzoo comp.lang.c:34641 comp.lang.fortran:4362
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!think.com!mintaka!ai-lab!ai.mit.edu!tmb
From: tmb@ai.mit.edu (Thomas M. Breuel)
Newsgroups: comp.lang.c,comp.lang.fortran
Subject: Re: alternatives to "noalias"?
Message-ID: <12337@life.ai.mit.edu>
Date: 11 Dec 90 01:05:32 GMT
References: <1990Dec9.003028.13424@zoo.toronto.edu> <TMB.90Dec10035836@bambleweenie57.ai.mit.edu> <14065@june.cs.washington.edu> <12873:Dec1021:09:2890@kramden.acf.nyu.edu>
Sender: news@ai.mit.edu
Reply-To: tmb@ai.mit.edu
Organization: MIT Artificial Intelligence Lab
Lines: 50

In article <12873:Dec1021:09:2890@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
|> In article <14065@june.cs.washington.edu> david@june.cs.washington.edu (David Callahan) writes:
|> > I'm not entirely sure what you mean by ``sequential dependencies''
|> 
|> That's exactly the problem I've been wrestling with. Wtf is a sequential
|> dependency? Does it mean that the result of the loop is changed if the
|> index goes backwards or in a random order? (This isn't good enough.)
|> Does it mean that no memory location accessed during the loop is written
|> during the loop? (This may be too restrictive, and it may not be good
|> enough.) What does it mean?

There are several useful abstractions:

* all expressions inside the loop body are evaluated in parallel
  (in particular, assignments). If multiple assignments
  to the same location occur, the effect of the loop is undefined 
  (useful for SIMD machines; "parallel model").

* if the effect of a loop depends in any way on the order in which
  the loop index assumes its values, the effect is undefined (i.e.,
  need not even correspond to any order in which the index could assume
  its values). This seems to be a more restrictive version of
  the PARALLEL_DO loop ("random model"), and is useful as a least
  common denominator between vectorizing and parallel machines.

* the loop index assumes its value sequentially, but the compiler
  is permitted to delay assignment to any location that is not
  explicitly declared "register" arbitrarily. This is a useful
  abstraction for vectorizing machines ("vector model"). (In C,
  register declarations do not force the compiler to put variables
  into a register; they do ensure that the location can never
  be aliased. The messiness of the "vector model" is simply a
  result of the fact that vectorizing computation itself is conceptually
  much messier than SIMD parallelism, since it allows for some
  sequential dependencies).

|> If the programmer writes
|> 
|>   forall(i = 0;i < 100;i++) x[i] = y[i];
|> 
|> then the compiler had better be able to conclude that x and y don't
|> overlap. 

No, they may overlap. In fact, if x and y point to the same location,
this loop is correct under any of the three looping constructs. If
x < y, then the loop is correct under the vector model. And, the
loop is correct for any aliasing under the parallel model.

|> That's what we want out of the ``sequential dependency''
|> definition. But how can we achieve it?