Xref: utzoo comp.lang.c:34641 comp.lang.fortran:4362 Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!think.com!mintaka!ai-lab!ai.mit.edu!tmb From: tmb@ai.mit.edu (Thomas M. Breuel) Newsgroups: comp.lang.c,comp.lang.fortran Subject: Re: alternatives to "noalias"? Message-ID: <12337@life.ai.mit.edu> Date: 11 Dec 90 01:05:32 GMT References: <1990Dec9.003028.13424@zoo.toronto.edu> <14065@june.cs.washington.edu> <12873:Dec1021:09:2890@kramden.acf.nyu.edu> Sender: news@ai.mit.edu Reply-To: tmb@ai.mit.edu Organization: MIT Artificial Intelligence Lab Lines: 50 In article <12873:Dec1021:09:2890@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: |> In article <14065@june.cs.washington.edu> david@june.cs.washington.edu (David Callahan) writes: |> > I'm not entirely sure what you mean by ``sequential dependencies'' |> |> That's exactly the problem I've been wrestling with. Wtf is a sequential |> dependency? Does it mean that the result of the loop is changed if the |> index goes backwards or in a random order? (This isn't good enough.) |> Does it mean that no memory location accessed during the loop is written |> during the loop? (This may be too restrictive, and it may not be good |> enough.) What does it mean? There are several useful abstractions: * all expressions inside the loop body are evaluated in parallel (in particular, assignments). If multiple assignments to the same location occur, the effect of the loop is undefined (useful for SIMD machines; "parallel model"). * if the effect of a loop depends in any way on the order in which the loop index assumes its values, the effect is undefined (i.e., need not even correspond to any order in which the index could assume its values). This seems to be a more restrictive version of the PARALLEL_DO loop ("random model"), and is useful as a least common denominator between vectorizing and parallel machines. * the loop index assumes its value sequentially, but the compiler is permitted to delay assignment to any location that is not explicitly declared "register" arbitrarily. This is a useful abstraction for vectorizing machines ("vector model"). (In C, register declarations do not force the compiler to put variables into a register; they do ensure that the location can never be aliased. The messiness of the "vector model" is simply a result of the fact that vectorizing computation itself is conceptually much messier than SIMD parallelism, since it allows for some sequential dependencies). |> If the programmer writes |> |> forall(i = 0;i < 100;i++) x[i] = y[i]; |> |> then the compiler had better be able to conclude that x and y don't |> overlap. No, they may overlap. In fact, if x and y point to the same location, this loop is correct under any of the three looping constructs. If x < y, then the loop is correct under the vector model. And, the loop is correct for any aliasing under the parallel model. |> That's what we want out of the ``sequential dependency'' |> definition. But how can we achieve it?