Xref: utzoo comp.lang.c:34658 comp.lang.fortran:4364 Path: utzoo!attcan!uunet!zaphod.mps.ohio-state.edu!van-bc!ubc-cs!uw-beaver!uw-june!david From: david@cs.washington.edu (David Callahan) Newsgroups: comp.lang.c,comp.lang.fortran Subject: Re: alternatives to "noalias"? Message-ID: <14065@june.cs.washington.edu> Date: 10 Dec 90 16:29:29 GMT References: <221@validgh.com> <1990Dec9.003028.13424@zoo.toronto.edu> Reply-To: david@june.cs.washington.edu (David Callahan) Organization: Tera Computer Co., Seattle WA Lines: 46 In article tmb@bambleweenie57.ai.mit.edu (Thomas M. Breuel) writes: >Many optimizations that are possible because the compiler can assume >that two different names don't refer to the same location can be >expressed portably introducing explicit temporaries. ... >What I would like to ask: ignoring programming style, can you think of >any optimizations that a FORTRAN compiler can carry out that cannot be >expressed portably using temporaries and one or two compiler >directives declaring absence of sequential dependencies in a loop >body? I'm not entirely sure what you mean by ``sequential dependencies'' but I would note that restructuring compilers are not interested only in the absence of loop carried dependences (which inhibit simple conversion of a loop to to parallel form) but how the dependences which do exist can be used. Consider the loop: do i = 1,n do j = 1,m f(i,j) = f(i,j) + a(i,j)*f(i-1,j) + b(i,j)*f(i-2,j) a simple second order recurence which has been vectorized in one dimension. The output I would hope for from the current generation of research restructurers depends on the target machine. For a non-vector machine, something like: do all j = 1,m f2 = f(i-2,j) f1 = f(i-1,j) do i = 1,n f0 = f(i,j) f(i,j) = f0 + a(i,j)*f1 + b(i,j)*f2 f2 = f1 ; f1 = f0 where the assignments will later be eliminated by unrolling and subsumption. On a vector machine with registers of length L, the j loop is blocked and the f0,f1,f2 become vector temps. In addition to the parallelism, the loop now has 1/3 fewer memory operations. In this example, the critical information is that there is no alias between f and any other variable, not that the j loop carries no dependences. -- David Callahan (david@tera.com, david@june.cs.washington.edu,david@rice.edu) Tera Computer Co. 400 North 34th Street Seattle WA, 98103