Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cmcl2!lanl!jlg From: jlg@lanl.gov (Jim Giles) Newsgroups: comp.lang.fortran Subject: Re: Fortran vs. C for numerical work (SUMMARY) Message-ID: <7216@lanl.gov> Date: 28 Nov 90 20:25:55 GMT References: Organization: Los Alamos Natl Lab, Los Alamos, N.M. Lines: 53 From article , by burley@pogo.ai.mit.edu (Craig Burley): < [...] < Hmm, could you explain to me how to strength-reduce multiplication to addition < for random references to array elements within loops, as in < < SUBROUTINE FOO(A) < REAL A(37,103) < INTEGER I,J < DO I,1=100 < DO J,1=100 < A(SOMEFUNC(I),OTHERFUNC(J)) = 0. < END DO < END DO < END < < where SOMEFUNC and OTHERFUNC may be external functions or statement functions < with expressions sufficient to ensure that there is no linearity between their < inputs (I and J) and their outputs (used as array indices)? I don't understand what relevance this example has to the topic under discussion. The fact that this can't be strength reduced is irrelevant to the question of whether (so-called) flat arrays are more efficient than multiple indirections (pointer-to-pointer arrays). In this example, the flat array requires (in addition to the function references) a multiply and two adds in the inner loop (maybe another add as well, if the array is not zero based and not static). The pointer-to-pointer array version would require two adds and an extra memory reference - memory is almost always slower than int multiplies. In the most common case (where the indices are _not_ sent to functions), the strength reduced reference of a flat array costs a single increment instruction in the inner loop; the pointer-to-pointer implementation requires the same add (to get to the next element in the array) AND it _may_ have to do an additional memory reference and another add (if the inner loop is running 'against the grain' of the array - with the outer pointer striding fastest: the flat array works just as efficiently regardless of the order of the loops). Finally, if the array has more than two dimensions, the number of additional memory references in the inner loop _may_ go up with the number of dimensions. The flat array either gets strength reduced to a single stride or (in the case you give) has as many extra mults as the pointer-to-pointer version has extra memory traffic. So, in the _usual_ case (where the array is indexed directly by the loop counters), pointer-to-pointer just breaks even IF the loops are nested in the same order as the array - otherwise you always lose. In your _unusual_ case, the pointer-to-pointer version is still slower except on those rare machines where memory references are cheaper than mults. And this still ignores the extra memory required by the arrays of pointers! J. Giles