Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!think!yale!cmcl2!lanl!lambda!jlg From: jlg@lambda.UUCP (Jim Giles) Newsgroups: comp.lang.misc Subject: Re: Arrays in languages (was: Anyone want to design a language?) Message-ID: <14256@lambda.UUCP> Date: 28 Feb 90 20:11:26 GMT References: <8849@boring.cwi.nl> Lines: 58 From article <8849@boring.cwi.nl>, by dik@cwi.nl (Dik T. Winter): > [...] > > ... > (About generating equivalent code.) > Again aliassing is a problem. But the code I presented has no problems > with aliasses unless used on multi-processor systems (only innerproducts > are used, so vector processors have no problems, unless they are not good > at that, and some are). Actually, not true. At least one form of optimization (loop unrolling) is inhibited by possible aliasing in your example code. Loop unrolling can be effective in speeding up code on purely scalar machines. A friend of mine has an algorithm that runs 15% faster on an IBM/PC when the loops are unrolled once. In your example, the inner loop can be unrolled, but not the outer one. In addition, the vectorization you claim possible won't happen since the operation in the inner loop is a reduction operation (it sums all the elements of a vector into a single scalar). Many vector machines don't have reduction operators in their vector instruction set. For those that do, the instruction is usually nearly as slow as the equivalent scalar loop. The most efficient vector form of your algorithm is to put the summation on the outer loop and vectorize that way. But this would be inhibited by aliasing in the C version. (I reproduce the loops here for anyone who is still trying to follow this argument.) for(i = l1m; i <= u1m; i++) { sum = 0.0; for(j = l2m; j <= u2m; j++) { sum += M(i,j) * V(j); } W(i) = sum; } The faster form is: for(i = l1m; i <= u1m; i++) W(i) = 0.0; for(j = l2m; j <= u2m; j++) { for(i = l1m; i <= u1m; i++) { W(i) += M(i,j) * V(j); } } Here, _both_ the add and the multiply vectorize. But, C can't vectorize this. > [... trying to detect aliasing at run-time ...] > [...] A simple case I encountered was in a long integer package. The > routine to subtract numbers 'SUB(A, B, C)' took two input arrays A and B > and one output array C. An attempt was made to detect whether A and B > where the same. The test failed on at least one system. If A and B are input only, then aliasing in Fortran _IS_ allowed since the subroutine doesn't do anything which would make aliasing cause problems. Aliasing is only a problem if one or the other of the aliased names is the target of an assignment. I can see why they wanted to detect the aliasing - to save time (return zero without actually having to do the subtract). However, failure to detect the aliased inputs will no change the _MEANING_ of the code, only its efficiency. J. Giles