Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!mcsun!sunic!kullmar!pkmab!ske From: ske@pkmab.se (Kristoffer Eriksson) Newsgroups: comp.lang.misc Subject: Re: Common subexpression optimization Message-ID: <2843@pkmab.se> Date: 9 Feb 90 21:26:46 GMT References: <4561@scolex.sco.COM> <14214@lambda.UUCP> <2217@sunset.MATH.UCLA.EDU> <4466@brazos.Rice.edu> <4616@brazos.Rice.edu> Organization: Peridot Konsult i Mellansverige AB, Oerebro, Sweden Lines: 107 In article <4616@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes: >I'm still hoping to see somebody recode matrix multiply in C, using >all the source hacks they want, so we can compare it to what we get >using Fortran and our (not mine, friends') latest transformations. > >We'd run it on a MIPS M/120, with 32 integer registers and 16 fp regs, >The Fortran source is > > subroutine mm(A, B, C, n) > do 1 i = 1, n > do 1 j = 1, n > A(i, j) = 0.0 > do 1 k = 1, n >1 A(i, j) = A(i, j) + B(i, k) * C(k, j) > end I'm not a Fortran programmer, so I am not sure exactly how this routine works. I guess that A, B, and C are arrays and n is the maximum index in these arrays that is to participate in the multiply, and that the array bounds are automatically passed to routine, without explicit mention. That is not "natural" in C, thus we have the problem of what to do instead... Most C compilers don't accept variable size array arguments. The simplest way to code exactly the same functionality without that facility, I think, is to use a macro. The compiler will have the information about the array bounds available for the arrays the macro is used on. Each of the three arrays can have a different size, and they need coincide with n. Maybe you never would use it that way, but it is the same as what I suppose the Fortran routine does. ----cut---- #define mm(A, B, C, n) \ { \ register unsigned idxsize k, j, i; \ register float acc; \ \ for(i = 0; i < n; ++i) { \ for(j = 0; j < n; ++j) { \ acc = 0.0; \ for(k = 0; k < n; ++k) \ acc += B[i][k] * C[k][j]; \ A[i][j] = acc; \ } \ } \ } /* Simple test case */ typedef short idxsize; float a[22][16], b[17][32], c[45][16]; main() { mm(a,b,c,16); } ----cut---- I don't bother using pointers (or reversing the looping direction). I don't see any reason why the compiler should not have the possibility of optimizing the array indexing on it's own. Probably many compilers don't do that, but I think it should in principle be possible for the C-compiler to optimize this routine in the same way as the Fortran compiler does. (This being a macro, the C-compiler will even generate in-line code and use constant array sizes, even without global code analyses.) I don't know about MIPS' compilers. If I instead decide that n gives the array size (maximum indexes) for all arrays, I can write a simple function for it: ----cut---- /* Simple test case */ typedef short idxsize; float a[16][16], b[16][16], c[16][16]; main() { mm(a,b,c,16); } /* Matrix multiply. */ mm(A, B, C, n) register n; register float *B, *C; float *A; { register unsigned idxsize k, j, i; register float acc; for(i = 0; i < n; ++i) { for(j = 0; j < n; ++j) { acc = 0.0; for(k = 0; k < n; ++k) acc += B[i * n + k] * C[k * n + j]; A[i * n + j] = acc; } } } ----cut---- I could write a function that takes differing array sizes too. It isn't very hard, but there are many ways to do it to choose among. If nothing else works, it should be possible to explicitly code most of the things the Fortran compiler might do automatically. That's the character of C, for good or worse. -- Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden Phone: +46 19-13 03 60 ! e-mail: ske@pkmab.se Fax: +46 19-11 51 03 ! or ...!{uunet,mcvax}!sunic.sunet.se!kullmar!pkmab!ske