Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!rutgers!cmcl2!lanl!lambda!jlg From: jlg@lambda.UUCP (Jim Giles) Newsgroups: comp.lang.misc Subject: Re: Common subexpression optimization Message-ID: <14225@lambda.UUCP> Date: 2 Feb 90 02:08:23 GMT References: Lines: 97 From article , by pcg@rupert.cs.aber.ac.uk (Piercarlo Grandi): > [...] > Consider the usual fragment for matrix product (dimensions are > [m,o] [o,n]), written in some pseudo language: > > for i in 1..m > for j in 1..n > c[i,j] = 0 > for k in 1..o > c[i,j] = c[i,j] + a[i,k]*b[k,j] > > Of course there are ample opportunities for common subexpression > elimination here, and other hairy optimizations. What about instead: > > for i in 1..m > let s[] alias c[i,], > fast ronly h[] alias a[i,] > for j in 1..n > let fast e alias s[j] initially 0, > fast ronly v[] alias b[,j] > for fast k in 1..o > e += h[k]*v[k] > > (which can be readily written in Algol 68, which was *carefully* > designed for easy, efficient code generation) which is vastly > more descriptive as well, [...] Yes, it is more descriptive. But, it is descriptive of all the wrong things. Here, you are describing things for optimization purposes which the programmer (with today's state-of-the-art) RIGHTLY assumes the compiler sould be able to find for itself. In fact, the only reason that programmers accept the above Fortran-ish loop notation is that (up to now) the compiler technology hasn't been capable of doing better. The mathematical notation for this is simply: C = A B Now, you can convince a user that this is ambiguous: is it element-wise array multiplication or matrix multiplication? Well, this requires a rewrite. The experienced mathematician will then suggest: n o n C = A B m m o This uses a convention that upper- and lower-indices that match on the right are summed (sometimes called the 'Einstein summation convention'). Well, now you might conmvince the programmer that the above notation is ill suited to keyboard/terminal usage since these devices are (so far) not well suited to the two-dimensional layout given. So, you might end up with: C(m,n) = A(m,o) * B(o,n) This is fine, except that now it matches the notation for a simple scalar multiply using the current values of m,n, and o. There needs to be some way of explicitly representing the fact that m, n, and o are iteration variables and not simple subscripts. Hence, the loop notation that began this message. But, the mathematical programmer has legitimate reason to expect that, as compiler/language technology improves, the notation will converge back to the original mathematical notation. So, by today's state-of-the-art, you might expect a language to accept AND OPTIMIZE the following: C(#m,#n) = A(#m,%o) * B(%o,#n) Where the sharp (#) indices are iterated over and the percent (%) indices are summed over. This may be the best we can expect until keyboards and fonts improve (which is already happening). Still, the correct direction for programming languages to move is toward notation for _people_ to use, not toward notation to make (already standard) optimizations easier. > [...] > 2) Programming is not the same as mathematics. You often have to > change *radically* technique when writing a program. [...] Quite true. But, the key phrase in what you've said is "have to". You shouldn't change your notation (radically or not) unless you have to. Even then you should change as little as you have to. As time goes on, it is legitimate to expect that these occasions when I have to change notation will become less frequent. It is toward that goal that language design should be turned. > [... "register" attribute in C "a big win" ...] The only difference between a register variable and any other local variable in C is that you cant use the address-of (&) operator on register variables. In languages which don't have pointers (like Fortran 77) this is a moot distinction. In languages in which pointers can only point to dynamic memory and are the only way to do so (like Pascal and Modula2) this is also an irrelevant distinction. In fact, the only use of the "register" attribute is to be able to declare variables that can't be aliased - an irrelevant distinction in MOST languages where variables have that property by default. J. Giles