Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!rice!titan!preston From: preston@titan.rice.edu (Preston Briggs) Newsgroups: comp.lang.misc Subject: Re: Common subexpression optimization Message-ID: <4466@brazos.Rice.edu> Date: 2 Feb 90 01:14:48 GMT References: <4561@scolex.sco.COM> <14214@lambda.UUCP> <2217@sunset.MATH.UCLA.EDU> Sender: root@rice.edu Organization: Rice University, Houston Lines: 90 In article pcg@rupert.cs.aber.ac.uk (Piercarlo Grandi) writes: many interesting things... He also gives some examples of how we might notate matrix multiplication so that a comparitively simple compiler can generate good code. > for i in 1..m > let s[] alias c[i,], > fast ronly h[] alias a[i,] > for j in 1..n > let fast e alias s[j] initially 0, > fast ronly v[] alias b[,j] > for fast k in 1..o > e += h[k]*v[k] > >(which can be readily written in Algol 68, which was *carefully* >designed for easy, efficient code generation) which is vastly >more descriptive as well, as it makes obvious we are using a >vector product on one hand and that we *know* that e,h,v,k are >the variables to cache, the others are irrelevant, and that the >compiler need not bother "optimize" ? My impression has been that the generality of Algol-68 prohibits efficient code. Nevertheless, the example is interesting. Note that we'd also like to see m*4 in a register (assuming row major array ordering and 4 bytes/word) and k itself eliminated. The point being that the "fast" hint has some perhaps unexpected side-effects. >Then the following version > > for s[] over c[*,], ronly h[] over a[*,] > for p over s[*], ronly v[] over b[,*] > p = h innerprod v > > (ronly h[]) innerprod (ronly v[]) is > let fast d initially 0 > for fast ronly h_i over h[*], fast ronly v_j over v[*] > d += h_i*v_j; > return d > >would be in many ways less procedural and highly informative, >and allow for a lot of safe optimization (coupled with some heuristic >on expected access frequencies), especially on vector machines. If you'd throw out the "ronly" and "fast" hints, I'd say these forms aren't too bad. Why do you need to say "ronly" when it's fairly obvious to the compiler that h and h_i are not assigned? These hints seem like non-abstract intrusions into the middle of otherwise fairly declarative code. Both these examples are written in a relatively high-level notation. My approach to compiling either notation would be to translate quickly into a fairly low-level form with very little analysis of the original code. To simplify my life, I'd make the translation as straightforward as possible, generating deliberately naive code and concentrating on producing a correct translation. Then I'd optimize, using a few powerful, general purpose optimizations, (i.e. partial redundancy elimination, strength reduction, dead code elimination, value numbering). Finally, I'd generate object code from the optimized intermediate code. I'd use a global register allocator to assign registers to the important variables and temporaries. Note the seperation of concerns. Enforcing the language in the front end, improvement (without changing the meaning of the program) in the middle, and concern for the machine details in the back end. So what happened to the "fast" hints (and register declarations in C)? They get thrown out. It's easier, more effective, and more complete to discover opportunities for optimization during the latter stages of compilation. But let's imagine a slick compiler that can make the code you'd like to see for either of these examples. There exist Fortran compilers that can produce faster code. Why? They can make fuller use of the machines resources and they're willing to generate the ugly code necessary. You mustn't misunderstand me and think that I code in Fortran; but I do recognize that Fortran compilers know a lot about DO-loops and arrays. If other languages had similar restrictions on their arrays and loop structures, their compiler could achieve similar levels of optimization. It's also possible that the notation outlined by PCG would support fairly deep analysis; I'm not sure from the examples. Preston Briggs preston@titan.rice.edu