Path: utzoo!attcan!uunet!stealth.acf.nyu.edu!brnstnd From: brnstnd@stealth.acf.nyu.edu Newsgroups: comp.lang.misc Subject: Re: Common subexpression optimization Message-ID: <838.18:06:33@stealth.acf.nyu.edu> Date: 8 Feb 90 00:06:34 GMT References: <4561@scolex.sco.COM> <14214@lambda.UUCP> <2217@sunset.MATH.UCLA.EDU> Reply-To: brnstnd@stealth.acf.nyu.edu (Dan Bernstein) Organization: IR Lines: 58 Piercarlo is entirely correct. If you care about ``information hiding'' and the ``purity'' of your code more than efficiency, you can't expect fast code. If you can't bear to put your original code in comments and write an efficient version with temporary variables, you don't deserve fast code. In article <2217@sunset.MATH.UCLA.EDU> pmontgom@math.ucla.edu (Peter Montgomery) writes: > In my programs, common subexpressions often appear > as a result of macro expansions. Others are not visible at the > source level, but arise when producing the object code. When efficiency is important, write out the expressions. If the efficient version is different on different machines, write it out separately for each. There is a limit to what the compiler can optimize; a competent programmer assumes merely that the compiler understands scheduling and register allocation, and does the rest of the job by hand. > There was recently a discussion in comp.arch about the lack of > an integer multiply instruction on the SPARC, and how to circumvent it. > In one of my programs (for multiple precision arithmetic), the most > time-critical inner loop requires four sums of products (all variables integer) [ the expressions he wants, and how he implements them on the SPARC with macros referencing a table of triangular numbers ] Actually, the code would be faster if you used squaring rather than triangular numbers. Do you expect the compiler to figure this out, or should the programmer perform this optimization? Using squaring as the foundation for multiprecision multiplication (i.e., to multiply arrays a and b, square a+b, square a-b, subtract, and divide by 4) saves a lot of space (and code). Do you expect the compiler to figure this out, or should the programmer perform this optimization? > it would defeat information hiding if I expanded everything > there and removed the common subexpressions explicitly. Tough. Efficiency demands code space. You can leave the ``pure'' expressions in the code, commented out by #ifdef SLOW_N_PORTABLE. Supposedly optimizing compilers don't remove the burden of thought from programmers. [ more optimizations, based on loop invariants ] > I cannot make these optimizations at the source level. Again, you can and should perform this optimization by hand, sacrificing your macros. Feel free to leave the original code there, but don't pretend that the compiler can figure things out better than you can. > Even on systems using the first macro expansion > (in terms of multiplications), there may be hidden common subexpressions. So don't trust Cray's compiler to figure out that it should store the fp variables. How can the compiler tell the difference between a bottleneck and a loop that's executed only twice? How can the compiler know whether variable a is used more than variable b in a function? How can the compiler know whether you prefer code space, compiling effort, memory, or time? ---Dan