Path: utzoo!utgpu!cunews!cognos!jimp From: jimp@cognos.UUCP (Jim Patterson) Newsgroups: comp.arch Subject: Re: Optimising C compiler question Message-ID: <9537@cognos.UUCP> Date: 16 Apr 91 01:08:20 GMT References: <1991Apr8.193155.3911@vax5.cit.cornell.edu> <1991Apr11.003431.24918@alzabo.ocunix.on.ca> <9526@cognos.UUCP> <1991Apr15.134417.24380@bigsur.uucp> Reply-To: jimp@cognos.UUCP (Jim Patterson) Distribution: comp Organization: Cognos Inc., Ottawa, Canada Lines: 79 In article <1991Apr15.134417.24380@bigsur.uucp> coopteam@bcarh680.bnr.ca (Jean Laliberte) writes: >In article > <9526@cognos.UUCP> > jimp@cognos.UUCP (Jim Patterson) >writes: >>I don't think a compiler can ever know absolutely that a given variable >>should be in a register whereas another should not, for "performance". >> >>[description deleted] >>[example deleted] >> >Is there some reason register allocation couldn't also be dynamic? In >this example, i would be stored in the register for the most part, >but when the if() statement evaluates to true, 'store i' and 'load j' >instructions would be executed, and j would then be the register >variable (when the if() condition is over, i would be put back in the >register). There's no reason but it's obviously a trade-off as well. If you're loading it for a single reference, it might be more efficient to reference it on the stack. That is, there is a cost to loading and storing registers which needs to be factored into the overal cost computation. You also have to consider code which needs to reference both variables e.g. the test "j < i" in the example I gave. Actually, the point is really that the compiler need not put j into a register at all since it's in code that won't be executed most times the program is run. I agree, it's an over-simplification to assume that only one of the two variables can go into a register. Consider instead a situation where the "for-i" block either does it's "free-memory" loop using j, or another piece of logic using a variable "k" which is kept across iterations. for (i=0, k=-1; i<10000; ++i) { e=malloc(sizeof(element)); if (!e) { for (j=0; j<10000; ++j) if (j < i) free(table[j]); getout(); /* Leave routine now */ } else { if (k >= 0) table[k]->link = e; ++k; } table[i] = e; } table[k]->link = 0; If the compiler considers both paths equally likely it may continue to assign the single register to j, or could dump and load j and k into the single register as required. However, we have information the compiler doesn't, to wit that the "j" path will not normally be executed, so we can see that allocating a register to j but not k, or spending any additional time in the "k" code segment to optimize the "j" code path, will not benefit performance. The best-performing compiler will put k into a register (if it has two), and leave j as a memory variable. I know that many compiler optimizers are quite sophisticated, and likely do a better job than a naive programmer putting "register" on a few declarations. In fact even an experienced programmer would not likely find a register allocation strategy that would do better than a good code optimizer over a range of architectures. My point is simply that due to lack of information, no optimizer can provide an optimal register allocation for best performance (i.e. speed) of a specific application and specific data inputs. The example cites a situation where a "statistical" guess as to the code's behaviour is likely to lead to wrong decisions by the optimizer. -- Jim Patterson Cognos Incorporated UUCP:uunet!mitel!cunews!cognos!jimp P.O. BOX 9707 PHONE:(613)738-1440 x6112 3755 Riverside Drive Ottawa, Ont K1G 3Z4