Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!purdue!mentor.cc.purdue.edu!pur-ee!hankd From: hankd@pur-ee.UUCP (Hank Dietz) Newsgroups: comp.arch Subject: Re: Dynamic frequency analysis (was Register usage) Summary: what static analysis looks like Keywords: compiler analysis, expected execution counts Message-ID: <11743@pur-ee.UUCP> Date: 24 May 89 15:23:49 GMT References: <966@aber-cs.UUCP> Reply-To: hankd@pur-ee.UUCP (Hank Dietz) Distribution: eunet,world Organization: Purdue University Engineering Computer Network Lines: 95 In article <966@aber-cs.UUCP> you write: >In other words, "used in a loop" is a good estimator of dynamic frequency >of use, and spares the optimizer from having to cache everything in sight >to get a good chance of caching all the variables that matter. "Used in a loop" is not the flip side to dynamic collection of stats; static analysis of expected execution count is. Where possible, the compiler should compute the exact number of iterations made by loops, etc. My PhD thesis, and other papers by other folk, show how to do this. It is only when the exact analysis is THEORETICALLY IMPOSSIBLE that an approximate scheme is used. For example: f() { {A} if (B) { {C} while (D) { {E} } if (F) { {G} } } else { {H} } {I} while (J) { {K} } {L} } In the above code, let's assume that no range information, etc., is available about the expressions B, D, F, and J. Then one might say that the expected execution counts are: code E(count) comments ---- -------- -------- {A} 1 actually 1 times E(function f) B 1 {C} 0.51 as per Multiflow, 0.51/0.49 then/else D 0.51*101 as per Multiflow, 100 iterations per while {E} 0.51*100 F 0.51 {G} 0.51*0.51 simple composition {H} 0.49 {I} 1 J 1*101 {K} 1*100 {L} 1 This generates the following partial ordering of regions based on expected execution count: J 101 {K} 100 D 51.51 {E} 51 {A},B,{I},{L} 1 {C},F 0.51 {H} 0.49 {G} 0.2601 Hence, despite having NO range analysis, it will be recognized that the while loop J,{K} is more important than the while loop D,{E}. Now, you might argue that you can easily create a program where this would be incorrect... but the point is that it is very rare that this analysis would mislead you AND getting a few dynamic statistics would correct the error. In other words, dynamic statistics are just about as unreliable as static guesses... it might be depressing, but that's how it is most of the time. Of course, the above expected counts can be further weighted by the expected spill costs, etc. In fact, a compiler doesn't allocate VARIABLES to registers -- it allocates VALUES to registers -- hence all these issues really are rather artificial. Interesting register allocation problems only arrise when the number of simultaneously live values which are good candidates for registers exceeds the number of registers available. This does happen, but not as often as you might think.... >Well, I may be sentimental, but I still prefer "register" and my own >judgment. On the other hand, of course, "performance-by-design" is a win >only if it is done. If not, an optimizer is just a fixup, but may be useful >("read carefully the instructions..."). Use register wherever possible -- it means "no address, no alias, and non-volatile." That can help the compiler understand your program better. As for you coding better than an optimizing compiler, well, congratulations, John Henry. ;-) -hankd@ee.ecn.purdue.edu