Path: utzoo!attcan!uunet!mcvax!ukc!dcl-cs!aber-cs!pcg
From: pcg@aber-cs.UUCP (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Dynamic frequency analysis (was Register usage)
Summary: A good heuristic for estimating frequency of use is a Good Thing.
Message-ID: <966@aber-cs.UUCP>
Date: 21 May 89 11:35:48 GMT
Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Distribution: eunet,world
Organization: Dept of CS, UCW Aberystwyth
	(Disclaimer: my statements are purely personal)
Lines: 66

In article <664@cbnewsh.ATT.COM> beyer@cbnewsh.ATT.COM (jean-david.beyer) writes:

    One of our optimizers allocates registers by estimating useage. To handle
    loops, it assums each loop is executed a small number of times. This does
    much better register allocation than I do manually.

If you say so.... :-) :-) (Sorry, but I could not resist -- I am actually
prepared to believe that it could even beat me...).

    It does not use profile data. 
    
    I once modified the optimizer to use the true count, when it could be
    determined (many loops are done a constant number of times that can
    be determined at optimization time, especially in benchmarks :-)   )
    This did not improve the execution time of any of the benchmarks tested,
    beyond what assuming they were executed a small number of times.

I am prepared to believe you, and you have made a good contribution to
optimizer technology (even if, let me dimly remember, Gries in the ancient
"compiler construction for digital computers", noted that the legendary
Fortran H did weigh more heavily variables appearing in loops).

My interpretation: the Wall paper essentially says that while coloring may
be good (as per a very interesting letter I received, especially if a
different DAG traversal order is used) at minimizing spills, it is no good
at intuiting which variables matter; indeed, let me add, it tends to keep a
variable in a register based on how many it appears statically in the text
of a program (spill minimization), which may or may not have any correlation
with how often it is used.

Typically the variables for which static vs. dynamic use is least correlated
are those in loops (which "ironically" are those that matter most). In other
words, what really matters is caching preferentially the variables in "most
frequently executed loops". A weaker strategy, that does not require dynamic
info, is to cache variables in "any loop"; an even weaker one is to cache
variables, period.

    I thought about using execution-time profiles to improve optimization,
    but that just opened up the can of worms of determining what "typical
    data" to use for the profiling run. At least, MIPS do this.

What you are telling us is that simply caching variables in "any loop" is a
big win, and (almost?) as good as selecting those in the critical ones; this
means that caching variables in the less critical loops is not as big a loss
as caching based on static analysis. This is not surprising, in hindsight;
there may be at least two reasons:

	[1] If loops are non interfering in a dataflow sense, we can reuse
	    the same registers for them, and not waste any. I suspect this
	    is quite common.

	[2] If a loop is executed even a few times, caching its variables
	    is a win nonetheless.

In other words, "used in a loop" is a good estimator of dynamic frequency
of use, and spares the optimizer from having to cache everything in sight
to get a good chance of caching all the variables that matter.

Well, I may be sentimental, but I still prefer "register" and my own
judgment. On the other hand, of course, "performance-by-design" is a win
only if it is done. If not, an optimizer is just a fixup, but may be useful
("read carefully the instructions...").
-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk