Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!samsung!usc!snorkelwacker!spdcc!ima!esegue!compilers-sender
From: preston@titan.rice.edu (Preston Briggs)
Newsgroups: comp.compilers
Subject: Re: Intermediate Representation
Keywords: parse, design, optimize
Message-ID: <1990Aug12.211939.1128@rice.edu>
Date: 13 Aug 90 01:56:58 GMT
References: <1990Aug09.180627.18848@esegue.segue.boston.ma.us> <1990Aug9.225859.10175@rice.edu> <1990Aug12.183239.11205@esegue.segue.boston.ma.us>
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: preston@titan.rice.edu (Preston Briggs)
Organization: Rice University, Houston
Lines: 119
Approved: compilers@esegue.segue.boston.ma.us

I wrote:
>A more general algorithm (say Cocke and Kennedy) would perform strength 
>reduction over any loop (while loops, loops built out of if's and goto's), 

Chris Smith replied:
>These algorithms *are* tree based, if not AST based

I disagree.  They're based on directed graphs.

> -- the first step is to do some control flow analysis

over the control flow graph, right?

> to write the program as a sequence of properly nested loops.  You can do
>the same optimizations on the loops no matter how you discover them.

Remember, I was arguing about Ottenstein's paper, which had an explicit
DO-loop header node, with subnodes that distinguished the induction variable,
loop body, and termination conditions.  My point was that such an
approach, while simple, is not general.

A sub-point is that the loops are discovered in a an explicit pass over 
the control flow graph (which is built in an explicit pass over the IL), 
not while parsing.  AST's are built while parsing, though they aren't
the same as parse trees.  That's why there's an "S" for Syntax in AST.

Parsing is local.  It's hard to discover global facts (like the connection of
goto's and labels) while parsing.

>Gotos that don't jump into statements from outside are pretty easy too

I don't agree that even the "simple" goto's are easy.  Break's and
Continue's are easy since they're bound to enclosing loop structures.
Imagine you're parsing along (we'll say Pascal), building your AST, and you
see a goto...

	while i < limit do begin
	    i := i + 1;
	    x := a[i];
	    if x = 0 then goto 10;
	    ...
	end;

what do you do?  Is it a tiny local branch that describes a little
if-then-else?  Is it a giant break statement (perhaps out of several levels
of loop)?  Is it a backwards branch describing a loop?  Is it a forwards
branch describing part of a loop?

And what are you going to do about Fortran?

>It runs in linear time, and it does each thing that needs doing (and/or
>bit vectors, scan through them) exactly once.  Probably faster.
						^ eh?

If you've got a language that restricts you to reducible routines (not a bad
thing!), then you can do data flow analysis in a linear number of bit-vector
steps.

This is indeed quick (and it's also simple), but it isn't general since
you can't handle irreducible routines.  You could do node splitting,
but that's potentially exponentional.

>You have a hold on both ends of an interesting region.  Say, in
>
>	if ... then
>	   ... a + b ...
>	end if
>	... a + b ...
>
>where the if doesn't modify a+b and you're allowed to raise exceptions
>prematurely.  You can insert a+b before the if.  Partial redundancy
>elimination won't.  (It would insert a+b into an else branch, but there
>isn't one.)  

Use one of the edge placement algorithms,
such as

	A Fast Algorithm for Code Movement Optimisation
	Dhamdhere
	Siplan Notices, Volume 23, Number 10

or

	A Solution to a Problem with Morel and Renvoise's
	`Global Optimization by Suppression of Partial Redundancy'
	Dreschler and Stadel
	TOPLAS, Volume 10, Number 4 (1988)

>Iterative algorithms have info about one end of a region
>whose other end is implicitly the procedure entry or exit, or sometimes
>the enclosing loop entry or exit.  Tree-based algorithms extend this.
>
>In
>	loop
>	 if ... a + b ... end if ...
>	end loop
>
>If a+b is loop invariant and doesn't raise exceptions or you don't care,
>you can hoist it.  You might not want to if it's under switch rather than
>if.  With trees, you can.

I'm a little baffled here.  You seem to be arguing iterative analysis vs.
some sort of interval-based analysis.  Maybe that's what you intended and
I've misunderstood the point of your article.  We could certainly use one
form of interval analysis or another if desired, though there is always the
ugliness of irreducible flow graphs.  Node splitting is a solution, but it's
a lot of code for me to write when I no a simpler way.

At any rate, partial redundancy elimination wouldn't hoist a+b in the
example shown since it isn't always executed (the IF condition may never be
true).  It's a conservative design decision, but doesn't reflect on the rest
of the discussion that I can see.

-- 
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus| world}!esegue.  Meta-mail to compilers-request@esegue.