Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!rice!ariel.rice.edu!preston
From: preston@ariel.rice.edu (Preston Briggs)
Newsgroups: comp.benchmarks
Subject: Re: benchmarks and interprocedural optimization
Message-ID: <1991Mar30.174154.7169@rice.edu>
Date: 30 Mar 91 17:41:54 GMT
References: <1991Mar21.184428.8226@convex.com> <1991Mar26.175137.6706@digi.lonestar.org>
Sender: news@rice.edu (News)
Organization: Rice University, Houston
Lines: 53

metzger@convex.com (Robert Metzger) writes:

>>I will give you one data point.  One of the larger applications we have
>>compiled had 214,500 lines of FORTRAN source in 971 source files.  It took
>>a total of 5 hours and 26 minutes wall clock time to compile and link on

and blee@digi.lonestar.org (Benjamin W. Lee) writes:
>I am very much interested on which portion of the optimization is most
>time consuming? The data-gathering stage I suppose? Any good paper available
>on interprocedural optimization? Thanks!

I don't know details about Convex's compiler, so I can't comment directly.
However, this interprocedural optimization stuff need not be as expensive as
commonly imagined.  The approach developed at Rice (and used at Convex)
goes approximately like this

   1)	Collect summary information for every procedure
	(at Rice, we do it while editing, but there are other possibilities)

   2)	Build the call graph and
	calculate interprocedural data flow information.
	Determine which procedures must be recompiled (step 3).
	This step examines only the summary information,
	not the source code.

   3)	Compile individual procedures, supplying interprocedural
	information to the optimizer.

This decomposition enables practical program development and maintanence.
If you change a procedure or two, step 1 is run only for the changed
procedures.  Then step 2 is run for all the procedures in the program.
Finally, step 3 runs for the changed procedures, plus any procedures
that depend on changed interprocedural facts.

Part 1 occurs during editing and is very cheap.  The summary information
could be collected in a seperate pass; in this case, the expense would
be similar to a very fast 1 pass compiler (mostly syntax analysis).

Step 2 is actually similar to a super-Make.  The cost depends heavily
on implementation decisions.  The assymptotic complexity of the analysis,
however, is very low.

Step 3 closely resembles a traditional optimizing compiler.
The bulk of the overall compilation time should be consumed here.

There are many papers by researchers at Rice describing
all this in detail.  A good starting point might be

	A practical environment for scientific programming
	Carle, Cooper, Hood, Kennedy, Torczon, Warren
	IEEE Computer, 20(11), November 1987.

Preston Briggs