Path: utzoo!censor!geac!torsqnt!lethe!yunexus!ists!helios.physics.utoronto.ca!news-server.csri.toronto.edu!cs.utexas.edu!rice!news
From: preston@ariel.rice.edu (Preston Briggs)
Newsgroups: comp.arch
Subject: Re: What the compiler won't do you for you
Message-ID: <1991Feb28.181948.26958@rice.edu>
Date: 28 Feb 91 18:19:48 GMT
References: <8787@exodus.Eng.Sun.COM> <24280:Feb2802:55:4991@kramden.acf.nyu.edu> <8808@exodus.Eng.Sun.COM>
Sender: news@rice.edu (News)
Organization: Rice University, Houston
Lines: 126

> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>I found out from e-mail that essentially the same ideas
>>have been published before, in many separate papers.
>
>References, please.  I'd love to be enlightened.

Good ref's that many people should read include papers describing
the ECS project undertaken at IBM Yorktown in the mid-70's.
ECS was Experimental Compiling System.  Very roughly, it
worked like this:

	They defined an intermediate language (IL).
	This was fairly interesting by itself, and rather more
	extensive and complete than the norm.

	They defined (built?) an extensive IL optimizer

	Front-ends produced IL
	Libraries were kept in IL form

	There was the potential for extensive inlining
	with reoptimization (and more inlining, ...)

	New operations could be added by providing appropriate IL
	definitions.  With adequate inlining, constant propagation,
	type propagation, etc... these new operations could be
	extensively optimized (examples included string concatenation,
	which is trickier than many imagine).

	I have only a few references.

	"A Case Study of a New Code Generation Technique for Compilers"
	J. L. Carter
	CACM, 1977

	"A New Strategy for Code Generation -- 
		The General Purpose Optimizing Compiler"
	W. Harrison
	IEEE Transactions on Software Engineering, 1977

	"The Experimental Compiling System"
	F. Allen, Carter, Fabri, Ferrante, Harrison, Loewner, Trevillyan
	IBM Journal of R & D, November 1980

Unfortunately, the project apparently died after a few years.
Why?  I dunno.  Perhaps the success of the PL.8 work.  Perhaps
it was too complex.  Perhaps it didn't work.  Perhaps funding or politics.

Personally, I was frightened by the complexity of their
intermediate language.  I also wonder how to control
the huge potential inlining.  Can compile-times be controlled?

I think many people suggesting various forms of compiler extensibility
can learn a lot by reading theses guys.  In particular, the case study
paper illustrates how operations can interact and how difficult it
can be to optimize the results.

>>I still find it amazing that David can draw a line between optimizing
>>*(p+a) into a single instruction on a Convex, optimizing c = a/b and
>>d = a%b into a single instruction on most CISC chips, and optimizing a
>>short Hamming weight function into a single instruction on a Cyber.

There are significant differences between the three cases.

	The 1st is simply instruction selection based on a single tree.

	The 2nd is much trickier.  The 2 instructions don't share a common
	root, so we don't have any easy place to start looking (for
	efficient search).  They do share operands, which can be used as
	a big hint.  Hence the commonality can often be detected with a
	simple extension to value numbering (like Chase suggested).

	The extension many people are arguing for:

		c, d = a divrem b

	is much more significant (at least to languages like C or Fortran).
	Basically, you've suddenly defined a new language with significantly
	different syntax.  The front-end of the compiler will probably
	have to be chopped up pretty extensively, though the optimizer
	and code generator might survive unscathed.

	Better than a massive hack job would be to look into languages
	that allow various forms of multiple-assignment
	(Mesa, Beta, Lisp, Clu, others?).  Do some reading!

	The 3rd is much harder again.  Recognizing all the ways
	I might code a particularly complex function as being equivalent
	to some bizzarre instruction is quite difficult -- probably
	impossible.

	It's also wrong-headed, for many reasons.  Chase pointed out
	that there's only a small payoff.  Perhaps he didn't make his
	argument clearly enough.  Think of all the things that are wrong
	with some particular compiler, and prioritize the list.
	First is usually that it produces incorrect code in some
	important case.  It also compiles to slow, uses too much memory,
	and produces slow code.  Why is the object code slow?
	Well, we didn't do good enough strength reduction and
	the value number has bugs.  Our dependence analysis
	can handle subscript expressions that complex, and we
	don't know how to block loops for the TLB yet.
	Or whatever.  These matter a lot and occur over and over
	in everyone's code.

	And there you sit, working on that special hamming-weight
	function recognizer so Preston's code will run faster.
	Besides being a generally undecidable problem (program equivalence),
	Preston suddenly decides to change his code a little
	or the architects redefine how the special hamming-instruction
	works in the first place (say, how one of the condition
	code bits is set).

	Besides, is the instruction faster than those cute little
	RISCish instructions anyway?  Especially when the optimizer
	realizes that you've already done half a hamming-thing earlier
	and that it needn't repeat some of the instructions.

>>I give up.

Good idea.  One of the cool things about the net is that it provides a way
we can ask questions and get quick answers.  For example, I can ask
about the MIPS I-cache and usaually get a precise answer from someone
who knows.  It isn't helpful to tell John Mashey he's wrong about
how his machine works!  Similarly, you aren't going to win arguments
with David (or Ben) Chase about how compilers work.