Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.lang.misc
Subject: Re: Register Count
Message-ID: <PCG.91Jan14145154@odin.cs.aber.ac.uk>
Date: 14 Jan 91 14:51:54 GMT
References: <PCG.91Jan10162301@odin.cs.aber.ac.uk>
	<PCG.91Jan10194401@odin.cs.aber.ac.uk>
	<PCG.91Jan12152311@teachk.cs.aber.ac.uk>
	<1991Jan13.202053.22021@rice.edu>
Sender: pcg@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 165
Nntp-Posting-Host: odin
In-reply-to: preston@ariel.rice.edu's message of 13 Jan 91 20:20:53 GMT


On 13 Jan 91 20:20:53 GMT, preston@ariel.rice.edu (Preston Briggs) said:

preston> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

pcg> a[i], a[j] := a[m], a[n];

pcg> it involves *both* value, as maybe "i = j" but "a[m] /= a[n]", and
pcg> lvalue (if "j = m" or "i = n", ...) aliasing.

pcg> Three "solutions" are possible:

pcg> 1) [ ... assume aliasing will occur and use temporaries ... ]

pcg> 2) [ ... like 1), but encourage the programmer to use specific
pcg>    knowledge, if available ... ]

pcg> 3) A far more interesting solution is to add some cleverness to the
pcg> compiler. I mean, real cleverness, not like current "optimizers".

preston> and shows a solution that involves generating multiple versions
preston> of the assignment, with tests of i, j, m, and n to distinguish
preston> between the various versions.

preston> I don't like this for a several reasons.

I don't like it either :-). I prefer 1) or 2), as I had written fairly
clearly. Yet even 3) is defensible (and I will defend it ex officio),
and I added it for the sake of completeness. It is the more defensible
if the compiler is already implemented as symbolic reduction engine.

Some of your objections that follow look like objections to multiple
assignments in itself, not to compilers as symbolic reduction. The two
issues are totally independent; I was just illustrating that aliasing
resolution by generating multiple code sequences, a fairly fashionable
thing, can be done simply if the multiple assignment expander is
embedded ina symbolic reduction compiler.

preston> a) Where's the profit?  We save a register at best over the
preston> conservative solution in (1).  The cost though, is 1 or 2 tests
preston> at run-time.  A branch or 2 per register saved is no deal.

In this contrived example the benefit is next to nil and the cost,
especially in code space, is large.

But I read that with (much rarer!) more complex assignments and
expressions the ability to generate different code sequences and have
aliasing detected at run time is interesting, especially as the code
sequences compiled for the case where there is no aliasing can be
dramatically more efficient on heavvily pipelined machines (non general
purpose processors), and usually aliasing does not occur.

But this is an argument that applies to any flavour of runtime aliasing
resolution, whether produced by multiple assignment, by symbolic
reduction, or by both.

Moreover I am not persuaded that aliasing is really as bad as it is
described, for general purpose machines, but 3) offers a clean, clear
framework for dealing with it, where it matters (assignments), if you
think it is important.

preston> b) Tools already exist to help disambiguate array references,
preston>    e.g., dependence analysis.  While dependence analysis is not
preston>    always precise, it is conservative (safe)

Here indeed we get into the meat of the argument. The technology in 3)
is vastly simpler and more reliable as a consequence than any of the
analyzers that you mention, because it is strictly local (aliasing
detection is done at run time), and relies on the very clean and simple
semantics of multiple assignment.

The dependency analyzers that you mention not only have to detect
aliasing at compile time with a global view of the program, an often
impossible task, they have to do double work; they have, in essence, to
take a sequence of single assignments, reconstruct the multiple
assignment that is equivalent to the sequences, do the global analysis,
and then reexpand everything again. Compare:

A) the programmer writes a multiple assignment without caring about
order of execution or aliasing or data dependencies; the code generator
expands the multiple assignment once and for all deferring alias
detection at run time and uses a small algorithm that can be proven to
work for all possible cases to generate the various alternatives.

B) the programmer writes a sequence of single assignments that implement
the multiple assignments, making sure that the sequence is one of those
equivalent to the underlying multiple assignment, and taking care of
rsolving manually dependency and aliasing issues for that particular
sequence. The "optimizer" undoes all this, reconstructs the original
multiple assignment and its characteristic graphs, analyzes the whole
program hoping to resolve aliasing statically, and then reexpands the
multiple assignment using ad-hoc technology that needs constant
debugging.

preston>     and the technology is improving all the time.

Why bother improving an hacked up technology when you have a ready made,
simple, dependable alternative? Dusty decks, that's the only possible
answer.

preston>    Further, if you've taken the time to build a dependence
preston>    analyzer,

I'd rather spend one tenth of the time to implement a fast and furious
implementation of a simple technology that has important theoretical and
practical underpinnings than to hack up an "optimizer" and dependency
analyzer and spend the next few years debugging it :-).

preston>    you can use it to support more extensive and more profitable
preston>    transformations.

No, because here we are comparing (if I read you correctly) a global
analyzer that tries to resolve aliasing statically using global
analysis, and generates pessimistic code if it cannot, with one that
does only a local analysis and leaves alias resolution to run time and
generates both optimistic and pessimistic code.

Please note however that I think that local static analysis and
pessimistic code generation are usually better than either, because they
are simpler and nearly as good in a vast majority of cases.

preston> and finally

preston> c) I don't like so much special-purpose machinery (compiler code)
preston>    devoted to one programming language construct.

Hey, this is one of my lines; I hate big compilers.

As to the "one programming language construct", almost all time is spent
in assignments and expressions, and it is my contention that multiple
assignment captures essentially all the interesting cases. I cannot give
you much "proof" for this contention, yet; I appeal to your experience.

preston>    I'm much prefer general-purpose optimizations that help over
preston>    the entire language.

As in the above lines, there are reasons to prefer local dynamic
aliasing resolution to global static analysis, in simplicity and
efficiency.

However I am myself skeptical that symbolic, multiway expansion of
multiple assignment is a win for general purpose computers, but I
concede that there are good arguments that it can be important in some
cases (I am thinking of inlining and similar things, and improving
locality and pipelinability by offlining complex and rarely executed
cases); I am just poiting out that it can be done elegantly and cheaply
by embedding the multiple assignment expander in a symbolic reduction
compiler engine.

So we agree in that I would not choose 3) over 1) or 2) except in
special circumstances; we disagree in that I don't think that ad-hoc,
global, static data dependency and "optimization" is a cost effective
alternative, except for "dusty decks".

If one designs a language appropriately, a lot of "optimization" is
simply not needed. The one example I have advanced is designed to show
that with multiple assignment one can do a lot of scheduling and
aliasing resolution without requiring complex analysis by the programmer
and by an "optimizer" that second guesses the programmer. An "optimizer"
that second guesses the programmer is also not needed for optimal alias
resolution, if one goes the dynamic route, which has its attractions.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk