Path: utzoo!attcan!uunet!lll-winken!ames!mailrus!csd4.milw.wisc.edu!uxc!uxc.cso.uiuc.edu!uxg.cso.uiuc.edu!uicbert.eecs.uic.edu!wilson
From: wilson@uicbert.eecs.uic.edu
Newsgroups: comp.lang.smalltalk
Subject: dynamic compilation, tentative opti
Message-ID: <67500004@uicbert.eecs.uic.edu>
Date: 13 Jan 89 01:49:00 GMT
Lines: 216
Nf-ID: #N:uicbert.eecs.uic.edu:67500004:000:10312
Nf-From: uicbert.eecs.uic.edu!wilson    Jan 12 19:49:00 1989


I am looking for references to advanced techniques in dynamic compilation
and tentative optimization.  (Is anybody out there working on such things?
Is anybody doing anything beyond the Deutsch-Schiffman compiled-method
caching and inline caching of method lookups?)  In particular, I'm interested
in systems that compile specific *versions* of methods/procedures that
handle polymorphic arguments.

(Especially systems that keep more than one version around, and systems
that proceed to do interesting optimizations based on the particular
combinations of types that it actually encounters.)

The closest thing I've found to what I'm looking for is a paper by E.J.
Van Dyke in the Hewlett-Packard Journal in July 1977  ("A Dynamic Incremental
Compiler for an Interpretive Language," pp. 17-24). 

Van Dyke describes Hewlett-Packard's APL/3000, which compiles each
procedure the first time it encounters it.  It initially compiles a
very specific version, which assumes that the arguments will be of exactly
the same type next time (e.g., that array bounds will be the same).  The
The compiled code contains a short prologue of "signature code" that checks
to see if the assumptions are correct.  If not, the code "breaks" and calls
the compiler to recompile it.  It is then recompiled with fewer assumptions
(e.g., array type but not array bounds).

I posted my own ideas for dynamic compilation and tentative optimization
on the Scheme list a while back, which is how I got referred to Van Dyke.
I've tagged those postings onto the end of this one for anyone who's interested.
(They contain some basic references to dynamic compilation articles.  I'm
really looking for more advanced techniques.  The big question for me is
what has happened in tentative optimization in the last ten years.)

I'm told T.C.Miller's 1978 thesis from Yale is also relevant, but have not
received it from Yale yet.  It's called "Tentative Compilation: a Design
for an APL Compiler" (Yale U. DCS 78-CS-013, May 1978.)


-------------------------------------
First posting:
-------------------------------------
I've been thinking about using dynamic compilation for Scheme, and wondering
if anybody else is.  Have people pretty much abandoned the idea of dynamic
compilation, and if so, was it premature?

Some of the advantages might be:

   1) portable images could still run fast
   2) an optimizer could be called in to optimize those chunks of
      code that are ACTUALLY executed a bunch of times.
      (rather than just optimizing the time/space tradeoff between
       interpreted and compiled code, it could be used to adjust
       compile time/run time tradeoffs.)
   3) Transparent, automatic compilation and optimization.

   In particular, I'm thinking about a scheme in which inlining would
be done automatically, but dependencies would be recorded.  If somebody
changed the value of an inlined procedure or constant, the code
depending on those values would be invalidated and dynamically
recompiled at runtime.

   (The mechanism for this depends on near-lexical-scoping.  Procedures
that are never redefined in a lexically apparent way could be inlined.
Primitives that can modify things (like first-class environments) in
non-lexically-apparent ways would have to check for dependencies.)

   Anybody care to comment on this idea?  In particular, is there some
irreducible cost of dynamic compilation that I don't know about?  Any
advantages?

   (It seems to me that you might dynamically determine what types of
arguments frequently-executed procedures were frequently called with.
You could then compile special versions for the common combinations.
A large-grained check would dispatch to the proper version, within
which most type checking could be eliminated using type inference.
Would this be workable/worthwhile?)


----------------------------------------
second posting:
----------------------------------------
There seems to be some interest out there, so here's a bit more detail
on those dynamic compilation ideas:

*  When a chunk of code is dynamically compiled (not optimized yet), it
   is given a little prologue that increments a count every time it is
   executed.  When the count exceeds a threshold, the code is recompiled
   with a much higher level of optimization.  The most optimized code
   would not have this prologue to slow it down, since it wouldn't
   need it anymore.  Highly optimized code might be less subject
   to being discarded as well.

*  The business about noticing common types of arguments and compiling
   specially optimized versions could be handled similarly.  A little
   prologue would check every nth execution for types, rather than
   every execution.  Most times, it would just increment a counter.
   Very highly optimized code could forego the checking and counting
   entirely, assuming it had already learned all of the likely argument
   type combinations.  (Actually, every nth time it would change n
   a little to avoid rhythmic problems.)

*  Something like this might also be useful for trace scheduling.
   Every now and then you execute an "instrumented" version of the
   code that collects data on common execution patterns.

Any comments?  Does anybody know of any systems that use dynamic
compilation?  I know a couple of Smalltalks do it, and now
I'm told that MIT Scheme has a somewhat  similar cache-with-
dependencies mechanism for variable lookups.

Does anybody have any opinions on the expected efficiency of such a system?
It seems to me that it could be quite good.  It could also be advantageous
in that it would not optimize a lot of the code.  In particular, it could
keep things in a more debuggable form than something that optimizes
more stupidly, without much performance penalty.  (Especially since
frequently modified things will never get heaviliy optimized.)

I only have a few references on dynamic compilation, none of which go into
anything advanced.  If you have others, please send them to me and I'll
pass them on to interested parties.

(Besides the ones in Andy Gordon's posting, there was "An Evaluation
of Throw-away Compiling" in Software Practice and Experience, vol. 13
(1983), pp. 241-249.)

 [ NOTE: here are the citations from Andy Gordon's posting:
   
  [1] A. D. Gordon, "How to Breed Hybrid Compiler/Interpreters",
      Technical report ECS--LFCS--88--50, Laboratory for the 
      Foundations of Computer Science, Computer Science Dept.,
      University of Edinburgh, Edinburgh, EH9 3JZ, Scotland.
      April, 1988.

  [2] P. J. Brown, "Throw-away compiling", Software P & E, 6,
      pages 423--434, 1976.

  [3] J.R. Low, "Automatic data structure selection: an example and 
      overview," CACM, 21(5), pages 376--385, May 1978.

  [4] R. J. Dakin and P. C. Poole, "A mixed code approach", The Computer
      Journal, 16(3), pages 219--222, 1973.

 ]


I have not gotten any responses indicating that dynamic compilation has
been used to control levels of optimization or to recompile when
optimizer assumptions are violated.

  [ NOTE: after I posted this originally, I did find the two references
          mentioned above, thanks to Greg Jaxon at U. of Illinois. ]

---------------------------------------
third posting:
---------------------------------------
To motivate the discussion of dynamic optimization, here's a simple
example of why I think it might be worthwhile.

Suppose you have a language in which you can have homogeneous collections
like array-of-integer.  (Or an implementation that lets you declare them.)
Suppose also that the language is basically dynamically typed, and that
the declared types are attached to the objects at runtime.  So an array
of integer would have a header saying so.

  (That is, an array-of-integer will always contain only integers, but
you can't look at a particular point in the source code and tell what
class(es) an argument could belong to.  So some objects are "statically"
typed in that their components don't change types, but the code is
not amenable to static type analysis -- you can always call a procedure
with any kind of argument you want.  If types can be created at runtime,
then the set of possible types at many points in the code could be infinite.)

Now suppose we define a procedure that processes collections, looping to
apply some operation to its elements.  A dynamic optimizing system
would generate a generic version of the code that checks the types
of the arguments and then dispatches to an optimized version for those
types.  If it has never seen that combination of types before, a new
version is created on the fly.

So the first time this generic code sees an array-of-integer, it compiles
a version of the looping routine that assumes each element is an integer,
omitting tag checks, unrolling the loop, etc.

The generic code must always check the collection's type when it is
executed, but the compiled code it dispatches to needn't.  (If it's
not a homogeneous collection, it simply dispatches to a normal
version with type checking.)

The nice thing about this is that the code is still generic from the
programmer's point of view, and type checking is not abandoned.  (Unlike
systems in which declarations tell the compiler to make assumptions
at a particular point in the code.)

My guess is that for most programs, most routines are only ever given
one kind of argument, so there wouldn't be an explosion of versions.
(You can always use a cache and throw away infrequently used versions.)

At least for this example, the optimization would be transparent and
safe.  If somebody changes (at another point in the code) the type
of some object that is passed to the optimized routine, the dynamic
check will ensure that an appropriate version is generated.

Now the question is how general this system could be, using more
sophisticated type inference than recognizing homogeneous collections.
And can it be combined with nifty trace-scheduling techniques to 
precompute paths through often-executed chunks of code and seriously
optimize them?

-----------------------------------------------------------------------


Paul R. Wilson                         
Human-Computer Interaction Laboratory
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@uicbert.eecs.uic.edu
Box 4348   Chicago,IL 60680