Path: utzoo!utgpu!attcan!uunet!lll-winken!lll-tis!ames!oliveb!Ozona!chase
From: chase@Ozona.orc.olivetti.com (David Chase)
Newsgroups: comp.arch
Subject: Re: Execute instructions, Bit Blit, Self-modifying code, Programming in Assembler
Message-ID: <26253@oliveb.olivetti.com>
Date: 28 Jul 88 17:59:55 GMT
References: <787@amethyst.ma.arizona.edu> <4235@saturn.ucsc.edu> <681@auvax.UUCP> <26195@oliveb.olivetti.com> <3435@polya.Stanford.EDU> <6412@aw.sei.cmu.edu>
Sender: news@oliveb.olivetti.com
Reply-To: chase@Ozona.UUCP (David Chase)
Distribution: na
Organization: Olivetti Research Center, Menlo Park, CA
Lines: 104

[First, though I talk about this, credit is due to other people.  I
imagine the Lisp crowd has known about this for ages, and Thomas
Breuel pointed it out to me.]

In article <6412@aw.sei.cmu.edu> firth@bd.sei.cmu.edu.UUCP (Robert Firth) writes:
>The cost of passing the environment pointer to a parametric procedure
>is rarely more than one instruction.  I grant you this is wasted if
>the procedure being called doesn't need it, but surely there is a
>minimum overhead of one instruction in getting from the runtime-
>generated code back to the "real" code of the called procedure?
>
>And, of course, when you generate the actual procedure parameter you
>don't need to evaluate its environment pointer if you can see it won't
>need one; likewise the entry sequence of a procedure can ignore any
>passed-in evvironment pointer.

I think the other (major) motivation for doing this (of course, the
one that I omitted) is to simplify interlanguage stuff.  In most
implementations of Pascal, Fortran, and C, a "function" or
"subroutine" variable (such as they are permitted at all) is
represented as a pointer to code, except for the case of (correct me
if I am wrong here) functions and procedures passed as parameters in
Pascal (*).  In that case, because the function might be nested
(inherit variables from its lexical ancestor) a pointer to the frame
(or similar structure) must be passed along with it (unless code is
generated as a I described in the previous posting).  This hurts you
in the following ways.

1) a (possibly bogus) frame pointer gets passed with EVERY procedure
passed as a parameter, whether it needs it or not.

2) ALL calls to a procedure passed as a parameter must set up the
lexical scope in case it is needed.

3) I can't pass nested functions or procedures passed as parameters to
other languages (e.g., Fortran ODE solvers or minimum finders, if you
need justification for this).  In fact, unless I tell my compiler that
"this is an interlanguage call", I can't pass ANY procedures as
parameters to subroutines written in other languages, because the
other languages won't be expecting to see that (possibly bogus) frame
pointer passed along with the procedure in the parameter list.

Given that many procedures are not nested, and given that many people
have an aversion to lexical procedures anyway (though some people love
them), I can't see how any minor speedups for the nested case justify
penalizing all other cases and hindering interlanguage calls so
severely.

As to the costs -- good code for doing this would be (on a 68k)

1) save the current LP (lexical pointer) if necessary.
2) load the one for the routine about to be called (using PC-relative
   arithmetic)
3) branch to the routine
....
4) The routine would restore the LP before returning.

This costs you a branch, I think, for each call of the routine, plus
the time to "generate" the code on entry to the lexical ancestor (one
block move plus a store).  Of course, if the nested procedure is not
passed as a parameter, then the ancestor procedure can just load the
LP and branch directly to the routine, removing the need to generate
code on the stack.  For a discussion of this and many other related
tricks, one might track down the Orbit papers in the '86 compiler
construction conference, or Guy Steele's work on the Rabbit compiler, 
or Luca Cardelli's work with his ML compiler.

Let me say it again, by cases:

1) non-nested procedure parameter
   pass as pointer to code.

2) incoming procedure parameter
   treat as pointer to code.

3) nested procedure, not passed as parameter. (+)
   treat as code + pointer, since all occurrences of the procedure
   are exposed to the compiler.

4) nested procedure, passed as parameter.
   Generate code on the fly, and use that for the parameter
   occurrence, but just load the LP and go for calls to the nested
   procedure from within the nesting procedure.

So, on-the-fly code generation only costs you when a nested procedure
is passed as a parameter.  I don't have statistics handy, so my
guesses are suspect, but I'll guess that that doesn't occur too often.

Of course, all the cost estimates change radically if I'm on a machine
with split instruction and data spaces, or some similar impediment.
That's what I was talking about in the first place, and what I wanted
to know about.  How is on-the-fly code generation handled in some of
these split cache/split bus architectures that seem to be appearing
nowadays?  Is it very costly, or even possible?  (and if so, how?)

(*) My Pascal knowledge is pretty wimpy, but I didn't see any big
block letters saying "don't pass nested procedures as parameters".

(+) This gets tricky if (say) A contains B and C, and B calls C, and B
    is passed as a parameter.  However, from within B it will be
    possible to recover the lexical environment for C, so we are OK
    (in fact, in this case the LP is the same for B and C).

David Chase
Olivetti Research Center, Menlo Park, CA