Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!cs.utexas.edu!wilson From: wilson@cs.utexas.edu (Paul Wilson) Newsgroups: comp.lang.scheme Subject: Macros again (long) Summary: clarifications Keywords: macros, hygeinic macro expansion, extend-syntax, lexical scope Message-ID: <1510@yoakum.cs.utexas.edu> Date: 3 Jun 91 16:29:59 GMT Organization: U Texas Dept of Computer Sciences, Austin TX Lines: 206 It appears that I should clarify a lot of what my previous postings said. First, when I referred to lexically scoped macros, I did indeed refer to (roughly, at least) hygeinic macro expansion. (More on this below.) Second, when I said that standardized macros are the easiest-to-fix barrier to portable Scheme code, I meant that at two different levels: 1. It looks like hygeinic macro expansion is getting to be mature and well-understood enough that at least the most basic functionality could be standardized with a standard syntax, now. Fancier functionality could be standardized later. 2. Even if that's not the case, it is past time to standardize some kind of basic macro functionality. If necessary, leave it undefined whether they're hygeinic or not. For a lot of simple but unimportant macro uses, it's okay (more on this below, too). Now for a little more detail... The reason that I refer to hygeinic macro expansion as lexically scoped is that, as I understand it -- correct me if I'm wrong -- there is a very close analogy between the scoping of syntax bindings and the scoping of variable bindings. Not only that, but given Scheme's unified naming principle, it's the *same* scoping principle in a deep sense. The "painting" or "renaming" of variables is really just a mechanism for implementing multiple syntactic scopes. The more straightforward way of doing this would be to "close" syntax bindings in syntactic environments, using an actual data structure. Not only that, but those syntactic environments are most straightforwardly implemented using normal variable binding environments with a couple of added features. Instead of using raw symbols as names for syntactic constructs, you'd use a symbol closed in a syntactic environment, in very much the same way that you use a closure instead of a raw procedure. (This differs from my previous posting in which I talked about compile-time environments, roughly a la the ones in Abelson and Sussman's Scheme-in-Scheme compiler. Forget that for now -- it's even simpler if you just have an interpreter.) An example may make this a whole lot easier to understand. Suppose we have a form like this: (let ((foo 1)) (let-syntax ((bar )) (let ((baz 2)) ))) The straightforward way to interpret this is to create a binding contour (extend the binding environment) when the first let is entered. That extends the prior environment with a binding for foo. Similarly, when the let-syntax is entered, a new binding contour is created, binding bar. But since this contour is syntactic, it has a couple of special properties -- for one, the value is immutable, and for another, the value is used specially. But still, it's a binding environment pretty much like any other. And when we enter the inner let, another contour is created for baz. If we implement binding environments in the obvious way, using a scoping chain, then by the time we've gotten to the , our binding environment looks like this [foo:1] -> [bar:] -> [baz:2] -> When hygeinic macro expansion "paints" or "renames" a symbol, that's really just making a unique name that specifies a symbol _bound_ _in_a_particular_environment_. So if I rename the variable "bar" as "bar-2" or whatever, what that's really saying is _this_particular_ _bar_in_this_particular_environment. The use of renaming is just a technique for specifying bindings that happens to be convenient when you're implementing macro-expansion as a prepass. But the more straightforward, Scheme-like way is to integrate it into the interpreter and actually use an actual reference to the binding environment to "paint" symbols introduced in that environment. I'm not necessarily suggesting that syntactic binding environments should be implemented this way, rather than using renaming, but it's conceptually simpler and clearer. And it avoids needing a prepass that does all of the front-end work an extra time -- it's all one scoping mechanism, so it may as well all be done in one pass, right? When it comes to compilers (instead of interpreters), these syntactic binding contours would be present in the compile-time environments, but not in the actual runtime environments. (I extended the Scheme-48 compile-time environments in an analogous way to handle loop labels; you just ignore the syntax contours when computing lexical addresses, since they won't be reflected in the runtime scoping chain. Other than that, they play pretty much the same role in name resolution as normal contours -- one namespace, one scope chain. This experience is what prompted my comment about any decent Scheme compiler being hairy enough that supporting syntax contours is no biggie. Just have three kinds of links in the compile-time scope chain, rather than two.) My comment about "reifying" syntactic environments was a bit careless. All I really meant was that you should have some kind of hook into the scoping mechanism, if you want to make it easy for people to implement macro mechanisms in the language. Maybe pattern-matching (or whatever) can be optional, but you should be able to "get at" the scope mechanism somehow or other. (Using the approach above, where symbols are renamed by pairing with environment objects, the problem of generating unique identifiers is solved in the most straightforward way -- syntactic binding environments are first class objects with identities in the compiler code. Whether they should literally be first class in the language is something else.) ------ About Scheme as Lisp Lite vs. Lisp Right: it seems to me that getting macros right is not going to make a Scheme system all that much bigger, and if you're worried about that increment in size, you're probably majoring in the minors. A bigger worry is garbage collection -- unless you have a generational gc, your locality is going to be the absolute pits. Especially if you don't have a stack cache, and you really heap-allocate all of your continuations and environments. If you use a simple gc, you're going to use a lot of memory, or page a lot, or incur a lot of gc cost. If that's not true in your system, it's probably because your system is really, really slow, and that hides the gc cost. But maybe really, really slow is okay in an extension language interpreter. Also, the Lisp Lite vs. Lisp Right controversy might be considered to be a conflict between the Lisp 1.5 heritage and the Algol 60 heritage. Lisp Right (hygeinic) macros fit in with the Algolic idea of scoping and abstraction, where Lisp Lite (traditional) macros retain the idea of S-expressions as being interpretable data structures. Hygeinic macros get away from viewing programs as data structures that you can arbitrarily hack around with and execute. It's a move towards a program representation that has more sophisticated (and restrictive) structure, but is better suited for writing structured programs. It seems that the committee is generally more enamored of the Algol-60 design philosophy than the Lisp 1.5 philosophy. On the other hand, if syntactic environments were really first-class, you could probably unify the two philosophies; maybe you could even write the pattern- matching facilities for macros cleanly in the language. (?) ------ About least-common-denominator macros: the thing that spurred my original posting was that I was fixing up Chris Hanson's balanced binary tree code to run efficiently in a fairly simple Scheme system. Chris' code takes advantage of a couple of MIT Scheme features, in particular define-integrable. A common use of define-integrable is in defining access procedures for data abstraction. I didn't want to replace those uses with simple defines for efficiency reasons -- I wanted to come up with a portable version of Chris' code that was still very efficient, if not *quite* as pretty. The tried and true Lisp way of doing this is to use a macro. So all of those access procedures could be changed to macros with no efficiency cost. And the macros would expand into things that a small Scheme compiler like Scheme-48 is smart enough to inline. So my code would only be a shade uglier than Chris', but portable and fast. This is what motivates me to say that some least-common-denominator macro could be defined now, and enhanced later. If I can write a portable macro equivalent to (define-integrable (bbt-node-parent node) (vector-ref node 3)) I'll be a lot happier. Just standardizing the *syntax* for this kind of trivial macro would be helpful, even if it's not specified how free variables are treated. (If my code is the only code that uses bbt-node-parent -- because it's not an interface to the ADT -- I can make sure that it's safe because none of the code does anything weird with the variables referenced in the body.) This is enough to let me write code that's efficient, portable, and pretty _enough_ that nobody would ever have to go back and clean it up or upgrade it -- it would be suitable for a library routine. That's why macros are necessary -- to patch over some of Scheme's other deficiencies in terms of data abstraction, etc. Sure, it's a quick-and- dirty way of solving problems, but it can be done reasonably cleanly if you're careful. We need to be able to write decent, efficient, reusable code soon, rather than waiting until Scheme is complete and perfect. (And the example above is particularly relevant to a Lisp Lite Scheme system -- when you can't count on a fancy compiler, you need macros for speed-critical code. For a Lisp Right system, the mechanism for lexical scope of macros will also support things like define-integrable.) -- Paul P.S. I don't claim to be an expert on any of this, especially on the technical details of macro expansion; I'm just a user who wants his macros. Feel free to correct any misconceptions, etc. -- | Paul R. Wilson, Interactive Computing Envts. Lab. lab ph.: (312) 996-9216 | | U. of Illinois at Chicago EECS Dept. (M/C 154) wilson@bert.eecs.uic.edu* | | P.O. Box 4348 Chicago,IL 60680 fax ph.: (312) 413-0024 | | *Next yr, Asst Prof, UT Austin CS Dept-- after 8/1 use wilson@cs.utexas.edu |