Path: utzoo!attcan!uunet!ncrlnk!ncrcae!hubcap!gatech!bloom-beacon!CHAMARTIN.AI.MIT.EDU!jinx From: jinx@CHAMARTIN.AI.MIT.EDU (Guillermo J. Rozas) Newsgroups: comp.lang.scheme Subject: Y, again (long) Message-ID: <8812151447.AA06672@chamartin.ai.mit.edu> Date: 15 Dec 88 14:47:30 GMT Sender: daemon@bloom-beacon.MIT.EDU Reply-To: jinx@zurich.ai.mit.edu Organization: The Internet Lines: 231 The recent flurry of messages about the Y operator has prompted me to spend yet a little more time on the MIT Scheme compiler. The most recent version (4.33) does a good job on some versions of the Y operator, but not on all. This message should clarify what it can currently do and what it cannot. I should point out that there is no explicit "Y-recognizer" in the compiler, but rather the results are a consequence of various of its parts: - value analysis (which values can reach what variables). In particular, a variable's value is "known" if only one value can reach it. - closure analysis (which lambda expressions need to be represented by actual objects at run time, instead of having an implicit representation in the code stream) - a limited side effect analysis which allows the compiler to "punt" some calls when the callee has no side effects and the value returned is unused or known and available at the call point. This is the piece of code that I've written recently to make the examples below "work". Some of my thoughts about the points raised in recent messages: As far as using some version of Y to implement LETREC, we could probably start doing it and users would hardly ever notice the difference, but aside from conceptual elegance nothing would be gained, so it's not really worth it in a "real" compiler. Why have I spent any energy getting the Y operator to "work"? From a compiler writer's point of view, I'm not interested in the Y operator per-se, except as a "cute" hack. It just happens to be a very good test case for value and closure analysis, and if the compiler can do a good job on code that uses it, it will hopefully also do a good job on the "simpler" cases that occur in practice. About eq?/eqv?-ness of procedures, I should say that the only time we may need to distinguish multiple procedures whose "code" is the same lambda expression is when they can coexist in time, and free variables can cause the different instances to behave differently (return different values and/or perform different effects). Clearly, any correct compiler cannot "merge" two such instances. The rest of the cases are for philosophers to argue about. To make them happy, compiler writers may have to provide a "pedantic" switch in the compiler to prevent merging of otherwise identical procedures. I'm not saying that determining when two procedures "act the same way" is feasible (we all know about the halting theorem), but if we are able to determine in some instance that the only thing that could discriminate between two procedures is eq?/eqv?, there is really no reason not to merge. Here are some examples of what the MIT Scheme compiler can currently do: The following versions of factorial turn into identical (bitwise) code: ;; version 1, using letrec. reference version (define fact (letrec ((fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1))))))) fact)) ;; version 2, "subroutinized" "standard" applicative order (define fact ((lambda (f) (let ((g (lambda (g) (f (lambda () (g g)))))) (g g))) (lambda (fg) (lambda (n) (if (= n 0) 1 (* n ((fg) (- n 1)))))))) ;; version 3, "standard" normal order (define fact ((lambda (f) ((lambda (g) (f (g g))) (lambda (g) (f (g g))))) (lambda (fact) (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))))) Note that version 3 "has no right to work" in Scheme, but I've never found anything wrong with compilers that "fix" users' code. The following two versions turn into code essentially identical to the code generated for the versions above, with the differences described below. ;; version 4, another "subroutinized" "standard" applicative order (define fact ((lambda (f) (let ((g (lambda (g) (f (lambda (x) ((g g) x)))))) (g g))) (lambda (fact) (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))))) ;; version 5, "inline coded" Y (define fact (let ((kernel (lambda (p n) (if (= n 0) 1 (* n (p p (- n 1))))))) (lambda (n) (kernel kernel n)))) The code generated for version 4 differs from the code in versions 1-3 only in the order in which the basic blocks appear in the linearized output. The code for version 5 differs because the compiler is currently not smart enough to eta convert (lambda (n) ...) into (lambda (p n) ...) after it determines that p in (lambda (p n) ...) is used for "self reference" and unneeded (and dropped). Thus there is an initial call (branch instruction) from the code generated for (lambda (n) ...) to the code generated for (lambda (p n) ...), but the actual "recursive" procedure is coded identically to the other cases above. As an example of a hairier case where the compiler "wins", it generates identical code for (define (delq! item items) (letrec ((trim-initial-segment (lambda (items) (if (pair? items) (if (eq? item (car items)) (trim-initial-segment (cdr items)) (begin (locate-initial-segment items (cdr items)) items)) items))) (locate-initial-segment (lambda (last this) (if (pair? this) (if (eq? item (car this)) (set-cdr! last (trim-initial-segment (cdr this))) (locate-initial-segment this (cdr this))) this)))) (trim-initial-segment items))) and (define delq! (lambda (item items) (((lambda (b f g) ((lambda (k1 k2) (b (lambda () (k1 k1 k2)) (lambda () (k2 k1 k2)))) (lambda (k1 k2) (f (lambda () (k1 k1 k2)) (lambda () (k2 k1 k2)))) (lambda (k1 k2) (g (lambda () (k1 k1 k2)) (lambda () (k2 k1 k2)))))) (lambda (p1g p2g) (lambda () ((p1g) items))) (lambda (p11g p12g) (lambda (items) (if (pair? items) (if (eq? item (car items)) ((p11g) (cdr items)) (begin ((p12g) items (cdr items)) items)) items))) (lambda (p21g p22g) (lambda (last this) (if (pair? this) (if (eq? item (car this)) (set-cdr! last ((p21g) (cdr this))) ((p22g) this (cdr this))) this))))))) Examples where the compiler "loses": ;; version 6 "non-subroutinized" "standard" applicative order (define fact ((lambda (f) ((lambda (g) (f (lambda () (g g)))) (lambda (g) (f (lambda () (g g)))))) (lambda (fg) (lambda (n) (if (= n 0) 1 (* n ((fg) (- n 1)))))))) ;; version 7 another "non-subroutinized" "standard" applicative order (define fact ((lambda (f) ((lambda (g) (f (lambda (x) ((g g) x)))) (lambda (g) (f (lambda (x) ((g g) x)))))) (lambda (fact) (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))))) Version 6 does not win because the compiler is not smart enough to realize that both copies of (lambda () (g g)) are indistinguishable. This causes (lambda (n) ...) to be represented as a closure (with one free variable, namely fg), and even though the compiler realizes that the call (fg) always returns a version of (lambda (n) ...) and therefore codes the call as a direct branch, there is still overhead associated with the closure and computing its "next" version (which will be unused). Version 7 does not win for a similar reason. Again, the compiler does not realize that both copies of (lambda (x) ...) are identical and therefore causes (lambda (n) ...) to be closed over fact. Both calls ((g g) x) are known to be calls to (lambda (n) ...), and are translated as direct branches after computing the "next" (useless) closure for (lambda (n) ...) There are many other cases where it will lose due to the "non-uniqueness" of the values which may reach individual variables. An altogether different reason why it may lose is the fact that once a closure is detected, a "pessimistic" code generation strategy is used which forces some of the "useless" procedures and overhead to reappear.