From: lyn@altdorf.ai.mit.EDU (Franklyn Turbak)
Newsgroups: comp.lang.scheme
Subject: Why macros impair readability (long)
Message-ID: <9103041928.aa21342@mc.lcs.mit.edu>
Date: 5 Mar 91 00:27:58 GMT


This is a long message on why macros impair readability.  Read the
abstract for a summary.  I'm interested in hearing feedback, alternate
models, and lots of anecdotes and examples.  

- Lyn -

---------------------------------------------------------------------------

		    WHY MACROS IMPAIR READABILITY
					
			   Franklyn Turbak
			    March 4, 1991


ABSTRACT
--------

Arguments about program readability are often based on many implicit
assumptions about the definitions of "reader" and "readability".  Such
arguments would be more compelling if (1) these definitions were made
explicit and (2) the arguments were based on technical considerations
in terms of concrete examples.  Using an interpretation where
"readable" means "supporting local reasoning about program fragments",
I discuss four reasons why macros impair readability in Scheme:
  
  1. Applicative Order Evaluation
  2. Static Scope
  3. Procedures as First-Class Objects 
  4. Debugging

I conclude with an entreaty for alternate analyses and anecdotes of how
macros help/hinder code readability.

WHAT IS READABILITY?
--------------------

  In the recent debate on the Scheme mailing list about the advantages
and disadvantages of macros, a number of arguments were made about
"readability" of code containing macros.  But the notion of
readability is rather slippery.  First of all, "readable" code admits
many possible interpretations, including:

    * Concise - free of baggage not important to the ideas being expressed.

    * Understandable - matched with the reader's mental structures.

    * Expressive - accurately conveys the writer's mental structures.

    * Modifiable - structured to permit extensions and variations.

    * Well-documented - contains helpful descriptions, comments, names.

    * Verifiable - aids in the proof (by people or machines) of properties 
                   of the described process.

    * Recallable - easy to remember or rederive.

    * Teachable - explainable to someone else.

  Second, all these interpretations depend a heck of a lot on who the
reader is.  Readers vary widely in background, programming skill, and
purpose for reading the code.  A "most readable" style is a fiction;
what is clear and concise to some readers may be inscrutable to others.
Many people can (and do) agree on matters of programming style; nevertheless,
different styles are tuned to different kinds of readers.  For example:

    * A WHILE macro may be a boon to an imperative thinker and anathema 
      to a functional thinker.

    * Extensive documentation that aids some readers gets in the way of other 
      readers who want to see more of the code in a single editor buffer.

    * Conventions like thunking args to delay evaluation, implementing 
      message-passing objects as procedures, or using continuation-passing
      style to achieve nonstandard control flow are clear as day to those 
      facile with these techniques, but (1) pose difficulties to those
      not familiar with these devices and (2) are candidates for abstraction
      by those who believe such details obscure the essence of the code.

    * An interpreter using concrete rather than abstract syntax is 
      well-suited for class presentation because it is shorter and is more 
      likely to fit in its entirety on a blackboard.  On the other hand, 
      a version with abstract syntax may be better suited to lab study, 
      where the code readers may want to implement an alternate syntax.

    * For a person who simply wants to use a given program, a description 
      of its interface & behavior and clearly marked entry points are 
      crucial.  For someone attempting to extend a program, hierarchical
      structure and accessibility of "hooks" are important.  Clear structuring
      of data and control flow are essential for readers who want to understand
      particular algorithms.


TOWARDS A MORE OBJECTIVE ANALYSIS OF READABILITY
------------------------------------------------

  Given the above, it's easy to see how discussions about readability
can easily degenerate into religious squabbles.  If everyone assumes
his/her own interpretation of "readable" and "reader", then people
aren't really debating the same issue.  One way to improve the
situation is for discussants to be more explicit about their
assumptions.  Another improvement would be the use of specific examples
rather than vague generalities.  "This particular macro
improves/impairs readability because ..." is much more convincing than 
nebulous claims about factors enhancing or detracting from readability.

  But even more desirable would be arguments with a more formal or
objective basis for comparison.  In light of this goal, I consider
four linguistic issues to illustrate why Scheme code using macros can
be less readable than Scheme code without macros.  Here, I use the
term "readable" to mean "easy to reason about locally", where locally
refers to the fact that certain conclusions can be made about a code
fragment without knowing the full context in which it occurs.  I also
assume that the program being read is a large one, so that there is a
nontrivial overhead to obtaining global information, such as finding
top-level definitions.  Local reasoning is particularly valuable in
such situations.  Finally, I assume that the reader desires a detailed
understanding of the code, not just a feel for its high-level structure.

1. APPLICATIVE ORDER EVALUATION 
-------------------------------

   A common use of macros is to simulate normal order evaluation of arguments
   within Scheme's applicative framework.  For example, it is possible
   to implement lazy pairs by the desugarings:

      (LAZY-CONS <exp1> <exp2>) => (CONS (LAMBDA () <exp1>) (LAMBDA () <exp2>))

      (LAZY-CAR <exp>) => ((CAR <exp>))
      (LAZY-CDR <exp>) => ((CDR <exp>))

   (Both LAZY-CAR and LAZY-CDR could be procedures, but LAZY-CONS must be
   a macro.)
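
   As a minimal sketch of the desugarings above, assuming a Scheme that
   provides DEFINE-SYNTAX and SYNTAX-RULES (standard since R5RS), lazy
   pairs might be written as follows.  The helpers INTEGERS-FROM and
   LAZY-TAKE are hypothetical names added only for illustration.

```scheme
;; LAZY-CONS must be a macro: it wraps both operands in thunks
;; before either one is evaluated.
(define-syntax lazy-cons
  (syntax-rules ()
    ((_ a b) (cons (lambda () a) (lambda () b)))))

;; LAZY-CAR and LAZY-CDR can be ordinary procedures: they just
;; call the stored thunk.
(define (lazy-car p) ((car p)))
(define (lazy-cdr p) ((cdr p)))

;; Hypothetical helper: the infinite stream n, n+1, n+2, ...
;; This terminates only because LAZY-CONS delays the recursive call.
(define (integers-from n)
  (lazy-cons n (integers-from (+ n 1))))

;; Hypothetical helper: the first k elements as an ordinary list.
(define (lazy-take s k)
  (if (= k 0)
      '()
      (cons (lazy-car s) (lazy-take (lazy-cdr s) (- k 1)))))

(lazy-take (integers-from 1) 3)   ; => (1 2 3)
```

   If LAZY-CONS were a procedure, the call to INTEGERS-FROM would recurse
   forever before any pair was built.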

   Although using macros in this way can reduce the clutter of thunks, it
   makes it more difficult to reason about the evaluation of expressions
   that appear in the argument positions of a procedure/macro call.
   In macro-less Scheme, for example, the expression

      (unknown (letrec ((loop (lambda () 
                                (loop)))) 
                 (loop)))

   must be nonterminating regardless of the meaning of UNKNOWN because 
   all arguments must be evaluated before the procedure is called.  But 
   in the presence of macros, an argument expression may be evaluated 
   zero times, so the above could return a value.  Macros
   require the reader to use more global knowledge to understand this 
   fragment.
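
   To make the zero-evaluation case concrete, here is one hypothetical
   definition of UNKNOWN (an assumption for illustration, written with
   SYNTAX-RULES) under which the fragment above returns a value:

```scheme
;; Hypothetical macro: it discards its operand without evaluating it.
(define-syntax unknown
  (syntax-rules ()
    ((_ arg) 'never-evaluated)))

;; The infinite loop is never entered, so this definition completes.
(define result
  (unknown (letrec ((loop (lambda ()
                            (loop))))
             (loop))))

result   ; => never-evaluated
```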

   Similarly, with macros an argument expression might be evaluated
   more than once.  This can wreak havoc in the presence of side effects.
   In the expression

      (let ((x 0))
        (unknown (begin (set! x (+ x 1)) 
                        17)))

   X is incremented exactly once in macro-less Scheme, but if UNKNOWN
   were a macro, X might be incremented any number of times (including
   zero), depending on the definition of UNKNOWN.
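
   For instance, under the following hypothetical definition of UNKNOWN
   (again an assumption, written with SYNTAX-RULES), the operand is
   evaluated twice.  The name X-AFTER is introduced only so the final
   value of X can be inspected.

```scheme
;; Hypothetical macro: its operand appears twice in the expansion,
;; so any side effect in the operand happens twice.
(define-syntax unknown
  (syntax-rules ()
    ((_ arg) (+ arg arg))))

(define x-after
  (let ((x 0))
    (unknown (begin (set! x (+ x 1))
                    17))
    x))

x-after   ; => 2
```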

   Granted, the above examples are contrived, and it is generally
   considered bad policy to have side-effects in argument positions.
   Nevertheless, the same problems can crop up in much more natural situations.
   The point is that local reasoning valid in a purely applicative-order
   language is no longer necessarily valid in the presence of macros.

   Note that this problem is ameliorated by Aubrey Jaffer's suggestion
   of distinguishing macro names from procedure names (or macro calls from
   procedure calls).  In that case, the usual Scheme reasoning can be 
   used in the vast majority of the cases (procedure calls), but the 
   potentially troublesome cases are syntactically flagged.

2. STATIC SCOPE
---------------

   The kind of lexical reasoning enabled by Scheme's static scope can be 
   invalidated in the presence of macros.  Consider the expression:

       (let ((return (lambda (n) (* 2 n))))
         (block 
           (+ 100 (return 3))))

     If BLOCK were a procedure, then the RETURN that appears within its
   argument would have to refer to the multiply-by-two procedure, and the
   meaning of the expression would be the same as that of 

       (block 106)

   Of course, we couldn't say more about the meaning of the whole expression
   until we also knew more about the behavior of the BLOCK procedure. But
   it would still be possible to make a firm conclusion about the value of 
   BLOCK's argument without any more global information.

     In the presence of macros, all bets are off, since BLOCK might be a 
   macro that (intentionally) binds the name RETURN within its scope.
   E.g., a desugaring for BLOCK might be:

       (BLOCK <exp>) => (CALL-WITH-CURRENT-CONTINUATION
                          (LAMBDA (RETURN) <exp>))

   (Note this naming issue is different from the "accidental name capture"
   problem associated with faulty macro implementations.  Here the 
   macro writer really wants the name RETURN to be captured.)
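
   A hygienic SYNTAX-RULES macro cannot capture RETURN this way, so the
   sketch below assumes an implementation with SYNTAX-CASE, WITH-SYNTAX
   and DATUM->SYNTAX (as in R6RS and several other systems).  It is one
   possible rendering of the desugaring, not a definitive one; RESULT is
   a name added only for illustration.

```scheme
;; Deliberately unhygienic: RETURN is created in the lexical context
;; of the macro call, so it captures uses of RETURN in the body.
(define-syntax block
  (lambda (stx)
    (syntax-case stx ()
      ((k body)
       (with-syntax ((return (datum->syntax #'k 'return)))
         #'(call-with-current-continuation
             (lambda (return) body)))))))

(define result
  (let ((return (lambda (n) (* 2 n))))
    (block
      (+ 100 (return 3)))))

;; Inside BLOCK, RETURN is the escape continuation, not the doubling
;; procedure, so (return 3) abandons the addition.
result   ; => 3
```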

   The BLOCK macro might even treat its entire argument as text:

       (BLOCK <exp>) => (QUOTE <exp>)

   Here, the name RETURN is just a symbol, and not a variable
   reference after all.
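
   The quoting variant can be sketched with plain SYNTAX-RULES (assumed
   available); QUOTED is a name added only for illustration:

```scheme
;; Hypothetical macro: treats its operand as data, not as code.
(define-syntax block
  (syntax-rules ()
    ((_ body) (quote body))))

(define quoted
  (let ((return (lambda (n) (* 2 n))))
    (block
      (+ 100 (return 3)))))

;; Nothing was evaluated; RETURN here is just a symbol in a list.
quoted   ; => (+ 100 (return 3))
```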

   Here we have a situation where the introduction of macros has the
   potential of complicating the scope rules that programmers use to 
   perform local reasoning about their programs.  Counterarguments to 
   this point are:

     (1) Such examples are extremely rare.

     (2) Macros that intentionally bind names should not be allowed.

   But the fact remains that in the presence of macros, the *possibility*
   that names might not have their normal lexical interpretation must
   at least be considered by the reader.

   Again, Jaffer's proposal to syntactically distinguish macros from 
   procedures would clearly delineate those regions of code where the
   usual reasoning about scoping might not apply.


3. PROCEDURES AS FIRST-CLASS OBJECTS
------------------------------------

   Scheme encourages programmers to exploit the first-class nature of 
   procedures.  Thus, procedures are commonly named, passed as arguments,
   returned as results, and stored in data structures. A disadvantage
   of macros is that they cannot be treated in this way.  For example,
   AND is commonly treated as a macro with the desugaring:

     (AND <exp1> <exp2>) => (IF <exp1> <exp2> #f)

   There are situations where it is desirable to pass AND as an argument:

       (define (accumulate combiner null-value lst)
         (if (null? lst)
             null-value
             (combiner (car lst)
                       (accumulate combiner null-value (cdr lst)))))

       (define (all-true? lst)
         (accumulate and #t lst))

   Unfortunately, this doesn't work; the AND must be encapsulated into 
   a procedure before it can be passed:

       (define (all-true? lst)
         (accumulate (lambda (x y) (and x y)) #t lst))

     This problem is not so much one of macros destroying local reasoning
   properties but rather one of verbosity and inconsistency.  Still,
   without knowing the definition of AND, a reader modifying code cannot
   safely replace (LAMBDA (X Y) (AND X Y)) by AND.  Such a local 
   modification would be valid in macro-less Scheme. 
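
   The wrapped version is also not quite the same thing as the macro.
   The sketch below, using the hypothetical names CALLS, NOISY-TRUE and
   AND-PROC, suggests how the procedural wrapper gives up the
   short-circuit behavior of AND, since a procedure's arguments are all
   evaluated before the call:

```scheme
(define calls 0)

;; Hypothetical helper: counts how many times it is evaluated.
(define (noisy-true)
  (set! calls (+ calls 1))
  #t)

;; AND wrapped as a first-class procedure.
(define and-proc (lambda (x y) (and x y)))

;; The macro short-circuits: NOISY-TRUE is never called.
(and #f (noisy-true))
(define calls-after-macro calls)    ; 0

;; The procedure evaluates both arguments before it is entered.
(and-proc #f (noisy-true))
(define calls-after-proc calls)     ; 1
```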

     Yet again, a convention flagging macro names would alleviate the
   situation.
   
4. DEBUGGING
------------

     Though macros may aid in making source code more concise, the
   macro-expanded code can often be rather unwieldy.  The expanded
   code is normally hidden from the reader, but often rears its ugly head
   during debugging. 

     Consider an example from the Mini-FX programming language 
   used in the graduate programming languages course at MIT.
   (Mini-FX is a simplified version of Dave Gifford's FX language
   implemented as a macro package on top of Scheme).  Mini-FX supports
   a powerful pattern matching construct called MATCH. Below is 
   an example where MATCH is used in the definition of a list reversal
   procedure:

      (define (reverse lst)
        (match lst
          ('() '())
          (`(,first ,@rest) (append (reverse rest) (list first)))))

   My goal here isn't to describe the semantics of MATCH, but simply
   to show that it can lead to extremely complex macro expansions.
   The above definition expands into:

      (define (reverse lst)
       (if (equal? lst '())
           '()          
           (let ((#fail-25 (lambda ()
                             (error 
                              (string-append "MINI-FX RUNTIME ERROR (This error should be caught by the typechecker!):\n" 
                                             "MATCH -- no pattern matched")
                              lst))))
             (list->sexp~ lst
               (lambda #success-arg-27
                 (if (not (= (*minifx-length* #success-arg-27) 1))
                     (*minifx-success-number-of-args-mismatch* 
                      '((cons~ first rest))
                      #success-arg-27)
                     (apply
                      (lambda (#temp-26)
                        (cons~ #temp-26
                          (lambda #success-arg-28
                            (if (not (= (*minifx-length* #success-arg-28) 2))
                                (*minifx-success-number-of-args-mismatch*
                                 '(first rest)           
                                 #success-arg-28)
                                (apply
                                 (lambda (first rest)
                                   (append (reverse rest) (list first)))   
                                 #success-arg-28)))
                          #fail-25))
                      #success-arg-27)))
               #fail-25))))

      A user who makes an error within a MATCH clause will be thrown into
   a debugger that has access to the verbose expanded code but not the concise
   unexpanded code. Here we have yet another kind of code reader facing
   locality difficulties of a different sort. In this case the expressive 
   advantages offered by syntactic abstraction have disappeared, and the reader
   is left with the job of matching up the expanded code with the appropriate
   section of the source code. Had procedural abstraction been used instead,
   this matching up process would be greatly simplified.
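
      For comparison, here is a sketch of the same procedure written
   with ordinary Scheme conditionals instead of MATCH.  This is not
   Mini-FX code; the name MY-REVERSE is hypothetical and chosen to avoid
   shadowing the built-in REVERSE.  The code a debugger would show is
   essentially the code the programmer wrote.

```scheme
(define (my-reverse lst)
  (if (null? lst)
      '()
      ;; FIRST and REST play the roles of the pattern variables
      ;; in the MATCH clause.
      (let ((first (car lst))
            (rest (cdr lst)))
        (append (my-reverse rest) (list first)))))

(my-reverse '(1 2 3))   ; => (3 2 1)
```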

      This problem seems less intrinsic than the others because it seems
   possible to design a "smart" debugger that would aid in the inverse
   of macro expansion (= macro contraction?).  Nevertheless, in Scheme
   systems I have seen, the above problem is a very real one for the
   reader-as-debugger.

      Note that this problem is due to the very nature of macros; Jaffer's
   syntactic distinction scheme will not help here.


DISCUSSION
----------

  Please note that I am *not* claiming that Scheme without macros is
inherently readable.  Such a claim is absurd, because it is possible
to write bad programs in any language.  And as Mark Friedman and
others have pointed out, there are many reasons why macro-less Scheme
can be hard to read.

  I also do not claim that introduction of macros into a program
always makes it less readable. There are many situations where a
judicious use of macros makes code more understandable by abstracting
over the particular mechanism that implements a behavior.
(Unfortunately, macros are notoriously hard to write well; macrology
is quite a black art.)

  What I *am* claiming is that macros introduce a new set of
*potential* reasoning difficulties in addition to the ones that are
already present in macro-less code.  Whether these difficulties
*actually* impair programmers' reasoning in practice is an empirical
issue.  Many of the above examples are simple and contrived.  I'd like
to hear about specific cases where people think these (or other)
issues were at play in macros hindering their reasoning.  

  Note that in three of the four points raised above, Jaffer's
syntactic distinction idea seemed well-motivated.  This conclusion is
based on my particular assumptions.  Of course, there are other
assumptions and models under which macros improve reasoning and
macro/procedure syntactic distinction unduly complicates programs.  I
entreat people to make such assumptions and models explicit in their
readability arguments, and to make liberal use of examples in illustrating
their viewpoints.