Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!asuvax!ncar!elroy.jpl.nasa.gov!sdd.hp.com!wuarchive!mit-eddie!bloom-beacon!dont-send-mail-to-path-lines
From: lyn@altdorf.ai.mit.EDU (Franklyn Turbak)
Newsgroups: comp.lang.scheme
Subject: Why macros impair readability (long)
Message-ID: <9103041928.aa21342@mc.lcs.mit.edu>
Date: 5 Mar 91 00:27:58 GMT
Sender: daemon@athena.mit.edu (Mr Background)
Organization: The Internet
Lines: 375

This is a long message on why macros impair readability. Read the abstract for a summary. I'm interested in hearing feedback, alternate models, and lots of anecdotes and examples.

- Lyn -

---------------------------------------------------------------------------

WHY MACROS IMPAIR READABILITY

Franklyn Turbak
March 4, 1991

ABSTRACT
--------

Arguments about program readability are often based on many implicit assumptions about the definitions of "reader" and "readability". Such arguments would be more compelling if (1) these definitions were made explicit and (2) the arguments were based on technical considerations in terms of concrete examples. Using an interpretation where "readable" means "supporting local reasoning about program fragments", I discuss four reasons why macros impair readability in Scheme:

  1. Applicative Order Evaluation
  2. Static Scope
  3. Procedures as First-Class Objects
  4. Debugging

I conclude with an entreaty for alternate analyses and anecdotes of how macros help/hinder code readability.

WHAT IS READABILITY?
--------------------

In the recent debate on the Scheme mailing list about the advantages and disadvantages of macros, a number of arguments were made about "readability" of code containing macros. But the notion of readability is rather slippery.

First of all, "readable" code admits many possible interpretations, including:

* Concise - free of baggage not important to the ideas being expressed.
* Understandable - matched with the reader's mental structures.
* Expressive - accurately conveys the writer's mental structures.
* Modifiable - structured to permit extensions and variations.
* Well-documented - contains helpful descriptions, comments, names.
* Verifiable - aids in the proof (by people or machines) of properties of the described process.
* Recallable - easy to remember or rederive.
* Teachable - explainable to someone else.

Second, all these interpretations depend a heck of a lot on who the reader is. Readers vary widely in background, programming skill, and purpose for reading the code. A "most readable" style is a fiction; what is clear and concise to some readers may be inscrutable to others. Many people can (and do) agree on matters of programming style; nevertheless, different styles are tuned to different kinds of readers. For example:

* A WHILE macro may be a boon to an imperative thinker and anathema to a functional thinker.

* Extensive documentation that aids some readers gets in the way of other readers who want to see more of the code in a single editor buffer.

* Conventions like thunking args to delay evaluation, implementing message-passing objects as procedures, or using continuation-passing style to achieve nonstandard control flow are clear as day to those facile with these techniques, but (1) pose difficulties to those not familiar with these devices and (2) are candidates for abstraction by those who believe such details obscure the essence of the code.

* An interpreter using concrete rather than abstract syntax is well-suited for class presentation because it is shorter and is more likely to fit in its entirety on a blackboard. On the other hand, a version with abstract syntax may be better suited to lab study, where the code readers may want to implement an alternate syntax.

* For a person who simply wants to use a given program, a description of its interface & behavior and clearly marked entry points are crucial.
  For someone attempting to extend a program, hierarchical structure and accessibility of "hooks" are important. Clear structuring of data and control flow are essential for readers who want to understand particular algorithms.

TOWARDS A MORE OBJECTIVE ANALYSIS OF READABILITY
------------------------------------------------

Given the above, it's easy to see how discussions about readability can degenerate into religious squabbles. If everyone assumes his/her own interpretation of "readable" and "reader", then people aren't really debating the same issue.

One way to improve the situation is for discussants to be more explicit about their assumptions. Another improvement would be the use of specific examples rather than vague generalities. "This particular macro improves/impairs readability because ..." is much more convincing than nebulous claims about factors enhancing or detracting from readability. But even more desirable would be arguments with a more formal or objective basis for comparison.

In light of this goal, I consider four linguistic issues to illustrate why Scheme code using macros can be less readable than Scheme code without macros. Here, I use the term "readable" to mean "easy to reason about locally", where "locally" refers to the fact that certain conclusions can be made about a code fragment without knowing the full context in which it occurs. I also assume that the program being read is a large one, so that there is a nontrivial overhead to obtaining global information, such as finding top-level definitions. Local reasoning is particularly valuable in such situations. Finally, I assume that the reader desires a detailed understanding of the code, not just a feel for its high-level structure.

1. APPLICATIVE ORDER EVALUATION
-------------------------------

A common use of macros is to simulate normal order evaluation of arguments within Scheme's applicative framework.
For example, it is possible to implement lazy pairs by the desugarings:

    (LAZY-CONS <exp1> <exp2>) => (CONS (LAMBDA () <exp1>) (LAMBDA () <exp2>))
    (LAZY-CAR <exp>)          => ((CAR <exp>))
    (LAZY-CDR <exp>)          => ((CDR <exp>))

(Both LAZY-CAR and LAZY-CDR could be procedures, but LAZY-CONS must be a macro.)

Although using macros in this way can reduce the clutter of thunks, it makes it more difficult to reason about the evaluation of expressions that appear in the argument positions of a procedure/macro call. In macro-less Scheme, for example, the expression

    (unknown (letrec ((loop (lambda () (loop))))
               (loop)))

must be nonterminating regardless of the meaning of UNKNOWN because all arguments must be evaluated before the procedure is called. But in the presence of macros, an argument expression may be evaluated zero times, so the above could return a value. Macros require the reader to use more global knowledge to understand this fragment.

Similarly, with macros an argument expression might be evaluated more than once. This can wreak havoc in the presence of side effects. In the expression

    (let ((x 0))
      (unknown (begin (set! x (+ x 1))
                      17)))

X is incremented exactly once in macro-less Scheme, but if UNKNOWN were a macro, X might be incremented any number of times, depending on the definition of UNKNOWN.

Granted, the above examples are contrived, and it is generally considered bad policy to have side-effects in argument positions. Nevertheless, the same problems can crop up in much more natural situations. The point is that local reasoning valid in a purely applicative-order language is no longer necessarily valid in the presence of macros.

Note that this problem is ameliorated by Aubrey Jaffer's suggestion of distinguishing macro names from procedure names (or macro calls from procedure calls). In that case, the usual Scheme reasoning can be used in the vast majority of the cases (procedure calls), but the potentially troublesome cases are syntactically flagged.

2.
STATIC SCOPING
-----------------

The kind of lexical reasoning enabled by Scheme's static scope can be invalidated in the presence of macros. Consider the expression:

    (let ((return (lambda (n) (* 2 n))))
      (block (+ 100 (return 3))))

If BLOCK were a procedure, then the RETURN that appears within its argument would have to refer to the multiply-by-two procedure, and the meaning of the expression would be the same as that of

    (block 106)

Of course, we couldn't say more about the meaning of the whole expression until we also knew more about the behavior of the BLOCK procedure. But it would still be possible to make a firm conclusion about the value of BLOCK's argument without any more global information.

In the presence of macros, all bets are off, since BLOCK might be a macro that (intentionally) binds the name RETURN within its scope. E.g., a desugaring for BLOCK might be:

    (BLOCK <body>) => (CALL-WITH-CURRENT-CONTINUATION (LAMBDA (RETURN) <body>))

(Note this naming issue is different from the "accidental name capture" problem associated with faulty macro implementations. Here the macro writer really wants the name RETURN to be captured.)

The BLOCK macro might even treat its entire argument as text:

    (BLOCK <body>) => (QUOTE <body>)

Here, the name RETURN is just a symbol, and not a variable reference after all.

So the introduction of macros has the potential to complicate the scope rules that programmers use to perform local reasoning about their programs. Counterarguments to this point are:

  (1) Such examples are extremely rare.
  (2) Macros that intentionally bind names should not be allowed.

But the fact remains that in the presence of macros, the *possibility* that names might not have their normal lexical interpretation must at least be considered by the reader.

Again, Jaffer's proposal to syntactically distinguish macros from procedures would clearly delineate those regions of code where the usual reasoning about scoping might not apply.

3.
PROCEDURES AS FIRST-CLASS OBJECTS
------------------------------------

Scheme encourages programmers to exploit the first-class nature of procedures. Thus, procedures are commonly named, passed as arguments, returned as results, and stored in data structures.

A disadvantage of macros is that they cannot be treated in this way. For example, AND is commonly treated as a macro with the desugaring:

    (AND <exp1> <exp2>) => (IF <exp1> <exp2> #f)

There are situations where it is desirable to pass AND as an argument:

    (define (accumulate combiner null-value lst)
      (if (null? lst)
          null-value
          (combiner (car lst)
                    (accumulate combiner null-value (cdr lst)))))

    (define (all-true? lst)
      (accumulate and #t lst))

Unfortunately, this doesn't work; the AND must be encapsulated into a procedure before it can be passed:

    (define (all-true? lst)
      (accumulate (lambda (x y) (and x y)) #t lst))

This problem is not so much one of macros destroying local reasoning properties but rather one of verbosity and inconsistency. Still, without knowing the definition of AND, a reader modifying code cannot safely replace (LAMBDA (X Y) (AND X Y)) by AND. Such a local modification would be valid in macro-less Scheme.

Yet again, a convention flagging macro names would alleviate the situation.

4. DEBUGGING
------------

Though macros may aid in making source code more concise, the macro-expanded code can often be rather unwieldy. The expanded code is normally hidden from the reader, but often rears its ugly head during debugging.

Consider an example from the Mini-FX programming language used in the graduate programming languages course at MIT. (Mini-FX is a simplified version of Dave Gifford's FX language implemented as a macro package on top of Scheme.) Mini-FX supports a powerful pattern matching construct called MATCH.
Below is an example where MATCH is used in the definition of a list reversal procedure:

    (define (reverse lst)
      (match lst
        ('() '())
        (`(,first ,@rest) (append (reverse rest) (list first)))))

My goal here isn't to describe the semantics of MATCH, but simply to show that it can lead to extremely complex macro expansions. The above definition expands into:

    (define (reverse lst)
      (if (equal? lst '())
          '()
          (let ((#fail-25
                 (lambda ()
                   (error
                    (string-append
                     "MINI-FX RUNTIME ERROR (This error should be caught by the typechecker!):\n"
                     "MATCH -- no pattern matched")
                    lst))))
            (list->sexp~
             lst
             (lambda #success-arg-27
               (if (not (= (*minifx-length* #success-arg-27) 1))
                   (*minifx-success-number-of-args-mismatch*
                    '((cons~ first rest))
                    #success-arg-27)
                   (apply
                    (lambda (#temp-26)
                      (cons~
                       #temp-26
                       (lambda #success-arg-28
                         (if (not (= (*minifx-length* #success-arg-28) 2))
                             (*minifx-success-number-of-args-mismatch*
                              '(first rest)
                              #success-arg-28)
                             (apply
                              (lambda (first rest)
                                (append (reverse rest) (list first)))
                              #success-arg-28)))
                       #fail-25))
                    #success-arg-27)))
             #fail-25))))

A user who makes an error within a MATCH clause will be thrown into a debugger that has access to the verbose expanded code but not the concise unexpanded code. Here we have yet another kind of code reader facing locality difficulties of a different sort. In this case the expressive advantages offered by syntactic abstraction have disappeared, and the reader is left with the job of matching up the expanded code with the appropriate section of the source code. Had procedural abstraction been used instead, this matching-up process would be greatly simplified.

This problem seems less intrinsic than the others because it seems possible to design a "smart" debugger that would aid in the inverse of macro expansion (= macro contraction?). Nevertheless, in Scheme systems I have seen, the above problem is a very real one for the reader-as-debugger.
Note that this problem is due to the very nature of macros; Jaffer's syntactic distinction scheme will not help here.

DISCUSSION
----------

Please note that I am *not* claiming that Scheme without macros is inherently readable. Such a claim is absurd, because it is possible to write bad programs in any language. And as Mark Friedman and others have pointed out, there are many reasons why macro-less Scheme can be hard to read.

I also do not claim that the introduction of macros into a program always makes it less readable. There are many situations where a judicious use of macros makes code more understandable by abstracting over the particular mechanism that implements a behavior. (Unfortunately, macros are notoriously hard to write well; macrology is quite a black art.)

What I *am* claiming is that macros introduce a new set of *potential* reasoning difficulties in addition to the ones that are already present in macro-less code. Whether these difficulties *actually* impair programmers' reasoning in practice is an empirical issue. Many of the above examples are simple and contrived. I'd like to hear about specific cases where people think these (or other) issues were at play in macros hindering their reasoning.

Note that in three of the four points raised above, Jaffer's syntactic distinction idea seemed well-motivated. This conclusion is based on my particular assumptions. Of course, there are other assumptions and models under which macros improve reasoning and a macro/procedure syntactic distinction unduly complicates programs. I entreat people to make such assumptions and models explicit in their readability arguments, and to make liberal use of examples in illustrating their viewpoints.
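
As one concrete instance of the multiple-evaluation problem from section 1, here is a minimal sketch. It assumes a Scheme that provides DEFINE-SYNTAX and SYNTAX-RULES, which not every implementation does. The names COUNT, BUMP!, TWICE-PROC, and TWICE-MACRO are hypothetical, made up purely for illustration.

```scheme
;; Hypothetical example: COUNT, BUMP!, TWICE-PROC, and TWICE-MACRO are
;; illustrative names only. Assumes SYNTAX-RULES is available.

(define count 0)

;; A side-effecting argument expression: bumps COUNT and returns 17.
(define (bump!)
  (set! count (+ count 1))
  17)

;; A procedure: its argument is evaluated exactly once, before the call.
(define (twice-proc x)
  (+ x x))

;; A macro with the same surface syntax: the argument expression is
;; copied into the expansion twice, so it is evaluated twice.
(define-syntax twice-macro
  (syntax-rules ()
    ((_ e) (+ e e))))

(set! count 0)
(twice-proc (bump!))       ; BUMP! runs once
(display count)            ; prints 1
(newline)

(set! count 0)
(twice-macro (bump!))      ; BUMP! runs twice
(display count)            ; prints 2
(newline)
```

Both calls return 34, and nothing at either call site distinguishes the procedure from the macro; only the global definitions tell the reader how many times COUNT is incremented.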