Path: utzoo!attcan!uunet!spool2.mu.edu!mips!pacbell.com!ucsd!sdcc6!beowulf!djohnson
From: djohnson@beowulf.ucsd.edu (Darin Johnson)
Newsgroups: comp.lang.functional
Subject: Re: "Off-side rule"
Message-ID: <15576@sdcc6.ucsd.edu>
Date: 13 Jan 91 21:24:00 GMT
References: <1991Jan11.100048.3121@odin.diku.dk> <27854.27905aa5@kuhub.cc.ukans.edu>
Sender: news@sdcc6.ucsd.edu
Organization: CSE Dept., UC San Diego
Lines: 35
Nntp-Posting-Host: beowulf.ucsd.edu

>In article , acha@CS.CMU.EDU (Anurag Acharya) writes:
>:
>: What is the justification for this "off-side" rule ? The idea of whitespace
>: having semantics is a potential source of inscrutable bugs and, frankly
>: speaking, seems to go against the grain of modern programming language
>: design. The concrete syntax of such a language would no longer be
>: context-free,
>: let alone LR(1)/LL(1). In fact, I am hard pressed to conceptualize an
>: efficient tokenizing algorithm for such languages.

When you look at how strict and unforgiving Occam is towards spacing,
the off-side rule is rather benign. However, there is one major
"gotcha" in both languages - tabs count as one character. And
unfortunately, back when I used vi, tabs would be inserted automatically
and occam would give some meaningless error message that took hours
to figure out.

As far as justifying this, it adds to readability. One of the "goals"
of functional programming languages is to be able to have programs look
like formal mathematical functions. Block constructs detract from
the readability somewhat, especially if you must add begin/end because
{}, (), etc. are already used. Probably a big reason is that it looks
nice esthetically, without lots of filler tokens and symbols.

Parsing isn't that bad at all. In fact, the lexical analyzer can
handle it all, and the parser need know nothing about spacing.
The method I used in a project was to insert a "begin" token whenever
the beginning of a block was found (like just after =), and also push
the current position in the line onto a stack. Then, whenever reading
a new line, if the column of the first symbol was less than the position
saved on the top of the stack, "end" tokens were inserted into the
token stream. It was very simple to add, and the parser always saw
begin/end markers and was kept happy. [Of course, you have to go to
reading line by line, but this was done anyway so error messages had
line numbers.]
--
Darin Johnson
djohnson@ucsd.edu
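
For anyone who wants to see the begin/end trick above spelled out, here is a rough sketch in Python. It is not the code from the project mentioned in the post, only one way the description could be read. The name layout_tokens, the whitespace-only tokenizing, and the choice to push the column of the first symbol after "=" are all assumptions made for illustration. Each saved column that gets popped yields one "end", and any blocks still open at end of input are closed there.

```python
import re

def layout_tokens(source, opener="="):
    """Return the token list with explicit "begin"/"end" markers inserted.

    Hypothetical sketch: tokens are just whitespace-separated words, and
    a tab counts as one column (the "gotcha" mentioned in the post).
    """
    tokens = []
    stack = []       # columns at which the currently open blocks started
    pending = False  # True right after an opener: next symbol starts a block
    for line in source.splitlines():   # read line by line
        first = True
        for m in re.finditer(r"\S+", line):
            col = m.start()
            if pending:
                # First symbol of the new block: remember its column.
                stack.append(col)
                pending = False
            elif first:
                # First symbol on a new line: close every block whose
                # saved column lies to the right of this symbol.
                while stack and col < stack[-1]:
                    stack.pop()
                    tokens.append("end")
            first = False
            tokens.append(m.group())
            if m.group() == opener:
                tokens.append("begin")
                pending = True
    # End of input closes whatever is still open.
    tokens.extend(["end"] * (len(stack) + (1 if pending else 0)))
    return tokens

example = "f x = g y\n      h z\nk = 1\n"
print(layout_tokens(example))
```

A parser fed this list sees only ordinary begin/end brackets, so its grammar can stay context-free; all the column bookkeeping lives in the stack inside the lexer.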