Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!samsung!usc!snorkelwacker!spdcc!ima!esegue!compilers-sender
From: worley@compass.com (Dale Worley)
Newsgroups: comp.compilers
Subject: Obsession with lexical and syntactic issues
Message-ID: <1989Nov15.193343.2017@esegue.segue.boston.ma.us>
Date: 15 Nov 89 19:33:43 GMT
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: worley@compass.com (Dale Worley)
Organization: Compilers Central
Lines: 54
Approved: compilers@esegue.segue.boston.ma.us

> Who >really< cares about syntax anyway?

Well, the customer, for one. Syntax-related issues in compilers "in the
real world" are *not* trivial. For instance:

- Constant revisions/changes to the grammar

    This is where tools that turn a grammar into a parser win big over
    hand-coded parsers -- the language *is* going to go through 23
    revisions before the compiler goes out the door. Sure, it started
    out as "ANSI Standard", but the customer would like a couple of
    extra features...

- Verifying that the new language features don't introduce ambiguities
  into the language

    Unless you design your language with an eye to making all
    constructions *obviously* different, you will introduce
    ambiguities. I've been involved with compilers for C, Algol 68,
    and a Cobol-like business application language, and have seen
    enough of Fortran and PL/1, to know that this sort of problem is
    always biting you.

- Automatically producing good error recovery from syntax errors

    This is clearly a major research area, and it is important in any
    compiler that someone is actually going to use to write programs.
    Even how error messages "ought" to be presented is still an open
    question.

- Language designers steadfastly refuse to make LALR(1) languages

    I've never yet seen a major programming language that was truly
    LALR(1). And usually the points where they depart from LALR(1) are
    seriously ugly -- consider the problem of tagging all the typedef
    names in a C program in a truly ANSI Standard way.

- Building a lex/parse system that will Do What I Mean

    Lexer and parser specifications are *still* more complicated than
    they "ought to be". Anything that can be done to put more wisdom
    in the generator will provide immediate payoff.

Another reason that this is still an important area of study is that
we write many parsers, whereas, at least in theory, a good global
optimizer should be usable with little change in compilers for many
languages.

Dale Worley		Compass, Inc.			worley@compass.com
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.
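
To make the typedef point in the LALR(1) item concrete, here is a minimal sketch of why C cannot be parsed from a context-free grammar alone: the statement `T * x;` is a declaration if `T` names a type and an expression otherwise, so the front end has to consult a table of typedef names. Everything in it (`tokenize`, `classify_statements`, `BUILTIN_TYPES`) is a hypothetical name invented for illustration, not taken from any real compiler. It deliberately ignores scopes, which is where the "truly ANSI Standard" difficulty lies, since a typedef name can be redeclared as an ordinary identifier in an inner scope.

```python
import re

# Identifiers, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|.)")

# Assumption: a small, incomplete set of C type keywords, enough for the demo.
BUILTIN_TYPES = frozenset(
    {"int", "char", "float", "double", "long", "short", "void"}
)


def tokenize(source):
    """Split source text into a flat list of token strings."""
    return TOKEN_RE.findall(source.strip())


def classify_statements(source):
    """Label each ';'-terminated statement using a table of typedef names.

    Simplifications (assumptions of this sketch): one flat scope, and a
    typedef's new name is taken to be the last token before the ';'.
    """
    typedef_names = set()
    results = []
    statement = []
    for tok in tokenize(source):
        if tok != ";":
            statement.append(tok)
            continue
        if statement and statement[0] == "typedef":
            # Record the new type name so later statements can see it.
            typedef_names.add(statement[-1])
            results.append("typedef")
        elif statement and (
            statement[0] in typedef_names or statement[0] in BUILTIN_TYPES
        ):
            results.append("declaration")
        else:
            results.append("expression")
        statement = []
    return results


if __name__ == "__main__":
    # Same tokens "T * x ;" in both inputs, different meaning.
    print(classify_statements("typedef int T; T * x;"))
    print(classify_statements("int T; T * x;"))
```

The two calls end with the identical token sequence `T * x ;`, yet the first is classified as a declaration of a pointer and the second as a multiplication. The only difference is what the table recorded earlier, which is exactly the kind of feedback from semantic state into the lexer or parser that a pure LALR(1) grammar cannot express.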