Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!emory!att!att!cbnewsk!pegasus!hansen
From: hansen@pegasus.att.com (Tony L. Hansen)
Newsgroups: comp.lang.c++
Subject: Re: C++ Grammar
Summary: need symbol table, not change in architecture
Message-ID: <1991Feb23.180429.1642@cbnewsk.att.com>
Date: 23 Feb 91 18:04:29 GMT
References: <70609@microsoft.UUCP> <112@shasta.Stanford.EDU> <11373@pasteur.Berkeley.EDU>
Sender: hansen@cbnewsk.att.com (tony.l.hansen)
Organization: AT&T Bell Laboratories
Lines: 42

<< From: shap@shasta.Stanford.EDU (shap; Jonathan)
<< In article <112@shasta.Stanford.EDU>, shap@shasta.Stanford.EDU (shap) writes:
<< The problem with Roskind's grammar (at least the last time I looked -
<< it may have been updated in the past few months) is that it requires
<< the lexer to resolve "typedef" v/s "identifier", which is probably the
<< most difficult problem in the language as far as parsing goes. If
<< anyone has built a tool that overcomes this limitation, including the
<< associated symtab support, I would be interested to see it.

< From: jbuck@galileo.berkeley.edu (Joe Buck)
< I've seen this point raised for years and years and years, and I've
< never understood it (mainly about typedefs in C, but it applies to
< C++). Why do people object so strenously to the parser asking the
< lexer for help when distinguishing identifiers from types? The
< implementation is obvious, it's clean, it's easy to do. Once a token
< has been declared as a type, mark it as such in the symbol table, so
< the next time the token appears, the lexer returns the token indicating
< that it's a type, not an identifier. It's simple, it's clean, and it
< works. But yet there is a faction that screams about the "impurity" of
< it. Why?
<
< Actually, I doubt if C++ can be parsed without using this trick, with
< anyone's grammar.

I think what Jonathan is complaining about is NOT the difficulty of
parsing C++, nor the fact that a smart lexer and symbol table are
necessary, but that no one has posted one which will work with the
grammar.

In addition, the job is NOT as simple as you imply, because you have to
worry about the various scopes that come into play. The name X may be a
type name or it may be a variable name, depending entirely on context.
Once X has been declared as a type name, you can't just tag it as a type
name from then on; you have to know the context in which X is being used
before deciding which it is.

The job is even messier now because of nested classes. You essentially
have to have multiple symbol tables which are linked together: you start
at the innermost symbol table and work your way out. Even that
description is not complete; the ANSI C++ committee itself is having
difficulty coming up with a set of rules which accurately describe how
to search for a given symbol.

					Tony Hansen
				att!pegasus!hansen, attmail!tony
				hansen@pegasus.att.com
				tony@attmail.com
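
P.S. To make the "linked symbol tables" idea concrete, here is a minimal
sketch of the innermost-to-outermost search described above. It is
written in present-day C++ and rests on assumptions: the names Scope,
NameKind, declare and classify are hypothetical and do not come from
Roskind's grammar or from any posted tool. It covers only plain nested
scopes; it does not attempt the full C++ lookup rules (nested classes,
base classes and so on), which are precisely the part that is hard to
pin down.

```cpp
#include <map>
#include <string>

// Hypothetical classification the lexer could attach to a name token.
enum class NameKind { Unknown, TypeName, Identifier };

// One symbol table per scope, linked to the table of the enclosing scope.
struct Scope {
    const Scope* parent;                     // nullptr for the outermost scope
    std::map<std::string, NameKind> names;   // declarations made in this scope

    explicit Scope(const Scope* enclosing = nullptr) : parent(enclosing) {}

    // Record (or redeclare) a name in this scope only.
    void declare(const std::string& name, NameKind kind) { names[name] = kind; }
};

// Start at the innermost table and work outwards; the first hit wins,
// so an inner declaration hides an outer one with the same name.
NameKind classify(const Scope* innermost, const std::string& name) {
    for (const Scope* s = innermost; s != nullptr; s = s->parent) {
        auto it = s->names.find(name);
        if (it != s->names.end()) {
            return it->second;
        }
    }
    return NameKind::Unknown;  // not declared in any enclosing scope
}
```

With this arrangement, declaring X as a type in an outer Scope and then
as a variable in an inner Scope makes classify report Identifier when
asked from the inner scope and TypeName when asked from the outer one,
which is the context dependence that a single global "is a type" flag
cannot express.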