Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!samsung!gem.mps.ohio-state.edu!apple!bbn!clsib21!lpi!david
From: david@lpi.UUCP (David Michaels)
Newsgroups: comp.lang.c++
Subject: C++ syntactic ambiguity (long/tedious)
Message-ID: <405@lpi.UUCP>
Date: 9 Nov 89 21:23:32 GMT
Organization: Language Processors Inc., Framingham MA
Lines: 88

-----------------------------------------------
The following is probably only of interest to
C++ language-lawyer/compiler-writer types.
-----------------------------------------------

Besides the well-known and well-documented declaration-statement vs.
expression-statement ambiguity (which requires arbitrary compiler
look-ahead), there is another, related function-declarator vs.
object-initializer ambiguity. Assuming that "T" is a class type,
consider the following declaration:

        T a (long (x));

Here, "a" could conceivably be either of the following:

1. A function returning type "T" and taking one argument of type
   "long"; in this case the "x" is a dummy parameter name enclosed
   in (redundant) parentheses.

2. An object of type "T" being initialized (via a class constructor)
   with the value of the (function-style cast) expression "long (x)".

AT&T C++ (2.0) seems to choose #2. This choice seems to be based on
a new prohibition of redundant parentheses around declarators, one
that is incompatible with both old C and ANSI C; the simple
declaration "int (x);" (inside a function definition) has been
rendered illegal.

In "The C++ Answer Book" by Tony Hansen (on page 522 in appendix A),
there is a statement that superfluous parentheses in declarations,
while legal in ANSI C, are *not* permitted in C++. But in "C++: From
Research to Practice" by S.B. Lippman and B.E. Moo (in the 1988
USENIX C++ Conference Proceedings) there is an indication that C++
*does* allow extraneous parentheses in declarations.
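
For concreteness, the two readings of "T a (long (x));" listed above
can be written out as a small test. This is only a sketch: it assumes
a compiler that follows the later ISO C++ disambiguation rule (prefer
the declaration), which need not match what AT&T C++ 2.0 does. The
class body, the value 41, and the names "b", "c", and
"check_readings" are hypothetical and serve only as illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class type "T" with a constructor taking a long.
struct T {
    long value;
    T(long v) : value(v) {}
};

long x = 41;

// Ambiguous form from the post. Under the assumed rule this is
// reading #1: a function "a" taking one long parameter named "x"
// (with redundant parentheses) and returning T.
T a(long(x));

// An extra pair of parentheses cannot be part of a parameter
// declaration, so this is reading #2: an object initialized
// from the function-style cast long(x).
T b((long(x)));

// The old-style cast is likewise unambiguous: an object.
T c((long)x);

static_assert(std::is_function<decltype(a)>::value,
              "a is a function under the assumed rule");
static_assert(std::is_same<decltype(b), T>::value, "b is an object");
static_assert(std::is_same<decltype(c), T>::value, "c is an object");

// Run-time check that both objects were built from x.
void check_readings() {
    assert(b.value == 41);
    assert(c.value == 41);
}
```

Note that "a" is only declared here, never defined or called, so the
sketch links without a function body.
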
In addition, in "The Evolution of C++: 1985 to 1987" by Bjarne
Stroustrup, there is an assertion that redundant parentheses are
*illegal* in declarations, but in the follow-up "The Evolution of
C++: 1985 to 1989", that assertion seems to have been removed.

I spoke very briefly with Andrew Koenig about this at the recent
"C++ at Work-'89" Conference; he said that he wasn't at that moment
entirely sure what was currently implemented, but he thought that #1
in the example above should be chosen, and seemed to be fairly
certain that it was *not* intended that redundant parentheses in
declarators be disallowed.

If redundant parentheses in declarators should indeed be permitted
in C++ (I think they should), and if the disambiguation rule is
indeed to choose a function declarator over a class initializer in
an ambiguous construct (i.e. similar to the way in which a
declaration-statement is chosen over an expression-statement), then
I have the following questions/comments.

1. This rule seems fine, except that it doesn't seem to yield the
   most expected (least surprising) behavior. Just looking at the
   example above, you would probably pick interpretation #2, since
   people don't normally put redundant parentheses around
   declarators; yet under this rule, if you really wanted
   interpretation #2 you would have to do something special, like
   surrounding "long (x)" in parentheses or using the old-style cast
   construct. Perhaps redundant parentheses in *parameter*
   declarators should be disallowed after all?

2. I assume that (as with the declaration-statement vs.
   expression-statement ambiguity) the disambiguation is purely
   syntactic; that is, the meaning of identifiers (beyond whether
   they are type-names or not) is not to be considered during
   disambiguation. In particular, the disambiguation will not
   consider whether or not the declaration occurs within function
   scope, whether or not the type has a constructor, or whether or
   not the type is even a class type.

3.
How smart/thorough should the disambiguation be? Consider this case:

        T a (long (x), long (x+1), long (x));

If we just look ahead at the *first* parameter-declaration/
constructor-argument, then since it *could* be a
parameter-declaration we would assume we are looking at a function
declarator; we would (begin to) parse "a" as a function returning
type "T" etc., and get a syntax error when looking at the second
parameter-declaration. If, however, we look at *all* of the
parameter-declarations/constructor-arguments, we would interpret "a"
as an object of type "T" being initialized by three arguments (all
of which are function-style cast expressions); this is more
desirable, I think.

Just as a quality-of-implementation issue, the look-ahead process
should probably terminate as soon as an unambiguous
parameter-declaration is found (in which case we'd disambiguate to a
function declarator) or as soon as an unambiguous expression (or
illegal declaration) is found (in which case we'd disambiguate to a
class initializer).

Phew, maybe I got a little carried away, but we'd like to get this
right. Can anyone clear this up further? Thanks.

- David Michaels (david@lpi.uucp)
  Language Processors, Inc. (LPI)
  Framingham, MA 01701-4613
  (508) 626-0006
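
The look-ahead policy suggested in item 3 above can be sketched as a
left-to-right scan that stops at the first decisive item. This is a
minimal illustration, not a description of any existing compiler. It
assumes that each item between the parentheses has already been
classified by a purely syntactic pass, and that an all-ambiguous list
falls back to the function declarator (the rule assumed in the post).
The names "Item", "Reading", "disambiguate", and "check_example" are
hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical syntactic classification of one item in the list.
enum class Item {
    Ambiguous,        // declaration or expression, e.g. long (x)
    DeclarationOnly,  // only a parameter-declaration, e.g. long x
    ExpressionOnly    // only an expression, or an illegal
                      // declaration, e.g. long (x+1)
};

enum class Reading { FunctionDeclarator, ObjectInitializer };

// Scan the items in order and stop at the first one that is not
// ambiguous. If none is decisive, prefer the function declarator.
Reading disambiguate(const std::vector<Item>& items) {
    for (Item item : items) {
        if (item == Item::DeclarationOnly)
            return Reading::FunctionDeclarator;
        if (item == Item::ExpressionOnly)
            return Reading::ObjectInitializer;
    }
    return Reading::FunctionDeclarator;
}

// The three-argument example: long (x), long (x+1), long (x).
void check_example() {
    std::vector<Item> args = {Item::Ambiguous,
                              Item::ExpressionOnly,
                              Item::Ambiguous};
    assert(disambiguate(args) == Reading::ObjectInitializer);
}
```

With this policy the scan of the three-argument example stops at the
second item, "long (x+1)", and never needs to examine the third.
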