Path: utzoo!utgpu!attcan!uunet!mcvax!hp4nl!botter!star.cs.vu.nl!leendert@cs.vu.nl
From: leendert@cs.vu.nl (Leendert van Doorn)
Newsgroups: comp.std.c
Subject: Re: ANSI C token set (including $ and @)
Keywords: ANSI C, token set
Message-ID: <1858@zell.cs.vu.nl>
Date: 5 Jan 89 12:21:07 GMT
References: <11343@haddock.ima.isc.com>
Sender: leendert@cs.vu.nl
Reply-To: leendert@cs.vu.nl ()
Organization: VU Informatica, Amsterdam
Lines: 67

The following comments are based on the X3J11/88-090 (May 88) version of
the dpANS report. I'll get the latest version in a couple of days, but for
now this one will do.

In article <11343@haddock.ima.isc.com> karl@haddock.ima.isc.com writes:
> Let's see if I've got this straight yet.
>
>o `$' is required to scan as a separate pp-token, despite existing practice
>  making it an optional identifier-character.

Yes. The syntax of an identifier is (par. 3.1.2):

    identifier: nondigit | identifier nondigit | identifier digit ;
    nondigit:   "_[a-z][A-Z]"
    digit:      "0-9"

Whether the '$' should be scanned as a separate pp-token depends on the
source character set.

>o When converting pp-tokens to tokens, an implementation is free to merge
>  {foo}{$}{bar} into a single token {foo$bar}.  (I'm guessing on this one.)

No, in this conversion the '$' is a garbage character, so what you get is
{foo} {bar}. (The $ character is not part of the non-terminal identifier,
see above.)

>o But, since macro expansion happens first, it is {foo}, and not {foo$bar},
>  that is subject to macro replacement, even if the above is true.

{foo$bar} can never be subject to any macro replacement, since it's not an
identifier (see 3.8.3).

>o Hence, certain features of DEC and APOLLO implementations cannot be
>  conforming.

I don't know about DEC or APOLLO, but if they allow things like those
described above, their implementations are not strictly conforming (perhaps
there is a flag like -pedantic, as with the GNU C compiler?).

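To make the identifier grammar quoted above concrete, here is a minimal sketch in C of a scanner that follows it literally. The function names (is_nondigit, is_digit, identifier_length) are invented for this illustration and do not come from the dpANS or from any compiler. The sketch only covers identifiers; it ignores pp-numbers, string literals, and every other pp-token class.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* nondigit: '_' or one of the 52 letters of the basic source character set.
 * The letters are listed explicitly so the result does not depend on the
 * locale, as isalpha() would. (Hypothetical helper, for illustration only.) */
static int is_nondigit(char c)
{
    static const char nondigits[] =
        "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /* strchr() would also match the terminating '\0', so exclude it. */
    return c != '\0' && strchr(nondigits, c) != NULL;
}

/* digit: '0' through '9', which C guarantees to be contiguous. */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Length of the longest identifier that starts at s, or 0 if s does not
 * start with a nondigit. Mirrors the grammar:
 *   identifier: nondigit | identifier nondigit | identifier digit */
static size_t identifier_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    if (!is_nondigit(s[0]))
        return 0;
    while (is_nondigit(s[n]) || is_digit(s[n]))
        n++;
    return n;
}
```

Under this grammar, identifier_length("foo$bar") is 3: the scan stops at the '$', which is neither a nondigit nor a digit, so `foo` and `bar` come out as two separate identifiers with a stray '$' between them. An implementation that accepts '$' in identifiers as an extension would in effect add it to the nondigit list.
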
>o DEC and APOLLO, through their representatives on X3J11, are aware of the
>  above and accept it.  Their ANSI C implementations, if any, will not use
>  `$' in identifiers.

That depends on their policy. They are free to add features. Perhaps they
will provide a flag (if $ is the only nonconforming aspect).

>o Non-English letters, which are clearly not usable in a strictly conforming
>  program, are in fact not usable in *any* conforming program, for the same
>  reasons that apply to `$'.

The basic source character set, the set in which source files are written,
does not contain $, umlauts, grave accents, etc. Strings, however, may
contain these characters (depending on the size of the character
representation, you could use single-byte or multibyte character strings).

>o The international community is aware of this and accepts it.

Yep, why not?

BTW: Best wishes for 1989. "Hope it's a good one"

-- 
Leendert P. van Doorn
Vrije Universiteit / Dept. of Maths. & Comp. Sc.
De Boelelaan 1081
1081 HV Amsterdam / The Netherlands
tel. +31 20 548 5302