Newsgroups: comp.std.c
Path: utzoo!utgpu!jarvis.csri.toronto.edu!dgp.toronto.edu!hugh
From: hugh@dgp.toronto.edu ("D. Hugh Redelmeier")
Subject: Re: 0x47e+barney not considered C
Message-ID: <8807030058.AA18406@explorer.dgp.toronto.edu>
Organization: University of Toronto, CSRI
References: <120200001@hcx2>
Date: Sat, 2 Jul 88 19:38:19 EDT

In article <120200001@hcx2> tom@hcx2.UUCP points out that under the
draft ANSI standard for C, preprocessor numbers are too greedy.  He
gave the example of
        0x47e+barney
which is parsed as a preprocessor number, and then rejected when it
cannot be converted to a legitimate C token.

He is correct, and I agree with him in considering this a mistake.
I too submitted a comment on this in the last public review period,
and mine too seems to have been ignored (I have not received the
response document).

In article <10413@ulysses.homer.nj.att.com>, Jerry Schwarz says that
Tom's article insults his construct, the pp-number, and that Tom's
fix is bad.  Furthermore, Jerry thinks the problem does not warrant
a fix.

I think that the pp-number construct did clean up a mess (which I
too had been pointing out for a while).  But it can and should be
fixed to not break formerly valid and perfectly reasonable C
programs.

In article <8194@brl-smoke.ARPA>, Doug Gwyn asks:

| Why do you think it so important for "0x47e" to be considered a
| preprocessing number token?  Just what is it that needs "fixing"?
| Is it that "0x47e" is supposed to be split into preprocessing tokens
| "0" and "x47e" (the second of which may be subject to macro
| replacement!) and in translation phase 7 they are not said to be
| spliced back together into a single (regular) token, so that it is
| impossible for an integer constant "0x47e" to ever be seen after
| phase 6?
| If so, that does seem to me to be a problem, but it has
| nothing to do with "+barney" or with the final "e" on the constant;
| it's a generic problem for all hex constants (and was certainly not
| the committee's intention, so fixing this would presumably be
| considered editorial).

For me, the problem is that the +barney is absorbed into the hex
constant.  The + clearly ought to be a separate token, and so should
the barney.

Here is what I submitted to the committee in the previous round:

Page 33, line 36, Section 3.1.8: preprocessor number too greedy
(consider 0xABCDE+1)

The current rules for parsing preprocessor numbers are too greedy.
They are willing to match + or - after an e or E.  If the e came
from a floating point number, that is fine, but if it came from a
hexadecimal number, it is not.  Consider the following examples:
        0xABCDE+1
        0xABCDE+cat
        0xABCDEF+1
        0xABCDEF+cat
The first two lines used to be legitimate C expressions.  Now each
is a pp-number that cannot be turned into a valid C token.  The
second two lines were and remain legitimate expressions.

Although I think that the whole concept is wrong, it can be patched
up to solve this problem.

Proposed grammar:

        pp-number:
                integer-constant
                floating-constant
                pp-number digit
                pp-number nondigit
                pp-number .

I find this definition intuitively appealing: it reflects what is
really going on.  Others may prefer one that is simpler to
implement:

Alternate grammar:

        pp-number:
                pp-floating-constant
                pp-number digit
                pp-number nondigit
                pp-number .

        pp-floating-constant:
                digit
                . digit
                pp-floating-constant .
                pp-floating-constant digit
                pp-floating-constant e sign
                pp-floating-constant E sign

Note that in pathological cases, these differ.  Consider:
        1.1.e+5

----------------------------------------------------------------

Further notes on Doug's comments:

| P.S.  I don't think the committee was "too tired of arguing to
| do anything about it".
| More likely the review subgroup that
| tackled your comments didn't fully understand the problem.  If I've
| correctly summarized it in the previous paragraph, then try an
| argument along those lines in your re-response.

As I understand it, most committee members saw most comments for the
first time during the meeting (I am a member; I got only the early
comments in a mailing).  Since the meeting is a very busy period,
most comments could not have been read by very many committee
members, and certainly not read very carefully.

| P.P.S.  I was the only committee member who voted against sending
| out the revised draft for the third public review, on the grounds
| that there had been insufficient time allotted to study second-
| round comments before responses were required.  This may be an
| example of that.  I do think the committee did a remarkably good
| job under the [self-imposed] circumstances.

I think that you put it well, and very diplomatically (perhaps too
diplomatically).

Hugh Redelmeier
{utcsri, utzoo, yunexus, hcr}!redvax!hugh
In desperation: hugh@csri.toronto.edu
+1 416 482 8253
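
Appended illustration: the greediness described in the submitted
comment can be sketched as two small scanners.  This is only a
sketch under stated assumptions, not the committee's wording or
anyone's actual preprocessor.  The function names pp_number_draft
and pp_number_alternate are hypothetical.  The first follows my
reading of the draft rule as the comment describes it (a + or - is
absorbed after any e or E); the second follows the alternate
grammar, where a sign is accepted only while the prefix still looks
like a pp-floating-constant.  Each returns the number of characters
that would be taken as one pp-number.

```c
#include <ctype.h>
#include <stddef.h>

/* Character classes from the C grammar (helper names are illustrative). */
static int is_digit(char c)    { return isdigit((unsigned char)c) != 0; }
static int is_nondigit(char c) { return isalpha((unsigned char)c) || c == '_'; }
static int is_sign(char c)     { return c == '+' || c == '-'; }
static int is_exp(char c)      { return c == 'e' || c == 'E'; }

/* Length of the leading "digit" or ". digit"; 0 if s cannot start a pp-number. */
static size_t pp_start(const char *s)
{
    if (is_digit(s[0]))
        return 1;
    if (s[0] == '.' && is_digit(s[1]))
        return 2;
    return 0;
}

/* Assumed draft rule: after ANY e or E, a following sign is absorbed. */
size_t pp_number_draft(const char *s)
{
    size_t i = pp_start(s);
    if (i == 0)
        return 0;
    for (;;) {
        if (is_exp(s[i]) && is_sign(s[i + 1]))
            i += 2;                 /* pp-number e sign / pp-number E sign */
        else if (is_digit(s[i]) || is_nondigit(s[i]) || s[i] == '.')
            i += 1;                 /* pp-number digit / nondigit / .      */
        else
            return i;
    }
}

/* Alternate grammar: signs are allowed only inside the
   pp-floating-constant prefix (digits, '.', and e/E followed by a sign). */
size_t pp_number_alternate(const char *s)
{
    size_t i = pp_start(s);
    if (i == 0)
        return 0;
    /* Longest pp-floating-constant. */
    for (;;) {
        if (is_exp(s[i]) && is_sign(s[i + 1]))
            i += 2;
        else if (is_digit(s[i]) || s[i] == '.')
            i += 1;
        else
            break;
    }
    /* pp-number tail: digits, nondigits and '.', but no more signs. */
    while (is_digit(s[i]) || is_nondigit(s[i]) || s[i] == '.')
        i += 1;
    return i;
}
```

With these definitions, pp_number_draft("0x47e+barney") returns 12,
the whole string, while pp_number_alternate("0x47e+barney") returns
5, leaving + and barney as separate tokens.  Both return 8 for
"0xABCDEF+1" and 4 for "1e+5".  The proposed grammar (the one built
on integer-constant and floating-constant) is not sketched here
because it needs a full constant recognizer; on 1.1.e+5 it would
stop before the +, whereas the alternate scanner takes all 7
characters.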