Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!tank!mimsy!chris
From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.std.c
Subject: Re: integer value of multi-char constants
Message-ID: <20209@mimsy.umd.edu>
Date: 17 Oct 89 13:14:11 GMT
References: <29588@gumby.mips.COM> <20205@mimsy.umd.edu>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 50

>In article <29588@gumby.mips.COM> lai@mips.COM (David Lai) writes:
>>on a mips, vax, and sun the following is true:
>>      '\001\377' == '\000\377';
>>however on the same machines:
>>      '\001\177' != '\000\177';
>>The question is: does the above behaviour conform to ANSI C?

In article <20205@mimsy.umd.edu> I wrote:
>Certainly.  The more important question is `why would anyone expect
>otherwise?'

Oops, for whatever reason I read the first line as

        '\000\377' == '\000\377'

However, the results are still easily ( :-) ) explained.  '\377' is
shorthand for -1, and the compiler expands multicharacter constant
values as follows (simplified: \ processing hidden):

        case '\'':
                if ((value = nextc()) == STOP)
                        error("no characters in character constant");
                while ((c = nextc()) != STOP)
                        value = (value << 8) | c;

So '\001\377' computes as

        value = 1;                      /* \001 */
        c = -1;                         /* \377 */
        value = (1 << 8) | -1;
        c = STOP;                       /* ' */
        /* value = -1 */

while '\000\377' computes as

        value = 0;                      /* \000 */
        c = -1;                         /* \377 */
        value = (0 << 8) | -1;
        c = STOP;                       /* ' */
        /* value = -1 */

If the compiler added the values, rather than ORing them, the results
would be different (and very peculiar).  Probably the compiler should
not sign extend unless the character constant contains only a single
character.
-- 
`They were supposed to be green.'
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu        Path: uunet!mimsy!chris
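
For anyone who wants to check the arithmetic above, here is a minimal sketch of the shift-and-OR accumulation as standalone C. It is not the code of any of the compilers mentioned; the names fold_sign_extended, fold_zero_extended, widen_signed and to_int are made up for illustration. The sketch assumes 8-bit characters and works on unsigned bit patterns so that it never shifts a negative int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: widen one 8-bit value the way a signed char
 * would be widened, returned as a two's-complement bit pattern.
 * 0377 becomes all ones (-1); 0177 stays 127. */
static unsigned int widen_signed(unsigned char byte)
{
    return (byte & 0x80) ? ((unsigned int)byte | ~0xFFu) : byte;
}

/* Hypothetical helper: turn a bit pattern back into an int without
 * relying on an implementation-defined out-of-range conversion. */
static int to_int(unsigned int bits)
{
    return bits <= INT_MAX ? (int)bits : -(int)(UINT_MAX - bits) - 1;
}

/* Shift-and-OR with sign extension of every character, as in the
 * loop described in the post. */
static int fold_sign_extended(const unsigned char *bytes, size_t n)
{
    unsigned int value;
    size_t i;

    assert(n > 0);              /* "no characters in character constant" */
    value = widen_signed(bytes[0]);
    for (i = 1; i < n; i++)
        value = (value << 8) | widen_signed(bytes[i]);
    return to_int(value);
}

/* Same loop without sign extension, for comparison (assumption: this
 * is one way to read "should not sign extend" for the multi-character
 * case). */
static int fold_zero_extended(const unsigned char *bytes, size_t n)
{
    unsigned int value;
    size_t i;

    assert(n > 0);
    value = bytes[0];
    for (i = 1; i < n; i++)
        value = (value << 8) | bytes[i];
    return to_int(value);
}
```

Calling fold_sign_extended on the bytes \001\377 and on \000\377 gives -1 both times, because ORing in -1 sets every bit and wipes out the first character. With \001\177 and \000\177 the second character is positive (127), so the first character survives and the results are 383 and 127, which accounts for the inequality in the original question. fold_zero_extended gives 511 and 255 for the two \377 cases.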