Path: utzoo!attcan!uunet!auspex!guy
From: guy@auspex.UUCP (Guy Harris)
Newsgroups: comp.lang.c
Subject: Re: thanks for "down" answers
Message-ID: <685@auspex.UUCP>
Date: 12 Dec 88 17:47:12 GMT
References: <9142@smoke.BRL.MIL>
Reply-To: guy@auspex.UUCP (Guy Harris)
Organization: Auspex Systems, Santa Clara
Lines: 51

>Several respondents have pointed out that many compilers would NOT accept
>    (char_var = getchar()) != EOF
>because getchar() returns an integer, EOF may be a negative integer, and
>on many compilers char variables may not accept signed integers.

Well, actually, most compilers will accept it, which is the problem -
it'll pass the compiler without complaint, but *still* not work on
machines where "char" is unsigned. And, frankly, it may not work on
machines where "char" is signed, either; the problem is that
"getchar()", on a machine with 8-bit bytes, can return either

    1) a value in the range 0 to 255, which represents a character
       read from the standard input

or

    2) EOF, usually -1, which represents an end-of-file condition.

The intent is that EOF not be a value in the range 0 to 255 (some
implementations may give it such a value, but that merely means the
implementor didn't know what they were doing).

On a machine with unsigned "char"s, and 8-bit bytes, a "char" can have a
value in the range 0 to 255. If EOF is -1, assigning EOF to a "char" on
a 2's complement machine gives the value 255, which does not compare
equal to EOF (-1).

On a machine with signed "char"s, and 8-bit bytes, a "char" can have a
value in the range -128 to 127. If EOF is -1, assigning EOF to a "char"
gives the value -1 - but then, on a 2's complement machine, so does
assigning the value 255. This means that if you read a character from
the file with the hex value 0xFF - which is "y with a diaeresis" in ISO
Latin #1, so even in a pure text file you can have such a character - it
will look just like an EOF.

>I have entirely missed that point. This is how I was shown and taught.

Oh dear.
Sounds like the person who taught you needs a little remedial education;
could you please point out to them that assigning the result of
"getchar()" to a "char" variable is incorrect?

>I have directly asked a couple of the especially kind respondents on
>their way of handling this. If you have an unusual excellent
>suggestion I would be most glad to read about it.

There's only one valid suggestion, and that's to have the variable to
which the value of "getchar()" is assigned be of some signed integral
type larger than "char"; "int" is the best choice ("short" might work on
some implementations, possibly most implementations, but it's wisest not
to fool Mother Nature; "long" will work, but it's overkill and may be
inefficient).
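
For anyone who wants to see the difference rather than take it on faith,
here is a minimal sketch; it is not code from the original exchange. The
function names (count_bytes, count_bytes_signed_char,
eof_survives_unsigned_char) are made up for illustration, and the
comments assume 8-bit bytes, a 2's complement machine, and EOF defined
as -1. On other implementations the broken variant may misbehave in a
different way.

```c
#include <stdio.h>

/*
 * Correct idiom: hold the result of getc() in an "int", so that EOF
 * stays distinct from every value in the range 0 to 255.
 * Returns the number of bytes read before end-of-file.
 */
long count_bytes(FILE *fp)
{
    int c;      /* int, NOT char */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/*
 * Broken idiom, shown with an explicitly signed char.  Converting 255
 * to "signed char" is implementation-defined; on the assumed 2's
 * complement machine it yields -1, so a 0xFF byte in the input is
 * mistaken for EOF and the count stops early.
 */
long count_bytes_signed_char(FILE *fp)
{
    signed char c;      /* too narrow to hold both 0..255 and EOF */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/*
 * The unsigned case: once EOF has been squeezed into an
 * "unsigned char" it becomes a non-negative value (255 under the
 * assumptions above) and no longer compares equal to EOF.  A read
 * loop written this way would never terminate, so only the
 * comparison is shown.  Returns 1 if the stored value still equals
 * EOF, 0 if not.
 */
int eof_survives_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;

    return c == EOF;
}
```

Fed a three-byte file consisting of 'a', 0xFF, 'b', count_bytes()
reports all three bytes, while count_bytes_signed_char() gives up at the
0xFF under the assumptions stated above. The loop body is identical in
both; only the type of "c" differs.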