Path: utzoo!attcan!uunet!husc6!mailrus!tut.cis.ohio-state.edu!rutgers!mit-eddie!uw-beaver!uw-june!ka
From: ka@june.cs.washington.edu (Kenneth Almquist)
Newsgroups: comp.unix.wizards
Subject: Re: stdio EOF
Message-ID: <5697@june.cs.washington.edu>
Date: 11 Sep 88 10:09:13 GMT
References: <813@ms3.UUCP> <1246@mcgill-vision.UUCP> <669@super.ORG> <13427@mimsy.UUCP>
Organization: U of Washington, Computer Science, Seattle
Lines: 69

In article <13427@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <8422@smoke.ARPA> gwyn@smoke.ARPA (Doug Gwyn ) writes:
>> ... In fact [stdio] EOF should not be "sticky"; if more data becomes
>> available, as on a terminal, it should be available for subsequent
>> reading.  The 4.2BSD implementation broke this but it might be okay
>> on 4.3BSD.
>
> I thought this behaviour was added to 4.2BSD to conform to some
> existing standard.

Berkeley conform to an existing standard?  You must be kidding.

The story I read on the net a few years ago is that Berkeley made this
change to fix a problem with fread.  The problem is that the fread
documentation contradicts itself, stating both that "fread returns the
number of items actually read" and that "fread returns 0 on end of file
or error."  What should fread do when its caller requests three items,
but fread encounters an end of file after reading only two?  The first
sentence claims it should return two (the number of items read), while
the second claims it should return zero (because end of file was
encountered).

Berkeley interpreted the documentation as indicating that fread should
return two, but should then return zero on the next call.  The obvious
way to implement this would be to have fread do an ungetc on the EOF so
that the next time it was called it would immediately read an EOF and
return zero.  However, ungetc does not allow an EOF to be pushed back
onto the input.
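
To make the fread question and the ungetc limitation concrete, here is
a minimal sketch in C.  The helper names fread_count and
ungetc_rejects_eof are hypothetical, chosen only for illustration, and
the sketch assumes a current ISO C library (tmpfile, fread, ungetc); it
says nothing about what any particular 1988 stdio did.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: create a temporary file holding `have` bytes,
 * call fread `call` times asking for `want` one-byte items each time,
 * and return the count reported by the last of those calls. */
size_t fread_count(size_t have, size_t want, int call)
{
    char buf[16];
    size_t i, n = 0;
    FILE *fp = tmpfile();

    assert(fp != NULL && want <= sizeof buf && call >= 1);
    for (i = 0; i < have; i++)
        putc('x', fp);
    rewind(fp);                 /* switch the stream from writing to reading */
    while (call-- > 0)
        n = fread(buf, 1, want, fp);
    fclose(fp);
    return n;
}

/* Hypothetical helper: returns 1 if ungetc refuses to push back EOF,
 * which is what ISO C specifies (the call fails and returns EOF). */
int ungetc_rejects_eof(void)
{
    int r;
    FILE *fp = tmpfile();

    assert(fp != NULL);
    r = ungetc(EOF, fp);
    fclose(fp);
    return r == EOF;
}
```

With a two-byte file and a request for three items, fread_count(2, 3, 1)
should yield 2 and fread_count(2, 3, 2) should yield 0, which is the
"two, then zero" reading described above.  On a regular file the second
call sees end of file again on its own; the pushback question only
matters for a source such as a terminal, where more data can arrive
after an EOF.  ungetc_rejects_eof() should yield 1.
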
This deficiency of ungetc is (in my view) the biggest flaw in the
design of the stdio library, and it makes it impossible to implement
scanf correctly, so Berkeley would have done the world a favor by
extending the stdio library to allow EOF to be pushed back.  Instead,
they chose a simpler approach:  make getc always return EOF when the
eof or error flags are set.  This approach allowed them to fix the
fread problem by writing only a couple of lines of code, but it also
broke getc.

In 4.2 BSD the behavior of getc is a bug since it disagrees with the
documentation.  In 4.3 BSD, Berkeley modified the documentation to
agree with the code.  ("It's not a bug, it's a feature!")

By the way, AT&T also noticed the contradiction in the fread
documentation.  They fixed the documentation so that it clearly
reflected the behavior of the code.  This seems like a better approach
since modifying the code to agree with the documentation doesn't make
much sense when the meaning of the documentation is so unclear.  In any
case, AT&T's approach, unlike Berkeley's, didn't break working code.

> What does the dpANS say?  POSIX?

I don't know, and how they resolve this issue is less important than
that the issue is resolved.  The standard I/O library is supposed to be
*standard*; that's the whole point of it.  There are, however, several
reasons why they should prefer Dennis Ritchie's original definition of
getc over Berkeley's:

1.  Ritchie's definition has seniority.  Berkeley's gratuitous change
    to getc was not made until 4.2 BSD and was not documented until
    4.3 BSD.  All other versions of UN*X use Ritchie's definition.

2.  Aesthetics.  Ritchie's definition can be stated in seven words:
    Return EOF when at end of file.

3.  Authority.  If anyone's opinion should be respected when setting
    UN*X standards, Ritchie's should be.

				Kenneth Almquist

-- And there shall come among you false prophets, who will corrupt my
teachings and teach that EOF should be sticky....