Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!think.com!mintaka!bloom-picayune.mit.edu!news
From: scs@adam.mit.edu (Steve Summit)
Newsgroups: comp.lang.c
Subject: Why stderr should be unbuffered
Message-ID: <1990Dec6.055052.10616@athena.mit.edu>
Date: 6 Dec 90 05:50:52 GMT
References: <1990Dec6.050817.9741@athena.mit.edu>
Sender: news@athena.mit.edu (News system)
Reply-To: scs@adam.mit.edu (Steve Summit)
Organization: Thermal Technologies, Inc.
Lines: 84

In article <1990Dec6.050817.9741@athena.mit.edu>, I wrote:
>In article peter@ficc.ferranti.com (Peter da Silva) writes:
>>In every case where I've got the source to a program and have figured out
>>where something like that came from it's been one of the two following
>>cases:
>>      if(!(fp = fopen(...))) {
>>              fprintf(stderr, "%s: ", argv[0]);
>>              perror(...);
>>      }
>>(where the fprintf stomped on errno when deciding what buffering to do) or:
>[other case deleted]
>
>This is a highly unfortunate case.  I believe that stderr should
>never be buffered; I'll discuss the tradeoffs in another article.

Once upon a time, stderr was always unbuffered, and stdout was
always buffered.  Enough new users, who didn't know about (or
were too lazy to use) fflush, complained about missing output
that line buffering was invented, and enabled if stdout was a
tty.  This may already have been a step in the wrong direction.
Line buffering merely coddles people into thinking they don't
need to worry about fflush, and indeed code _usually_ works
without it.  Yet the implicit buffering decisions are
occasionally incorrect (e.g. when stdout is a pipe), so fflush is
still a good idea, even though people are now even less likely to
use it.

Then someone had the brilliant idea to start buffering stderr.
(Presumably it was important to speed the delivery of voluminous
error messages.)
This solved a non-problem at considerable risk: a user may now
miss an important error message (or have it delayed) because of
an inappropriate stderr buffering decision.  (To be sure, ANSI
X3.159 says that stderr must be at most line buffered, so
messages may be delayed only if they do not end in newlines.)

If inefficient voluminous stderr output were truly a problem, a
better solution would have been to fix the offending voluminous
error message generators.  A program which can be expected to
generate voluminous output to stderr on most runs (examples I
know of are rcs, tar -v, cc, and lint) should be rewritten to
place those messages on stdout.  (It might seem inappropriate to
place error messages on stdout, since the whole point of stderr
is to keep error messages from disappearing down a stdout
pipeline, but note that each of the programs I mentioned
generates no output to stdout except in exceptional cases.  For
cc and lint, it can be argued that the error messages are simple
output, and _should_ be placed on stdout, to make piping to
filters or line printers easier, especially for unfortunate csh
users.  Indeed, many cc's probably do this already, and I'm
fairly sure that lint does.)

Since there may always be programs generating voluminous output
to stderr, and since even moderate amounts of unbuffered output
can be debilitating on heavily-loaded timesharing systems with
expensive system calls, it may still be desirable to reduce
character-at-a-time error output.  This can be done
transparently, without buffering stderr and without requiring any
effort by programmers to ensure correct output, with only a
little work on the stdio implementor's part.

A quick-and-dirty solution is to special-case stderr inside, say,
fprintf, and temporarily buffer it.  This causes (more)
reentrancy problems, and required kludgey special handling for
longjmp in one system under which I saw it implemented.
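
The flavor of "temporarily buffer, then flush at once" can be
sketched at the application level, outside stdio.  This is only an
analogue of the special case described above, not the in-library
version: fd_printf is a hypothetical helper, the 1024-byte limit is
an arbitrary assumption, and it relies on POSIX write(2) and on
vsnprintf, which postdates ANSI X3.159.  Because the buffer is a
local array, this particular sketch does not share the reentrancy
trouble of buffering the stream itself.

```c
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any stdio): format the whole
 * message into a temporary local buffer, then hand it to the
 * descriptor with a single write(2) instead of one call per character.
 * Returns the number of bytes written, or -1 on error.
 */
int fd_printf(int fd, const char *fmt, ...)
{
    char buf[1024];             /* assumed upper bound on message size */
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (n < 0)
        return -1;              /* formatting failed */
    if ((size_t)n >= sizeof buf)
        n = (int)sizeof buf - 1;    /* message was truncated to fit */

    /* A short write is treated as an error in this sketch. */
    if (write(fd, buf, (size_t)n) != (ssize_t)n)
        return -1;
    return n;
}
```

A call such as fd_printf(2, "%s: cannot open %s\n", argv[0], path)
would then reach descriptor 2 (where stderr normally points) as one
system call, with nothing left sitting in a buffer afterwards.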

A better solution is to sprinkle checks throughout the stdio code
wherever multiple characters might be available at once, so that
putc's which translate to character-at-a-time write(2)'s would be
pointlessly expensive.  I know of six such places: puts, fputs,
fwrite, and three places within the printf common code (non-%
format text, strings pulled in with %s, and all other formatted
strings, which are typically constructed in a common buffer and
printed by common code).

To be sure, special cases sprinkled throughout code (not even
nicely centralized) are generally undesirable, but stdio is
widely-used and important enough that uglifications for
performance reasons (can the eternal efficiency hack denouncer
really be saying this? :-) ) are justifiable.

I implemented such speedups for unbuffered I/O in my own stdio
package and it was even less work than I thought it would be.

                                            Steve Summit
                                            scs@adam.mit.edu