Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utcs!mnetor!seismo!rochester!stuart
From: stuart@rochester.UUCP
Newsgroups: net.unix-wizards
Subject: Re: Undocumented behavior of select(2) (long - 121 lines)
Message-ID: <19476@rochester.ARPA>
Date: Thu, 17-Jul-86 08:38:46 EDT
Article-I.D.: rocheste.19476
Posted: Thu Jul 17 08:38:46 1986
Date-Received: Fri, 18-Jul-86 01:07:57 EDT
References: <154@nbc1.UUCP>
Distribution: net
Organization: U of Rochester, CS Dept., Rochester, NY
Lines: 115
Summary: exceptional conditions for select are poorly handled

In article <154@nbc1.UUCP>, abs@nbc1.UUCP (Andrew Siegel) writes:
> I've encountered some undocumented behavior of select(2) [...]
> So here's the problem: when one of these clients dies or shuts
> down its end of the socket, the select in the server returns with
> the bit in readfds set for the descriptor for the server's end of
> that socket. Doing an ioctl(fd,FIONREAD,&n), where fd is that
> file descriptor, yields zero bytes pending on that socket! So
> there is a conflict: select says there are bytes pending, and
> ioctl says there are none. If I do a read on that fildes, the
> read returns 0 (EOF), conforming to the documented behavior of
> read(2).
> Andrew Siegel, N2CN NBC Computer Imaging, New York, NY
> philabs!nbc1!abs (212)664-5776

Mini Answer:

1) Whenever a selected file descriptor changes state in any way,
select will wake up. If the state change was an error condition or
anything related to the status of a device or connection, both the
input AND the output masks will be filled in.

2) If a device or connection has been closed (on the other end) the
appropriate thing to do is close it on your end. Unless you do,
select will continue to tell you about it. (And you will continue to
tie up resources)

3) There is no good way to find out what exactly happened to the
file descriptor in general.
Although the FIONREAD ioctl gives useful information, you can't find
out exactly what the new condition is unless you try to read from or
write to the descriptor. If it's an error condition, the read or write
will return -1, and errno will tell you why. But you can't find out
without trying to do the IO. This is awkward.

Main Answer: (80 more lines, stop now if the mini answer satisfied you)

It is my belief that the original designers of select did not intend
this to happen but that *every* implementation of 4.2-style select
behaves in this way. At a minimum, BSD 4.2 and 4.3 do, as do the Sun
2.0 and 3.0 releases (based on BSD 4.2).

For programs with simple control structure this "feature" of select
is not too bad a problem. For complicated programs that are trying to
be robust against failure (that means we don't just die when we get an
error, we identify it and try to do something about it, like maybe go
out and locate a secondary server), it becomes a pain in the neck.

While I'm showing my irritation in public, I also wish that stat
reported something useful (i.e., not a zeroed buffer) for sockets.
After two major releases with sockets (4.2, 4.3), why doesn't it? But
even stat doesn't tell you everything you'd like to know. I'd like to
be able to get at the connection status (which is protocol/device
dependent) and the error status (which generally isn't) and I can't
get at either of those without trying to do IO.

Let me quote two paragraphs from a report Derek Pitcher and I wrote
recently:

"The situation is slightly more complicated than just described for
three reasons. First, the designers of select made provision for a
third bit mask. This mask was intended for "exceptional conditions"
on a file descriptor or a socket. This mask has never been
implemented. Instead, when an exceptional condition occurs, it
matches either the input or output masks, if they happen to be set.
This is annoying, because it means we can not completely trust select
when it tells us that input is available. When the router finally
does the select on the appropriate socket, it has to be prepared for
an error condition instead of input. Worse, there is no way to test a
socket for an error condition other than trying to read or write on
it. [...]

"In programming the router we had to detect and handle enough
exceptional conditions to fervently wish that select had been fully
implemented. We would like to treat events like "new connection
available", "pending connection established", and "existing
connection dropped or refused" differently from the routine
operations of forwarding data between users. [...]"

    S.A. Friedberg & D.H. Pitcher
    Hierarchical Process Composition Project Report 3
    "HPC IPC Implementation -- Unmodified UNIX Host Version"
    Computer Science Department
    University of Rochester
    Rochester, New York, 14627, USA

This "feature" affects more than sockets. Unfortunately, I no longer
have a copy of the first article, but Rich Burridge pointed out
recently (7 July 86) that pseudo-terminals are also affected. Here is
a heavily cut version of his comments:

"Say I used 'cat' to redirect a small file to /dev/ttyq5 [...] nfds
returned a 1 to indicate that there was a file descriptor with
outstanding data to be read, and the readmask had the appropriate bit
set. [...] when I come to do the select call again, it should return
0 and the readmask should also be zero. But no, it tells me that
there is data to be read, and when I use the read call as above, it
returns -1 in nread with errno set to 5 (EIO I/O error).
[...]
Steve Schoch at the NASA Ames Research Center provides the solution:
[...] after the 'cat' is done writing the file to /dev/ttyq5, it
exits, which closes that tty. When you close a tty file for the last
time it causes it to hang up.
If this was a real tty, it would drop DTR, but on a pseudo tty, it
shows that it is hung up by having the read fail on the master side."

    Rich Burridge, <3@yarra.OZ>, <6@yarra.OZ>

Stuart Friedberg
{seismo, allegra}!rochester!stuart
stuart@rochester
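
P.S. To make points 2 and 3 of the mini answer concrete, here is a
minimal sketch in C of the "select, then try the read and see what
you get" pattern. It is an illustration only, not code from the
report quoted above. It assumes a present-day POSIX system (headers
such as <sys/select.h> and the socketpair call), not the 4.2BSD
header layout, and the names fd_event, wait_and_classify and
demo_peer_closed are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of any system API. */
enum fd_event { FD_TIMEOUT, FD_DATA, FD_EOF, FD_ERROR };

/*
 * Wait up to `seconds` for fd to be reported readable, then attempt
 * the read, because select() by itself does not say whether the
 * wakeup means data, end of file, or an error.
 * Assumes len > 0 (a zero-length read would look like EOF).
 * On FD_DATA, *nread holds the byte count.
 * On FD_ERROR, errno is whatever select() or read() left there.
 */
enum fd_event wait_and_classify(int fd, int seconds,
                                char *buf, size_t len, ssize_t *nread)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return FD_ERROR;
    if (n == 0 || !FD_ISSET(fd, &readfds))
        return FD_TIMEOUT;

    /* The only way to learn what happened is to try the IO. */
    *nread = read(fd, buf, len);
    if (*nread > 0)
        return FD_DATA;
    if (*nread == 0)
        return FD_EOF;      /* other end closed or shut down */
    return FD_ERROR;        /* e.g. EIO on a hung-up pty master */
}

/*
 * Demo: one end of a socketpair plays the client that goes away.
 * The other end is reported readable, and the read returns 0.
 */
enum fd_event demo_peer_closed(void)
{
    int sv[2];
    char buf[64];
    ssize_t nread = 0;
    enum fd_event ev;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return FD_ERROR;
    close(sv[1]);           /* the "client" closes its end */
    ev = wait_and_classify(sv[0], 1, buf, sizeof buf, &nread);
    close(sv[0]);           /* close our end too, or select keeps reporting it */
    return ev;
}
```

A server loop would call something like wait_and_classify on each
descriptor that select reports, forward the data on FD_DATA, and
close the descriptor on FD_EOF or FD_ERROR (after looking at errno
in the error case).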