Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!purdue!haven!mimsy!chris
From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.unix.questions
Subject: Re: Sockets and interrupt driven I/O
Summary: netinet/tcp_input.c bug
Keywords: socket, ASYNC, SIGIO
Message-ID: <22919@mimsy.umd.edu>
Date: 6 Mar 90 03:31:31 GMT
References: <1011@m1.cs.man.ac.uk>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 67

In article <1011@m1.cs.man.ac.uk> HoldswoS@r4.cs.man.ac.uk
(Sean Holdsworth) writes:
>... if I write to a socket and get back an EWOULDBLOCK error I have no
>way of knowing, other than polling, when the socket again becomes writable.
>I had hoped that when the state of the socket changed from blocked to
>unblocked that a SIGIO would be generated but from my experiments this
>appears not to be the case.

If you examine the file netinet/tcp_input.c, you will find code of the
form:

		if (acked > so->so_snd.sb_cc) {
			tp->snd_wnd -= so->so_snd.sb_cc;
			sbdrop(&so->so_snd, (int)so->so_snd.sb_cc);
			ourfinisacked = 1;
		} else {
			sbdrop(&so->so_snd, acked);
			tp->snd_wnd -= acked;
			ourfinisacked = 0;
		}
	-->	if ((so->so_snd.sb_flags & SB_WAIT) || so->so_snd.sb_sel)
			sowwakeup(so);
		tp->snd_una = ti->ti_ack;

The line marked `-->' above is the culprit: For the sake of efficiency,
the TCP code calls sowwakeup(so) (a macro for sowakeup(so, &so->so_snd))
only if a process is waiting for data to go out, or is selecting, or
was selecting not long ago.  When a process uses asynchronous I/O,
however, neither SB_WAIT nor sb_sel is set, so sowakeup() is never
called and the SIGIO it would have posted is never sent.

There are a number of ways around the problem.  One of them involves
no kernel changes: simply arrange for some process to select on that
socket.  Any process will do, including your own.
Thus:

	omask = sigblock(sigmask(SIGIO));
	if ((n = write(socket_fd, buf, count)) < 0 && errno == EWOULDBLOCK) {
		fd_set out;
		struct timeval tv;

		FD_ZERO(&out);
		FD_SET(socket_fd, &out);
		tv.tv_sec = 0;
		tv.tv_usec = 0;
		n = select(socket_fd + 1, (fd_set *)0, &out, (fd_set *)0, &tv);
		if (n > 0) {
			/* can write now, try again: must have unblocked
			   while we were fiddling with select() */
			n = write(socket_fd, buf, count);
			if (n < 0 && errno == EWOULDBLOCK)
				... now what? ...
		} else {
			/* cannot write, but made so->so_snd.sb_sel non nil
			   so that the tcp code will call sowakeup() later */
			n = 0;	/* clobber the error */
		}
	}
	if (n < 0)
		... handle output error ...
	(void) sigsetmask(omask);

Simpler fixes, if you have kernel source, are to remove the test from
tcp_input.c or to set SB_WAIT in sosend() when returning EWOULDBLOCK.
(The latter would preserve efficiency, and protect any other code that
has the same bug as tcp_input.c, such as the XNS code [spp_usrreq.c].)

The bug has been fixed in 4.4BSD by removing the test from both
netinet/tcp_input.c and netns/spp_usrreq.c.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris