Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think.com!barmar
From: barmar@think.com (Barry Margolin)
Newsgroups: comp.unix.questions
Subject: Re: reading on sockets when connection breaks
Message-ID: <1990Dec6.055353.23846@Think.COM>
Date: 6 Dec 90 05:53:53 GMT
References: <25205@adm.brl.mil>
Sender: news@Think.COM
Organization: Thinking Machines Corporation, Cambridge MA, USA
Lines: 41

In article <25205@adm.brl.mil> mnl%IDTSUN1.E-TECHNIK.TH-DARMSTADT.DE@BRL.MIL (Michael@CUNYVM.CUNY.EDU N. Lipp) writes:
>I have a program that establishes a TCP-connection with another machine,
>requests the server to send some packets of data and then does a
>
>while (read (fd, &packet, sizeof (packet)) == sizeof (packet)) { ... }
>
>This program hangs frequently. I made it QUIT and found it hanging in the
>read. As this program frequently connects to diskless machines that are
>switched off at night, I assume that the connection comes down while
>the program is reading.
>
>I am wondering: shouldn't read return with an error status if the connection
>breaks? As it apparently does not, what is the most reasonable fix?
>A blocking read with timeout comes to my mind, but what is the best
>way to do this?

Are the diskless machines simply switched off, or are they shut down with
software? If they're just switched off, then they won't be able to send
the appropriate "close this connection" packets (either a FIN or a RST) on
these connections.

Unfortunately, there's no reliable way to determine whether another machine
is up or down on many network media (Ethernet, in particular). Lack of
communication can result from a number of other causes: network congestion,
router/bridge failure, a flaky cable or connector, etc.

If you're willing to assume that incommunicado means dead you can use a
keepalive, an empty packet that is sent periodically in order to elicit an
acknowledgement.
If you're using Unix sockets, the SO_KEEPALIVE option can be enabled to
automate this.

By the way, there's another bug in your code, in the "== sizeof (packet)".
read() is permitted to return fewer bytes than you asked for; the third
argument is only a maximum. You should use something like

	while ((count = read(...)) > 0) { ... }

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
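
The two suggestions in the reply above (turning on SO_KEEPALIVE, and looping
on read() instead of comparing its result with sizeof (packet)) might look
roughly like the sketch below. The function names enable_keepalive and
read_full are made up for illustration and appear nowhere in the original
thread; the sketch also assumes a blocking descriptor and a fixed-size record.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to probe an idle connection
 * periodically. Returns 0 on success, -1 on error (errno is set by
 * setsockopt). */
int enable_keepalive(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}

/* Hypothetical helper: keep calling read() until len bytes have arrived,
 * since a single read() may return fewer bytes than requested.
 * Returns the number of bytes read (less than len means the peer closed
 * the connection first), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t count = read(fd, p + got, len - got);
        if (count > 0) {
            got += (size_t)count;   /* partial data: keep going */
        } else if (count == 0) {
            break;                  /* end of file: peer closed */
        } else if (errno != EINTR) {
            return -1;              /* real error; give up */
        }
        /* count < 0 with EINTR: interrupted by a signal, retry */
    }
    return (ssize_t)got;
}
```

With these, the loop from the quoted article could become something like
"while (read_full(fd, &packet, sizeof (packet)) == sizeof (packet)) { ... }"
after a call to enable_keepalive(fd). Note that the default keepalive idle
time is usually long (on the order of hours on many systems), so a dead peer
may still go unnoticed for quite a while.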