Path: utzoo!attcan!uunet!husc6!mailrus!tut.cis.ohio-state.edu!ucbvax!HOGG.CC.UOREGON.EDU!jqj
From: jqj@HOGG.CC.UOREGON.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8906012051.AA04340@hogg.cc.uoregon.edu>
Date: 1 Jun 89 20:51:23 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 20

Seems to me that much of this discussion is missing the point that an
open TCP connection (especially a telnet session) can tie up expensive
resources on the server; most of the recent discussion has focussed on
the problems of a user who may or may not want to abort a connection on
network or remote host failure. For example, many timesharing systems
charge based on "connect time", and some even enforce a maximum number
of outstanding sessions. In such cases it is in the interest of the
user and the system to abort a telnet session if there is reason to
believe that loss of connectivity is not just briefly transient.

One can obviously do this with a (perhaps user settable) timeout, but
are there other heuristics that might usefully be used as well? Does
anyone have any data on the distribution of time-length of network
partitions?

How, for that matter, might we define a network partition? Many events
(e.g. the TR card in our NSS going bad) yield obvious network partitions
with well defined lengths. Others, e.g. a degraded quality line, may
imply very short (a few ms or s) partitions, which increase the errors
and retransmissions and ultimately imply an unusable TCP connection.
Can we come up with an analytic model that includes both sorts of
failures?
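
To make the "(perhaps user settable) timeout" idea concrete, here is a rough sketch in Python. Everything in it is hypothetical and for illustration only: the SessionMonitor class, its method names, and the 300-second figure are assumptions, not anything taken from an existing telnet server. It covers only the simple case of one long silence; it does nothing about the degraded-line case, where many very short partitions add up to an unusable connection.

```python
from dataclasses import dataclass


@dataclass
class SessionMonitor:
    """Hypothetical helper: decides when a silent session should be aborted."""

    timeout_s: float            # user-settable limit on silence, in seconds
    last_heard_s: float = 0.0   # time we last saw any sign of life from the peer

    def heard_from_peer(self, now_s: float) -> None:
        # Call on any evidence of connectivity (data, an ACK, a keepalive reply).
        self.last_heard_s = now_s

    def should_abort(self, now_s: float) -> bool:
        # Abort only when the silence has lasted strictly longer than the limit,
        # so that briefly transient losses are tolerated.
        return (now_s - self.last_heard_s) > self.timeout_s


# Usage: a session that tolerates up to 300 s of silence (an assumed value).
monitor = SessionMonitor(timeout_s=300.0)
monitor.heard_from_peer(now_s=1000.0)
brief_outage = monitor.should_abort(now_s=1005.0)   # 5 s of silence
long_outage = monitor.should_abort(now_s=1301.0)    # 301 s of silence
```

With these numbers the 5-second gap is tolerated and the 301-second gap triggers an abort. A heuristic for the second sort of failure would presumably need to watch something like the retransmission rate rather than a single idle timer.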