Path: utzoo!attcan!uunet!husc6!mailrus!tut.cis.ohio-state.edu!ucbvax!HOGG.CC.UOREGON.EDU!jqj
From: jqj@HOGG.CC.UOREGON.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re:  SO_KEEPALIVE considered harmful?
Message-ID: <8906012051.AA04340@hogg.cc.uoregon.edu>
Date: 1 Jun 89 20:51:23 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 20

Seems to me that much of this discussion is missing the point that an
open TCP connection (especially a telnet session) can tie up expensive
resources on the server; most of the recent discussion has focussed on
the problems of a user who may or may not want to abort a connection on
network or remote host failure.  For example, many timesharing systems
charge based on "connect time", and some even enforce a maximum number
of outstanding sessions.  In such cases it is in the interest of both
the user and the system to abort a telnet session if there is reason to
believe that the loss of connectivity is more than briefly transient.
One can obviously do this with a (perhaps user-settable) timeout, but
are there other heuristics that might usefully be used as well?
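
To make the timeout idea concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: IdleWatchdog, its
method names, and the 600-second figure in the usage note are made up
for this example and do not describe any existing telnet server.  The
only real system facility used is the SO_KEEPALIVE socket option.

```python
import socket
import time


def enable_keepalive(sock):
    """Ask the TCP stack to probe an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


class IdleWatchdog:
    """Hypothetical helper: tracks how long the peer has been silent."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # the user-settable timeout
        self._clock = clock                     # injectable, for testing
        self._last_heard = clock()

    def heard_from_peer(self):
        """Call whenever data arrives from the peer."""
        self._last_heard = self._clock()

    def silent_for(self):
        """Seconds elapsed since the peer was last heard from."""
        return self._clock() - self._last_heard

    def should_abort(self):
        """True once the silence has lasted at least the timeout."""
        return self.silent_for() >= self.timeout_seconds
```

A server loop would call heard_from_peer() on every successful read and
poll should_abort() periodically, e.g. with IdleWatchdog(600) for a
ten-minute limit.  This covers only the plain timeout; it says nothing
about the other heuristics asked about above.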

Does anyone have any data on the distribution of durations of network
partitions?  How, for that matter, might we define a network
partition?  Many events (e.g. the TR card in our NSS going bad) yield
obvious network partitions with well-defined lengths.  Others, e.g. a
degraded-quality line, may imply very short partitions (a few
milliseconds or seconds), which increase errors and retransmissions
and ultimately leave the TCP connection unusable.  Can we come up with
an analytic model that
includes both sorts of failures?
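
As one possible starting point, and purely as an assumption on my part,
both sorts of failure could be lumped into a two-component mixture of
exponentials: short glitches with a small mean duration and hard
outages with a large one.  The sketch below (Python) computes the
chance that an outage outlasts t seconds, and the chance that an outage
which has already lasted t seconds is of the long kind.  The function
names are invented, and any parameter values are placeholders rather
than measured data.

```python
import math


def survival(t, p_short, mean_short, mean_long):
    """P(outage lasts longer than t) under the assumed mixture."""
    return (p_short * math.exp(-t / mean_short)
            + (1.0 - p_short) * math.exp(-t / mean_long))


def prob_long_given_silence(t, p_short, mean_short, mean_long):
    """P(outage is the long kind | it has already lasted t seconds).

    Bayes' rule on the two mixture components.  For very large t both
    exponentials underflow to zero, so keep t within a sensible range.
    """
    long_part = (1.0 - p_short) * math.exp(-t / mean_long)
    return long_part / survival(t, p_short, mean_short, mean_long)
```

With, say, 95% of outages short (mean 0.5 s) and the rest long (mean
600 s), the second function starts at 0.05 for t = 0 and climbs toward
1 within a few seconds of silence.  In such a model the longer a
connection has been silent, the less likely it is to recover soon,
which is one argument for a timeout much longer than the short mean.
Whether real partitions look anything like this is exactly the data
question asked above.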