Xref: utzoo comp.bugs.4bsd:1320 comp.bugs.2bsd:158
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!usc!elroy.jpl.nasa.gov!mahendo!wlbr!sms@wlv.imsd.contel.com
From: sms@wlv.imsd.contel.com (Steven M. Schultz)
Newsgroups: comp.bugs.4bsd,comp.bugs.2bsd
Subject: TIME_WAIT sockets clog system
Keywords: ftp mbufs mget mput time_wait
Message-ID: <33437@wlbr.IMSD.CONTEL.COM>
Date: 4 Jul 89 06:59:19 GMT
Sender: news@wlbr.IMSD.CONTEL.COM
Reply-To: sms@wlv.imsd.contel.com (Steven M. Schultz)
Followup-To: comp.bugs.4bsd
Organization: Contel Federal Systems
Lines: 84

In article <12417@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:

>There is an interesting discussion going on in comp.bugs.2bsd
>about an out-of-mbufs problem caused by an mget in ftp.  The
>problem obviously occurs primarily on a pdp11 with its limited
>memory, but the 2.10bsd code is taken directly from the VAX
>version, and I have noticed the same problem (and indeed the
>original submittor acknowledges the possibility in the excerpt
>from his posting I've reproduced below) when doing an mput (as I
>recall) on an overloaded MicroVAX being used as a file server.

	ahhh, so others have seen the problem on larger machines.  i had
	not seen any other references before, so i thought it only a
	'theoretical' possibility to run a vax out of network memory.

>There is some debate about the efficacy of the proposed fix,
>which involves fleshing out the (previously stubbed) tcp_drain
>routine.

	the pitfalls of my proposed change to the mbuf allocator
	have been made known to me (i really should have known better).
	an alternative solution is-being/has-been prepared.  

	a small change to mbuf.h is made, adding a new 'wait' flag
	and modifying the MGET macro to test whether it is safe
	(i.e. the caller was at network priority or below, not at splimp)
	to manipulate the tcb chain(s).  the 0340 and 0100 are the
	processor priority mask and the network priority level (2) for
	the pdp-11, but hopefully the idea is clear.  ideally the
	appropriate symbolic names should be used, but "real work"
	reared its head ;-)

	the idea is to add another state that will NOT sleep, but WILL
	invoke the drain code if the network code was at splnet.  (thanks
	to Dan Lanciani - ddl@harvard.harvard.edu for pointers in this
	area).
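
	a rough sketch of that flag selection, pulled out of the macro
	into a plain function so it can be read (and compiled) in user
	mode.  the names PS_PRIMASK, PS_NETPRI and effective_wait() are
	made up for illustration and are not in any kernel header; only
	the 0340/0100 values and the M_* flags come from the change
	described here.

```c
#include <assert.h>

/* flags to m_get, as in the modified mbuf.h */
#define M_DONTWAIT      0
#define M_WAIT          1
#define M_DONTWAITLONG  2

/* hypothetical symbolic names for the two pdp-11 constants */
#define PS_PRIMASK      0340    /* priority bits of the processor status word */
#define PS_NETPRI       0100    /* priority level 2 (network) */

/*
 * Given the processor status saved by splimp() and the caller's wait
 * flag, return the flag that would be handed on when the free list is
 * empty.  Only a non-sleeping request made at network priority or
 * below is upgraded, since only then is it safe to touch the tcb chain.
 */
static int
effective_wait(int ms, int i)
{
	assert(i == M_DONTWAIT || i == M_WAIT || i == M_DONTWAITLONG);

	if ((ms & PS_PRIMASK) <= PS_NETPRI && i == M_DONTWAIT)
		return M_DONTWAITLONG;
	return i;
}
```

	with a saved priority of 0100 an M_DONTWAIT request turns into
	M_DONTWAITLONG; at anything higher (0240, say) it stays
	M_DONTWAIT, and M_WAIT is passed through untouched either way.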

	it would be enlightening to know why sockets stay around so long
	in a TIME_WAIT state (especially on a LAN) and what would break
	if the timeout interval were reduced.

	the tcp_drain() modification, with the unnecessary splimp call
	removed, seems adequate.  here's what tcp_drain() looks like at
	the moment:

tcp_drain()
{
	register struct inpcb *ip, *ipnxt;
	register struct tcpcb *tp;

	/*
	 * Search through tcb's and look for TIME_WAIT states to liberate;
	 * these are due to go away soon anyhow and we're short of space or
	 * we wouldn't be here...
	 */
	ip = tcb.inp_next;
	if (ip == 0)
		return;
	for (; ip != &tcb; ip = ipnxt) {
		ipnxt = ip->inp_next;
		tp = intotcpcb(ip);
		if (tp == 0)
			continue;
		if (tp->t_state == TCPS_TIME_WAIT)
			tcp_close(tp);
	}
}
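
	the reason for saving ipnxt before the tcp_close() call is that
	(as far as i can tell) tcp_close() ends up unlinking and freeing
	the pcb, so ip->inp_next can't be trusted afterwards.  a
	user-mode sketch of the same traversal, with a made-up struct pcb
	and helper names standing in for the kernel's inpcb/tcpcb:

```c
#include <assert.h>
#include <stdlib.h>

enum { ST_ESTABLISHED, ST_TIME_WAIT };	/* stand-ins for the TCPS_* states */

struct pcb {				/* hypothetical stand-in for struct inpcb */
	struct pcb *next, *prev;
	int state;
};

/* make an empty circular list; the head plays the role of 'tcb' */
static void
pcb_init(struct pcb *head)
{
	head->next = head->prev = head;
	head->state = -1;
}

/* allocate an entry and link it in right after the head */
static struct pcb *
pcb_add(struct pcb *head, int state)
{
	struct pcb *p = malloc(sizeof *p);

	if (p == NULL)
		return NULL;
	p->state = state;
	p->next = head->next;
	p->prev = head;
	head->next->prev = p;
	head->next = p;
	return p;
}

/* unlink and free one entry, the way tcp_close() is assumed to */
static void
pcb_close(struct pcb *p)
{
	p->prev->next = p->next;
	p->next->prev = p->prev;
	free(p);
}

/* free every TIME_WAIT entry; returns how many were released */
static int
drain(struct pcb *head)
{
	struct pcb *p, *nxt;
	int freed = 0;

	for (p = head->next; p != head; p = nxt) {
		nxt = p->next;		/* grab it now, p may be gone below */
		if (p->state == ST_TIME_WAIT) {
			pcb_close(p);
			freed++;
		}
	}
	return freed;
}

/* count the entries on the list, not including the head */
static int
pcb_count(struct pcb *head)
{
	struct pcb *p;
	int n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

	writing the loop as 'ip = ip->inp_next' in the for statement
	would read freed memory after each close, which is exactly what
	the ipnxt temporary avoids.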

	and the change to mbuf.h:

/* flags to m_get */
#define	M_DONTWAIT	0
#define	M_WAIT		1
#define	M_DONTWAITLONG	2		/* NEW: don't sleep, but do run the drain code */
	...
#define	MGET(m, i, t) \
	{ int ms = splimp(); \
	  if ((m)=mfree) \
		{ if ((m)->m_type != MT_FREE) panic("mget"); (m)->m_type = t; \
		  mbstat.m_mtypes[MT_FREE]--; mbstat.m_mtypes[t]++; \
		  mfree = (m)->m_next; (m)->m_next = 0; \
		  (m)->m_off = MMINOFF; } \
	  else \
		/* 0340 = pdp-11 priority mask, 0100 = network priority (2) */ \
		(m) = m_more((((ms&0340) <= 0100) && ((i)==M_DONTWAIT)) ? \
			     M_DONTWAITLONG : (i), t); \
	  splx(ms); }