Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!husc6!ddl
From: ddl@husc6.harvard.edu (Dan Lanciani)
Newsgroups: comp.bugs.2bsd
Subject: Re: TIME_WAIT sockets clog system (part 2)
Summary: possible side effects
Keywords: ftp mbufs time_wait drain
Message-ID: <2144@husc6.harvard.edu>
Date: 30 Jun 89 16:27:10 GMT
References: <33132@wlbr.IMSD.CONTEL.COM>
Organization: Harvard University, Cambridge MA
Lines: 40


	There is a potential problem with the proposed modifications
to the mbuf allocator.  The "cantwait" argument to the allocation
routines is more than just an indication of whether it is ok to sleep.
It is also a subtle hint that the routine was not called from a
(potentially plimp) interrupt routine.  If an allocation routine
is called from, e.g., the ethernet interrupt it would be incorrect
to (re)enter the network at plnet because (1) the network code may
inadvertently lower the pl and (2) the network code may itself have
been interrupted at a critical section.
	Unfortunately, to be useful, most *_drain routines must
ultimately access global structures which are protected only by
splnet's, so calling them from such an interrupt is eventually
likely to cause corruption.  It
may be possible to check the current pl on entry to the allocator
(tricky because of the macros) and distinguish three classes of
request (from least to most restrictive):  task-time (may sleep
and/or call the network), interrupt-time where previous pl was
less than plnet (may call the network), and interrupt-time where
previous pl was plnet or higher.  Of course, since you cannot
find the true previous pl, you would have to assume that the current
pl is higher than the previous and work from there.  This scheme
breaks down if any plnet code calls the allocator in the expectation
that it won't reenter, but such cases could be fixed.
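	As a rough sketch of the three-way classification described
above (one reading of it, not tested kernel code):  the names pl,
req_class, classify, may_sleep and may_drain are made up for
illustration, and the current pl is passed in as a plain argument
rather than read from the processor.  Since the true previous pl is
unknown, the sketch assumes it was lower than the current one, so
only a current pl at or below plnet is taken to mean that the
network was not interrupted.

```c
/* Hypothetical priority levels, ordered low to high. */
enum pl { PL_0 = 0, PL_NET = 1, PL_IMP = 2 };

/* Request classes, from least to most restrictive. */
enum req_class {
	REQ_TASK,	/* task-time: may sleep and may call the network */
	REQ_INTR_LOW,	/* interrupt-time, previous pl below plnet: may call the network */
	REQ_INTR_HIGH	/* interrupt-time, previous pl possibly plnet or higher */
};

/*
 * Classify an allocation request from the "cantwait" flag and the
 * current pl.  The previous pl is assumed to be lower than curpl.
 */
enum req_class
classify(int cantwait, enum pl curpl)
{
	if (!cantwait)
		return REQ_TASK;
	if (curpl <= PL_NET)
		return REQ_INTR_LOW;
	return REQ_INTR_HIGH;
}

/* Only task-time requests may sleep. */
int
may_sleep(enum req_class c)
{
	return c == REQ_TASK;
}

/* Drain routines (which enter the network) are off limits only to the last class. */
int
may_drain(enum req_class c)
{
	return c != REQ_INTR_HIGH;
}
```

	Note that a cantwait request made at plnet lands in the
middle class here, which is exactly the case that breaks if plnet
code expects the allocator not to reenter.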
	There is a somewhat different approach to the mbuf
problem which might also be helpful.  History:  The allocator
in 2.9 ignored the "cantwait" argument--it never slept at all.
Needless to say, problems arose frequently.  My first attempt
to improve the situation was to add appropriate sleeps, making
the system run much the way 4.3 does now, i.e., top-level
code can wait for mbufs but most other code can't.  This helped
very little.  Typically, a send call would block in the
allocator until an mbuf became available and then call the
tcp send routine which would promptly request an mbuf "DONTWAIT"
and fail.  The solution was to make allocation requests which
*could* wait *always* wait until some fraction (say, 50%) of
mbufs were free.  This in effect reserved half of the mbufs
for code that couldn't wait and improved matters significantly.
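	The reservation rule can be sketched roughly as follows.  This
is a user-level model, not the actual 2.9 code:  struct mbpool,
mb_get and the MB_* status codes are invented for illustration, the
50% figure is the example fraction from above, and where the real
allocator would sleep and retry the model just returns
MB_WOULDSLEEP.

```c
/* Hypothetical result codes for the model allocator. */
enum { MB_OK, MB_FAIL, MB_WOULDSLEEP };

/* Hypothetical pool bookkeeping: total mbufs and how many are free. */
struct mbpool {
	int total;
	int nfree;
};

/*
 * Take one mbuf from the pool.  A request that can wait is held off
 * whenever fewer than half of the mbufs are free, even if some are
 * available; a request that cannot wait may use any free mbuf.
 */
int
mb_get(struct mbpool *p, int canwait)
{
	if (canwait && p->nfree * 2 < p->total)
		return MB_WOULDSLEEP;	/* real code would sleep here and retry */
	if (p->nfree == 0)
		return MB_FAIL;		/* nothing left, even for the no-wait callers */
	p->nfree--;
	return MB_OK;
}
```

	With a pool of 8, a waiting caller stops getting mbufs once
fewer than 4 are free, and what remains is available only to
callers that cannot wait.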

				Dan Lanciani
				ddl@harvard.*