Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!husc6!ddl
From: ddl@husc6.harvard.edu (Dan Lanciani)
Newsgroups: comp.bugs.2bsd
Subject: Re: TIME_WAIT sockets clog system (part 2)
Summary: possible side effects
Keywords: ftp mbufs time_wait drain
Message-ID: <2144@husc6.harvard.edu>
Date: 30 Jun 89 16:27:10 GMT
References: <33132@wlbr.IMSD.CONTEL.COM>
Organization: Harvard University, Cambridge MA
Lines: 40

There is a potential problem with the proposed modifications to the mbuf allocator. The "cantwait" argument to the allocation routines is more than just an indication of whether it is ok to sleep. It is also a subtle hint that the routine was not called from a (potentially plimp) interrupt routine. If an allocation routine is called from, e.g., the ethernet interrupt it would be incorrect to (re)enter the network at plnet because (1) the network code may inadvertently lower the pl and (2) the network code may itself have been interrupted at a critical section. Unfortunately, to be useful, most *_drain routines must ultimately access global structures which are protected only by splnet's and thus are eventually likely to cause corruption.

It may be possible to check the current pl on entry to the allocator (tricky because of the macros) and distinguish three classes of request (from least to most restrictive): task-time (may sleep and/or call the network), interrupt-time where previous pl was less than plnet (may call the network), and interrupt-time where previous pl was plnet or higher. Of course, since you cannot find the true previous pl, you would have to assume that the current pl is higher than the previous and work from there. This scheme breaks down if any plnet code calls the allocator in the expectation that it won't reenter, but such cases could be fixed.

There is a somewhat different approach to the mbuf problem which might also be helpful. History: The allocator in 2.9 ignored the "cantwait" argument--it never slept at all.
Needless to say, problems arose frequently. My first attempt to improve the situation was to add appropriate sleeps, making the system run much the way 4.3 does now, i.e., top-level code can wait for mbufs but most other code can't. This helped very little. Typically, a send call would block in the allocator until an mbuf became available and then call the tcp send routine which would promptly request an mbuf "DONTWAIT" and fail.

The solution was to make allocation requests which *could* wait *always* wait until some fraction (say, 50%) of mbufs were free. This in effect reserved half of the mbufs for code that couldn't wait and improved matters significantly.

				Dan Lanciani
				ddl@harvard.*
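
The reservation policy described above can be sketched as a small userland model. This is only an illustration under stated assumptions, not the 2.9 allocator: every name here (mbuf_pool, pool_alloc, SIM_WAIT, and so on) is hypothetical, "would sleep" is returned as a status where a kernel would actually sleep and retry, and the threshold is assumed to be half the pool, rounded down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model of the reservation policy; not 2.9BSD code. */

enum wait_mode { SIM_DONTWAIT, SIM_WAIT };
enum alloc_result { ALLOC_OK, ALLOC_WOULD_SLEEP, ALLOC_FAIL };

struct mbuf_pool {
    size_t total;   /* mbufs in the pool */
    size_t nfree;   /* mbufs currently free */
};

/*
 * Free count below which a request that may wait is made to wait.
 * Assumption: "some fraction (say, 50%)" is taken as half, rounded down.
 */
static size_t pool_reserve(const struct mbuf_pool *p)
{
    return p->total / 2;
}

static enum alloc_result pool_alloc(struct mbuf_pool *p, enum wait_mode mode)
{
    /* Callers that can wait always wait until the free count reaches
     * the reserve, even though some mbufs are still available. */
    if (mode == SIM_WAIT && p->nfree < pool_reserve(p))
        return ALLOC_WOULD_SLEEP;   /* a kernel would sleep here and retry */

    /* Reached by callers that cannot wait, or when the pool is too
     * small to have a reserve at all. */
    if (p->nfree == 0)
        return ALLOC_FAIL;

    p->nfree--;
    return ALLOC_OK;
}

static void pool_free(struct mbuf_pool *p)
{
    assert(p->nfree < p->total);
    p->nfree++;                     /* a kernel would wake sleepers here */
}
```

With a pool of 8, callers that may wait are held off once fewer than 4 mbufs are free, while requests that cannot wait keep succeeding until the pool is empty. A send call that wakes up in the allocator is then followed by a DONTWAIT request that still has mbufs to draw on.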