Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!think!harvard!cmcl2!philabs!nyit!rick
From: rick@nyit.UUCP (Rick Ace)
Newsgroups: net.unix-wizards
Subject: Re: inode table full
Message-ID: <227@nyit.UUCP>
Date: Mon, 24-Mar-86 08:55:13 EST
Article-I.D.: nyit.227
Posted: Mon Mar 24 08:55:13 1986
Date-Received: Wed, 26-Mar-86 04:47:56 EST
References: <1938@brl-smoke.ARPA>
Organization: NYIT Computer Graphics Lab., Old Westbury, N.Y.
Lines: 74

Barry Shein writes,

> There's definitely an inode bug in the original 4.2 tape distribution.
> Not sure what the fix is tho it's been discussed a few times on this
> list. The way to find out if that is your problem is to use pstat
> to determine if any inodes have a ref count of -1 (will, I believe,
> appear as ff or 255 on the output of pstat.) If so, you got it and
> I believe it can lead to inode table full messages. Temporary fix?
> Re-boot and pray for the best...

Yes, there are several bugs, all of which revolve around the subject
of file descriptor management and its interaction with devices such
as terminals, whose open and close routines can sleep at a priority
greater than PZERO.

Consider the case where a 4.2bsd user program issues a close() syscall.
The kernel can (not necessarily in this order):

1.  Free the file descriptor (clear u.u_ofile[fd] and u.u_pofile[fd]).
2.  Decrement the f_count value for the corresponding "file" table
    entry.  If the count goes to zero, release the entry.
3.  Decrement the i_count value for the "inode" structure.
4.  In the case of a character-special device like a tty, call the
    driver's d_close routine.

Problem: there are cases where the kernel gets halfway through doing
an open() or close(), sleeps > PZERO, and gets interrupted by a signal
before the rest of the operation is complete, leaving file/inode/user
tables in an inconsistent state.

One scenario that can cause i_count to go below zero goes like this:

A user program calls close() to close a tty file descriptor.  UNIX
decrements f_count and i_count and then calls ttyclose().  If the tty's
output character queue is not empty, the kernel sleep()s at a priority
greater than PZERO, waiting for the queue to drain.  Normally, once the
queue has drained, the kernel awakens and proceeds to clear the u_ofile
and u_pofile entries for the file descriptor.

Assume, though, that while the process is sleep()ing on t_outq, it
receives a signal.  The kernel aborts the sleep AND NEVER CLEARS
U_OFILE AND U_POFILE.  When the process subsequently issues another
close() call to that file descriptor (either explicitly, or implicitly
via the "exit" syscall), f_count and i_count are decremented AGAIN,
SPURIOUSLY.  i_count can fall below zero, behaving like a very large
count that will never reach zero.  Result: jammed inode till next
reboot.

The kernel performs two main tasks during close():

1.  Adjust all share counts on "inode" and "file" table entries,
    freeing these entries when appropriate.
2.  Call device-specific logic to close the device.

When the kernel calls the device's d_close routine, it assumes the risk
that the routine will sleep and be interrupted by a signal.  It is
therefore imperative that the kernel do either:

    all of #1 followed by all of #2, or
    all of #2 followed by all of #1.

4.2bsd begins some of the work in #1, then does #2, and finally
finishes #1, giving rise to the bugs.  There are places where a process
can reference "file" table entries it does not own anymore.

The essence of our fix was to rearrange the kernel's close() logic to
do task #1 completely first, and then do task #2.  It is possible in
this case for close() to return an EINTR error code while closing a tty
file descriptor, even though u_ofile and u_pofile have been cleared.
This seems preferable to the other alternative (#2, followed by #1)
because most programs don't examine the value returned by close().

-----
Rick Ace
Computer Graphics Laboratory
New York Institute of Technology
Old Westbury, NY 11568
(516) 686-7644

{decvax,seismo}!philabs!nyit!rick