Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site sauron.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!ittatc!dcdwest!sdcsvax!ncr-sd!ncrcae!sauron!wescott
From: wescott@sauron.UUCP (Michael Wescott)
Newsgroups: net.bugs.2bsd
Subject: Re: Bug in umount code
Message-ID: <623@sauron.UUCP>
Date: Wed, 19-Mar-86 15:10:38 EST
Article-I.D.: sauron.623
Posted: Wed Mar 19 15:10:38 1986
Date-Received: Sat, 22-Mar-86 05:42:18 EST
References: <1307@mit-eddie.MIT.EDU>
Reply-To: wescott@sauron.UUCP (Michael Wescott)
Organization: NCR Corp., Advanced System Development, Columbia, SC
Lines: 34
Keywords: umount block cache slow devices
Summary: actually a race condition

In article <1307@mit-eddie.MIT.EDU> jfw@mit-eddie.MIT.EDU (John Woods) writes:
>There appears to be a bug in umount on slow devices (and in principle on
>faster ones). umount invalidates the entries in the buffer cache by setting
>their b_dev field to -1. However, slow devices like RK05s and RX02s, when
>they finally get around to looking at these entries some time later, find
>that they point to non-existent devices (like minor device 7 in the case of
>the RK05). Is there a standard fix for this problem?

We found a similar problem in an early v7 port purchased for MC68000.
The umount() call works (roughly speaking since I'm not looking at the
code) by first calling update(), the kernel's internal version of sync.
This is to insure that the buffer cache has been flushed prior to
proceeding with the umount(). Umount() eventually marks the buffers
invalid in this case by changing the b_dev to -1.

There is, however, a race condition here. Update(), when called, first
checks to see if there already is a sync in progress. If so it returns
immediately. In this case, umount() can proceed to put -1 into the
b_dev field and when the system tries to look this number up in the
bdevsw table... In our system it was a kernel bus error, "panic: ..."
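
The race described above can be sketched as a small user-space model. This is not the v7 source: the struct layout, the updlock flag, the table size NBLKDEV and every function name here are assumptions made up for illustration. It only shows how an early return from update() can leave a dirty buffer whose b_dev no longer indexes the device switch.

```c
#define NBLKDEV 4                       /* assumed size of a toy bdevsw table */
#define makedev(x, y) (((x) << 8) | (y))
#define major(d) ((int)(((unsigned)(d) >> 8) & 0377))

struct buf {
    int b_dev;                          /* device the block belongs to */
    int b_dirty;                        /* nonzero: write still pending */
};

static int updlock;                     /* nonzero while a sync is in progress */

/* Flush all buffers; returns 0 without flushing if a sync is already running. */
int update(struct buf *bp, int n)
{
    int i;

    if (updlock)
        return 0;                       /* the early return that opens the race */
    updlock = 1;
    for (i = 0; i < n; i++)
        bp[i].b_dirty = 0;              /* stand-in for writing the block out */
    updlock = 0;
    return 1;
}

/* Model of the invalidation step in umount(): sync, then stamp b_dev with -1. */
void umount_invalidate(struct buf *bp, int n, int dev)
{
    int i;

    (void)update(bp, n);                /* result ignored, as in the story above */
    for (i = 0; i < n; i++)
        if (bp[i].b_dev == dev)
            bp[i].b_dev = -1;
}

/* Returns 1 if umount left a dirty buffer whose major number is off the table. */
int race_demo(int sync_in_progress)
{
    struct buf b = { makedev(2, 0), 1 };

    updlock = sync_in_progress;         /* pretend another sync is mid-flight */
    umount_invalidate(&b, 1, makedev(2, 0));
    updlock = 0;
    return b.b_dirty && major(b.b_dev) >= NBLKDEV;
}
```

With no sync in flight the buffer is flushed before b_dev is overwritten, so nothing is left for a driver to trip over. With one in flight, the pending write later indexes the table with major number 255.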

Any fix must still mark the buffer invalid; otherwise the system can get
very confused when a disk is unmounted and a new one mounted soon
thereafter.

We eventually fixed it by using the SysIII approach: put another bit
into the flags field of the buffer header that indicates that the data
is no longer valid when read. The appropriate checks then have to go
into the routines in bio.c. With that, b_dev never has to be made
invalid.

Another approach would be to have update() sleep if another update() is
in progress. Then, upon wakeup, just return, or continue and do it
again.

	-Mike Wescott
	ncrcae!wescott
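
A minimal sketch of the flag-bit idea, for anyone who wants to see its shape. This is not the SysIII code: B_STALE, its bit value, the struct layout and the names mark_stale() and cache_lookup() are all hypothetical. The point is only that b_dev keeps a valid device number while cache lookups skip buffers flagged as stale.

```c
#include <stddef.h>

#define B_STALE 0x2                     /* hypothetical "data no longer valid" bit */
#define makedev(x, y) (((x) << 8) | (y))

struct buf {
    int b_flags;                        /* status bits, including B_STALE */
    int b_dev;                          /* stays a real device number */
    int b_blkno;                        /* block number on that device */
};

/* On umount: flag every buffer of dev as stale, leaving b_dev untouched. */
void mark_stale(struct buf *bp, int n, int dev)
{
    int i;

    for (i = 0; i < n; i++)
        if (bp[i].b_dev == dev)
            bp[i].b_flags |= B_STALE;
}

/* The kind of check that would go into bio.c: stale buffers never match. */
struct buf *cache_lookup(struct buf *bp, int n, int dev, int blkno)
{
    int i;

    for (i = 0; i < n; i++) {
        if (bp[i].b_flags & B_STALE)
            continue;                   /* old contents must not be reused */
        if (bp[i].b_dev == dev && bp[i].b_blkno == blkno)
            return &bp[i];
    }
    return NULL;
}
```

Since b_dev is never overwritten, a write that is still queued for the old device looks up a real entry in bdevsw, and a newly mounted disk cannot pick up blocks left over from the previous one.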