Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site sauron.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!ittatc!dcdwest!sdcsvax!ncr-sd!ncrcae!sauron!wescott
From: wescott@sauron.UUCP (Michael Wescott)
Newsgroups: net.bugs.2bsd
Subject: Re: Bug in umount code
Message-ID: <623@sauron.UUCP>
Date: Wed, 19-Mar-86 15:10:38 EST
Article-I.D.: sauron.623
Posted: Wed Mar 19 15:10:38 1986
Date-Received: Sat, 22-Mar-86 05:42:18 EST
References: <1307@mit-eddie.MIT.EDU>
Reply-To: wescott@sauron.UUCP (Michael Wescott)
Organization: NCR Corp., Advanced System Development, Columbia, SC
Lines: 34
Keywords: umount block cache slow devices
Summary: actually a race condition

In article <1307@mit-eddie.MIT.EDU> jfw@mit-eddie.MIT.EDU (John Woods) writes:
>There appears to be a bug in umount on slow devices (and in principle on
>faster ones).  umount invalidates the entries in the buffer cache by setting
>their b_dev field to -1.  However, slow devices like RK05s and RX02s, when
>they finally get around to looking at these entries some time later, find
>that they point to non-existent devices (like minor device 7 in the case of
>the RK05).  Is there a standard fix for this problem?

We found a similar problem in an early v7 port purchased for the MC68000.  The
umount() call works (roughly speaking, since I'm not looking at the code) by
first calling update(), the kernel's internal version of sync.  This is to
ensure that the buffer cache has been flushed before proceeding with the
umount().  Umount() eventually marks the buffers invalid, in this case by
changing the b_dev to -1.

There is, however, a race condition here.  Update(), when called, first checks
to see if there already is a sync in progress.  If so, it returns immediately,
without waiting for that sync to finish.  In this case, umount() can proceed
to put -1 into the b_dev field while writes are still pending, and when
the system tries to look this number up in the bdevsw table...

On our system the result was a kernel bus error, "panic: ..."
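
To see why the table lookup blows up, here is a small sketch.  It assumes a
16-bit device number with the major number in the high byte and the minor in
the low byte; that layout is an assumption about the port, not something
checked against its source.  The names dev16_t, dev_major, dev_minor,
major_in_range, bdev_index and NBLKDEV are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NBLKDEV 4               /* hypothetical number of bdevsw entries */

typedef int16_t dev16_t;        /* assumed 16-bit device number */

/* Major number: high byte of the device number (assumed layout). */
int dev_major(dev16_t dev) { return ((uint16_t)dev >> 8) & 0xff; }

/* Minor number: low byte of the device number (assumed layout). */
int dev_minor(dev16_t dev) { return (uint16_t)dev & 0xff; }

/* Nonzero if indexing bdevsw[] with this device's major stays in bounds. */
int major_in_range(dev16_t dev) { return dev_major(dev) < NBLKDEV; }

/* A guarded lookup: refuse to index the table with a bogus major number. */
int bdev_index(dev16_t dev)
{
    assert(major_in_range(dev));
    return dev_major(dev);
}
```

Under these assumptions a b_dev of -1 gives major 255, far past the end of a
four-entry table, and minor 255.  A driver that keeps only the low three bits
of the minor would read that as unit 7, which is consistent with the RK05
symptom quoted above.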

Any fix must still mark the buffer invalid; otherwise the system can get very
confused when a disk is unmounted and a new one mounted soon thereafter.  We
eventually fixed it by using the SysIII approach: put another bit into the
flags field of the buffer header to indicate that the data is no longer
valid when read, and add the appropriate checks to the routines in
bio.c.  The b_dev field then never has to hold an invalid device number.
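
A minimal sketch of the flag-bit idea follows.  It is not the SysIII code;
the flag name B_STALE, the array buftab, and the routines cache_lookup and
invalidate_dev are all hypothetical, and the buffer header is cut down to
the three fields that matter here.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF    4
#define B_STALE 0x01            /* hypothetical flag: contents no longer valid */

struct buf {
    int b_flags;
    int b_dev;                  /* always stays a real device number */
    int b_blkno;
};

static struct buf buftab[NBUF]; /* stand-in for the buffer cache */

/* Cache lookup: a stale buffer never satisfies a request. */
struct buf *cache_lookup(int dev, int blkno)
{
    for (int i = 0; i < NBUF; i++) {
        struct buf *bp = &buftab[i];
        if (bp->b_dev == dev && bp->b_blkno == blkno &&
            !(bp->b_flags & B_STALE))
            return bp;
    }
    return NULL;
}

/* Called at umount time: flag the device's buffers and leave b_dev alone.
 * Returns the number of buffers newly marked stale. */
int invalidate_dev(int dev)
{
    int n = 0;

    assert(dev >= 0);           /* callers pass a real device, never -1 */
    for (int i = 0; i < NBUF; i++) {
        if (buftab[i].b_dev == dev && !(buftab[i].b_flags & B_STALE)) {
            buftab[i].b_flags |= B_STALE;
            n++;
        }
    }
    return n;
}
```

Because b_dev is untouched, a driver that gets to a queued buffer late still
finds a valid device number, while a newly mounted disk on the same device
cannot be served stale blocks from the cache.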

Another approach would be to have update() sleep if another update() is in
progress.  Then, upon wakeup, either just return or continue and do the sync
again.
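
The sleeping variant can be sketched in user space with a mutex and a
condition variable standing in for the kernel's sleep and wakeup.  This is
only an analogue under that assumption, not kernel code; update_blocking,
sync_busy, sync_gen and count_flush are hypothetical names, and the flush
routine is passed in so the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sync_done = PTHREAD_COND_INITIALIZER;
static int           sync_busy;  /* nonzero while a flush is running */
static unsigned long sync_gen;   /* number of flushes completed so far */

static int flush_count;          /* stand-in flush just counts its calls */
void count_flush(void) { flush_count++; }

/* Flush the cache, or wait for the flush already in progress to finish. */
void update_blocking(void (*flush)(void))
{
    assert(flush != NULL);
    pthread_mutex_lock(&sync_lock);
    if (sync_busy) {
        unsigned long gen = sync_gen;
        while (sync_gen == gen)          /* sleep until that flush completes */
            pthread_cond_wait(&sync_done, &sync_lock);
        pthread_mutex_unlock(&sync_lock);
        return;                          /* the "just return" variant */
    }
    sync_busy = 1;
    pthread_mutex_unlock(&sync_lock);

    flush();                             /* write out the dirty buffers */

    pthread_mutex_lock(&sync_lock);
    sync_busy = 0;
    sync_gen++;
    pthread_cond_broadcast(&sync_done);  /* wake every waiting caller */
    pthread_mutex_unlock(&sync_lock);
}
```

Either way the caller does not get control back until a flush has actually
finished, which is the property umount() needs before it touches the buffers.
The "do it again" variant would loop back and run its own flush after waking
instead of returning.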

	-Mike Wescott
	ncrcae!wescott