Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site onfcanim.UUCP
Path: utzoo!watmath!watnot!watcgl!onfcanim!dave
From: dave@onfcanim.UUCP (Dave Martindale)
Newsgroups: net.unix-wizards
Subject: Re: minor bug in dump, can cause system to hang.
Message-ID: <14694@onfcanim.UUCP>
Date: Mon, 30-Sep-85 15:14:23 EDT
Article-I.D.: onfcanim.14694
Posted: Mon Sep 30 15:14:23 1985
Date-Received: Wed, 2-Oct-85 00:27:14 EDT
References: <1763@brl-tgr.ARPA>
Reply-To: dave@onfcanim.UUCP (Dave Martindale)
Organization: ONF, Montreal
Lines: 60

In article <1763@brl-tgr.ARPA> davet@RAND-UNIX.ARPA (Dave Truesdell) writes:
>Index: dump/dumptraverse.c--etc
>
>Description:
>    Dump would hang the system while doing incremental dumps (levels > 0),
>    during pass II.
>
>    When doing dumps from a raw device, dump does not force all reads
>    to be done in multiples of DEV_BSIZE byte chunks.  In most cases
>    drivers seem to handle this correctly, but one obscure case has
>    caused one of our systems (a VAX 11/785 running 4.2BSD) to hang.
>
>
>Repeat-By:
>    Arrange for a raw read (size != n*512) of a directory to fail.
>
>    In our case, a empty directory (containing ".", and "..") occupied
>    a block which was forwarded by the HP driver.  When dump attempted to
>    read the directory entry ( 24 bytes long ), the system hung.
>
>Fix:
>
>    Force bread to do all reads in multiples of DEV_BSIZE byte blocks.
>    However, for efficiency, I have added a seperate version of bread
>    called raw_bread, that is used in pass II.

But why fix dump when the problem is almost certainly in the disk driver?
Looking at the code in hp.c, we see that when a bad block is forwarded,
the replacement block is always read in its entirety, even if you asked
for less data, and thus the Massbus adapter has been set up to correctly
map only the lesser amount of data.
The fix would seem to be easy, just make the following change to the
BSE case of the switch in hpecc():

*** /tmp/hp.c	Mon Sep 30 14:57:02 1985
--- hp.c	Mon Sep 30 15:05:32 1985
***************
*** 1024,1030
  		sn = bn%st->nspc;
  		tn = sn/st->nsect;
  		sn %= st->nsect;
! 		mbp->mba_bcr = -512;
  		rp->hpof &= ~HPOF_SSEI;
  #ifdef HPBDEBUG
  		if (hpbdebug)
--- 1024,1030 -----
  		sn = bn%st->nspc;
  		tn = sn/st->nsect;
  		sn %= st->nsect;
! 		mbp->mba_bcr = -MIN(512, bp->b_bcount-(int)ptob(npf));
  		rp->hpof &= ~HPOF_SSEI;
  #ifdef HPBDEBUG
  		if (hpbdebug)

Now, I don't have a filesystem that has a directory in a bad block
(as far as I know), so I can't test this under the same conditions.
But the old code is clearly wrong.  Why not fix the hp driver and
then try the original version of dump and see if everything works
as it should?

By the way, exactly the same bug appears in the "up" Unibus disk driver too.
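
To make the arithmetic in the patched line concrete, here is a small standalone sketch of the same clamp, outside the kernel.  It is not driver code: replacement_byte_count() and SECTOR_SIZE are names made up for this illustration, and bytes_done stands in for ptob(npf), which I am assuming is the number of bytes of the request already transferred before the bad sector was hit.  The result is negated because the patch loads mba_bcr with a negative byte count.

```c
#include <assert.h>

#define SECTOR_SIZE 512                     /* bytes per sector, as in the patch */
#define MIN(a, b)   ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper, not part of hp.c.
 *
 * b_bcount   - total size of the request in bytes (bp->b_bcount)
 * bytes_done - bytes already transferred before the bad sector
 *              (assumed to correspond to ptob(npf) in the patch)
 *
 * Returns the negative byte count to use for reading the replacement
 * sector: never more than one sector, and never more than what is
 * still left of the original request.
 */
static int
replacement_byte_count(int b_bcount, int bytes_done)
{
	int remaining = b_bcount - bytes_done;

	assert(remaining > 0);              /* something must be left to read */
	return -MIN(SECTOR_SIZE, remaining);
}
```

With the 24-byte directory read from the original report, replacement_byte_count(24, 0) gives -24, where the old code would have used -512 regardless.  For a request with a full sector or more still outstanding, the result is -512, the same as before the change.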