Path: utzoo!attcan!uunet!ginosko!aplcen!haven!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.unix.wizards
Subject: Re: Ever seen nondeterministic a.out execution from some filesystems?
Message-ID: <20057@mimsy.UUCP>
Date: 8 Oct 89 15:20:24 GMT
References: <11827@watcgl.waterloo.edu>
Distribution: comp
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 161

In article <11827@watcgl.waterloo.edu> idallen@watcgl.waterloo.edu writes:
>File system /tmp on our 4.3BSD vax8600's has a block size equal to its
>frag size equal to 8192. ... If I compile ... ten times in a row, half
>the time the resulting a.out won't run.  Copying the a.out to another
>file in /tmp often fixes the problem.  Copying the file to another file
>system and running it from there always fixes the problem. ... If I run
>a faulting a.out under adb, it will fault and when I examine instructions
>near where it faults I see zeroes!

Sounds like munhash() is either not being called properly, or not
doing its job.  There was a small change to realloccg() between 4.3BSD
and 4.3BSD-tahoe, along the following lines:

[old]
    count = roundup(osize, CLBYTES / DEV_BSIZE);
    for (i = 0; i < count; i += CLBYTES / DEV_BSIZE)
        ...
            munhash(..., bn + i);
[new]
    count = roundup(osize, CLBYTES);
    for (i = 0; i < count; i++)
        ...
            munhash(..., bn + i * CLBYTES / DEV_BSIZE);

As far as I can tell, this change has no actual effect (on both Vax
and Tahoe).  Also, with fsize==bsize, realloccg() should not be called
at all since there are no fragments.

The other likely possibility is the buffer size-changing code, for
which a fix was posted from Berkeley.  Here is a version of that fix.

*** /tmp/,RCSt1003260 Sun Oct  8 11:19:12 1989
--- ufs_bio.c Tue Nov  8 00:19:24 1988
***************
*** 4,8 ****
   * specifies the terms and conditions for redistribution.
   *
!  *  @(#)ufs_bio.c 7.1 (Berkeley) 6/5/86
   */

--- 4,8 ----
   * specifies the terms and conditions for redistribution.
   *
!  *  @(#)ufs_bio.c 7.3 (Berkeley) 11/12/87
   */

***************
*** 34,38 ****
          panic("bread: size 0");
      bp = getblk(dev, blkno, size);
!     if (bp->b_flags&B_DONE) {
          trace(TR_BREADHIT, pack(dev, size), blkno);
          return (bp);
--- 34,38 ----
          panic("bread: size 0");
      bp = getblk(dev, blkno, size);
!     if (bp->b_flags&(B_DONE|B_DELWRI)) {
          trace(TR_BREADHIT, pack(dev, size), blkno);
          return (bp);
***************
*** 68,72 ****
      if (!incore(dev, blkno)) {
          bp = getblk(dev, blkno, size);
!         if ((bp->b_flags&B_DONE) == 0) {
              bp->b_flags |= B_READ;
              if (bp->b_bcount > bp->b_bufsize)
--- 68,72 ----
      if (!incore(dev, blkno)) {
          bp = getblk(dev, blkno, size);
!         if ((bp->b_flags&(B_DONE|B_DELWRI)) == 0) {
              bp->b_flags |= B_READ;
              if (bp->b_bcount > bp->b_bufsize)
***************
*** 85,89 ****
      if (rablkno && !incore(dev, rablkno)) {
          rabp = getblk(dev, rablkno, rabsize);
!         if (rabp->b_flags & B_DONE) {
              brelse(rabp);
              trace(TR_BREADHITRA, pack(dev, rabsize), blkno);
--- 85,89 ----
      if (rablkno && !incore(dev, rablkno)) {
          rabp = getblk(dev, rablkno, rabsize);
!         if (rabp->b_flags & (B_DONE|B_DELWRI)) {
              brelse(rabp);
              trace(TR_BREADHITRA, pack(dev, rabsize), blkno);
***************
*** 150,159 ****
      register struct buf *bp;
  {
-     register int flags;

      if ((bp->b_flags&B_DELWRI) == 0)
          u.u_ru.ru_oublock++;        /* noone paid yet */
!     flags = bdevsw[major(bp->b_dev)].d_flags;
!     if(flags & B_TAPE)
          bawrite(bp);
      else {
--- 150,157 ----
      register struct buf *bp;
  {

      if ((bp->b_flags&B_DELWRI) == 0)
          u.u_ru.ru_oublock++;        /* noone paid yet */
!     if (bdevsw[major(bp->b_dev)].d_flags & B_TAPE)
          bawrite(bp);
      else {
***************
*** 261,264 ****
--- 259,269 ----
   * for the oldest non-busy buffer and reassign it.
   *
+  * If we find the buffer, but it is dirty (marked DELWRI) and
+  * its size is changing, we must write it out first.  When the
+  * buffer is shrinking, the write is done by brealloc to avoid
+  * losing the unwritten data.  When the buffer is growing, the
+  * write is done by getblk, so that bread will not read stale
+  * disk data over the modified data in the buffer.
+  *
   * We use splx here because this routine may be called
   * on the interrupt stack during a dump, and we don't
***************
*** 306,309 ****
--- 311,323 ----
          splx(s);
          notavail(bp);
+         if (bp->b_bcount != size) {
+             if (bp->b_bcount < size && (bp->b_flags&B_DELWRI)) {
+                 bp->b_flags &= ~B_ASYNC;
+                 bwrite(bp);
+                 goto loop;
+             }
+             if (brealloc(bp, size) == 0)
+                 goto loop;
+         }
          if (bp->b_bcount != size && brealloc(bp, size) == 0)
              goto loop;
***************
*** 365,369 ****

      /*
!      * First need to make sure that all overlaping previous I/O
       * is dispatched with.
       */
--- 379,383 ----

      /*
!      * First need to make sure that all overlapping previous I/O
       * is dispatched with.
       */
***************
*** 502,505 ****
--- 516,522 ----
  /*
   * Insure that no part of a specified block is in an incore buffer.
+ #ifdef SECSIZE
+  * "size" is given in device blocks (the units of b_blkno).
+ #endif SECSIZE
   */
  blkflush(dev, blkno, size)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu    Path: uunet!mimsy!chris
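
A footnote to the patch above, for anyone who wants to poke at the
logic outside the kernel.  The two tests the fix changes (the size
check in getblk and the cache-hit test in bread) can be modelled in
user space.  This is only a rough sketch under stated assumptions:
struct toybuf, enum action, size_check() and is_cache_hit() are
made-up names, the flag values are arbitrary rather than the kernel's,
and nothing here does any real I/O.

```c
/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define B_DONE   0x1    /* buffer contents are valid */
#define B_DELWRI 0x2    /* buffer is dirty: a delayed write is pending */

/* Hypothetical toy stand-in for struct buf, keeping only what the patch tests. */
struct toybuf {
    int  b_flags;
    long b_bcount;      /* current size of the buffer, in bytes */
};

/* Hypothetical names for what getblk does once it finds a cached buffer. */
enum action {
    USE_AS_IS,          /* size unchanged: nothing to do */
    WRITE_THEN_RETRY,   /* dirty and growing: bwrite(), then "goto loop" */
    REALLOC             /* any other size change: hand off to brealloc() */
};

/*
 * Model of the block the patch adds to getblk.  A dirty buffer that is
 * about to grow is written out first, so that a later bread cannot read
 * stale disk data over the modified data.  The shrinking case is left
 * to brealloc, as the patch's comment describes.
 */
enum action size_check(const struct toybuf *bp, long size)
{
    if (bp->b_bcount == size)
        return USE_AS_IS;
    if (bp->b_bcount < size && (bp->b_flags & B_DELWRI))
        return WRITE_THEN_RETRY;
    return REALLOC;
}

/*
 * Model of the patched test in bread: a buffer marked B_DELWRI counts
 * as a cache hit, just like one marked B_DONE, so it is not re-read.
 */
int is_cache_hit(const struct toybuf *bp)
{
    return (bp->b_flags & (B_DONE | B_DELWRI)) != 0;
}
```

For example, a toybuf with B_DELWRI set and b_bcount of 4096, asked
for 8192 bytes, gives WRITE_THEN_RETRY; the same buffer asked for 2048
bytes gives REALLOC, which matches the comment saying the shrinking
case is handled by brealloc.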