Xref: utzoo comp.sys.att:5487 unix-pc.general:2206
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!rutgers!att!mtunb!jcm
From: jcm@mtunb.ATT.COM (was-John McMillan)
Newsgroups: comp.sys.att,unix-pc.general
Subject: Re: From blocks to files (on a UNIXpc)
Message-ID: <1392@mtunb.ATT.COM>
Date: 9 Feb 89 14:54:38 GMT
References: <462@manta.pha.pa.us>
Reply-To: jcm@mtunb.UUCP (was-John McMillan)
Organization: AT&T ISL Middletown NJ USA
Lines: 76

In article <462@manta.pha.pa.us> brant@manta.pha.pa.us (Brant Cheikes) writes:
>Given a block number, how can I find out (a) if it's part of a file,
>and (b) what file it's part of?


A)  There are so many uses of BLOCK NUMBER (and representations thereof) I
    will simply PRESUME you are referring to:
    	A LOGICAL BLOCK # on an identified FILE-SYSTEM.
    
    For this case:
    	As root, run:
    		/etc/ncheck  -i  ####  -a  /dev/rfp###
    	(per instructions in Section 1M).
    
    The above will give you the mount-point-relative path-names of
    all files which contain the block[s].  (Don't bug me: I know
    that for most of you this is the FULL path name, but NOT for me!)

    (I've also seen at least three other representations of block #s
    based on physical drive offsets (using PHYSICAL BLOCK #s, I presume)).
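
    To make the LOGICAL vs. PHYSICAL distinction concrete, here is a
    minimal sketch of the arithmetic.  It assumes the 1K logical block
    and 512-byte physical block sizes quoted further down in this
    article.  The names, and the partition_start_sector parameter in
    particular, are hypothetical: the real starting offset of a
    file-system on the drive has to come from your own disk layout.

```python
LOGICAL_BLOCK = 1024    # bytes per logical (file-system) block
PHYSICAL_BLOCK = 512    # bytes per physical block (sector)


def logical_to_physical(logical_block, partition_start_sector=0):
    """Return the physical block numbers that back one logical block.

    partition_start_sector is a made-up parameter standing in for
    wherever the file-system begins on the drive.
    """
    per_logical = LOGICAL_BLOCK // PHYSICAL_BLOCK
    first = partition_start_sector + logical_block * per_logical
    return list(range(first, first + per_logical))


def byte_offset(logical_block):
    """Byte offset of a logical block from the start of its file-system."""
    return logical_block * LOGICAL_BLOCK


print(logical_to_physical(100))        # [200, 201]
print(logical_to_physical(100, 1000))  # [1200, 1201]
print(byte_offset(100))                # 102400
```

    A block number quoted relative to the whole drive therefore
    differs from one quoted relative to the file-system by that
    starting offset, which may explain some of the representations.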

B)  The problem MAY be addressable WITHOUT EITHER bad-blocking or
    re-formatting.

    1)	Blocks contain META-information, and data.

    2)	META-stuff includes sector id's and synchronization fields.
	If META-merde is blown, only reformatting will fix it.
	(In a kinder, gentler world, SINGLE-TRACK reformatting --
		with NO loss of other sectors -- would be available.)

    3)  The only sensed errors are data READ errors.  
    	These errors reflect either transient read (noise/vibration)
		problems, or unrecoverable read problems:
	Unrecoverable read problems arise from either transient write
		(signal/vibration) problems or from permanent (surface
		defect) problems.
	In general, the system silently re-tries enough times you
		aren't aware of transient READ errors.
	In my experience, a LARGE percentage of "un-recoverable read
		errors" are of the TRANSIENT write-error type.
	Transient write problems may be corrected by re-writing the
		data block.
		
    4)	Therefore, I generally try to fix a disk by:
	a)  Identifying the file (or just using DD(1) to examine the
		entire disk, and then addressing the specific BLOCK).
	b)  Repeatedly trying to copy the bad file (or individual disk
		block) -- in the hope that the problem is an intermittent
		READ failure whose data may be salvaged.  (This
		usually fails, as the system has re-tried many times
		before you are aware of a problem.  But SOMETIMES!)
	c)  If the data was salvaged, I re-write the file/block and
		re-read several times to check whether the problem is
		repaired.
	d)  If the data was NOT salvaged, I write ZEROES into the
		file/block and re-read several times to check whether
		the block is readable.  The file is then scrapped.
		(If the block was in the INODE area, this produces
		anxiety & depression ;^)  (Hmmmm... I've never thought
		to try it, but I wonder whether, using RAW I/O, I could
		save HALF the bad LOGICAL [1K] block by doing this
		ZEROING on a PHYSICAL [512] block basis?  This could
		reduce INODE loss from 16 to 8 inodes.)
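
    Steps 4b through 4d can be sketched roughly as below.  This is an
    illustration under assumptions, not the procedure as I actually
    type it: the function names are made up, the retry count is
    arbitrary, and it is written against an ordinary file or disk
    image.  On a buffered block device the verifying re-reads may be
    served from the system's buffer cache rather than the disk, so a
    real check would want raw I/O.

```python
import os

LOGICAL_BLOCK = 1024    # logical block size quoted above
PHYSICAL_BLOCK = 512    # physical block size quoted above


def read_block(path, block_no, size=LOGICAL_BLOCK, tries=5):
    """Try to read one block up to `tries` times; return bytes or None."""
    for _ in range(tries):
        try:
            with open(path, "rb", buffering=0) as dev:
                dev.seek(block_no * size)
                data = dev.read(size)
            if len(data) == size:
                return data
        except OSError:
            pass    # read error: try again in case it was transient
    return None


def write_block(path, block_no, data, size=LOGICAL_BLOCK):
    """Overwrite one block in place and ask the system to flush it."""
    with open(path, "r+b", buffering=0) as dev:
        dev.seek(block_no * size)
        dev.write(data)
        os.fsync(dev.fileno())


def repair_block(path, block_no, size=LOGICAL_BLOCK, tries=5):
    """Salvage and re-write the block, or zero it; then re-read to verify.

    Returns (salvaged, readable_afterwards).
    """
    data = read_block(path, block_no, size, tries)      # step 4b
    salvaged = data is not None
    if not salvaged:
        data = bytes(size)      # step 4d: contents are lost, use zeroes
    write_block(path, block_no, data, size)             # step 4c / 4d
    readable = all(read_block(path, block_no, size, 1) == data
                   for _ in range(tries))
    return salvaged, readable
```

    Calling repair_block(path, n, size=PHYSICAL_BLOCK), with n counted
    in 512-byte units, would be the half-block variant wondered about
    in step 4d.  Whether that really spares the other half of the
    logical block is untested here, just as it is above.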
	
C)  Absurdly, I've never run any programs to augment the BAD-BLOCK
	list.  When I've lost sectors permanently, there has only been
	smoke & ashes left!  This, in part, reflects the higher
	reliability of the AT&T-accepted disks -- no joke here!

{ Tedious opinions of disk selection criteria deleted ;-) }

Anyway, FREE-LISTS are NOT the issue, since running "fsck -s" will
rebuild them from scratch.  

jc mcmillan	-- att!mtunb!jcm	-- speaking for himself, if that