Xref: utzoo unix-pc.general:7232 comp.sys.att:11483
Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!bu.edu!shelby!apple!fernwood!portal!cup.portal.com!thad
From: thad@cup.portal.com (Thad P Floryan)
Newsgroups: unix-pc.general,comp.sys.att
Subject: Re: The 3B1 and the Bad Block
Message-ID: <37972@cup.portal.com>
Date: 13 Jan 91 08:36:28 GMT
References: <1991Jan12.014524.300@cjsa.wa.com>
Organization: The Portal System (TM)
Lines: 409

jeff@cjsa.wa.com (Jeffery Small) in <1991Jan12.014524.300@cjsa.wa.com> writes:

    I have been getting a number of the following HDERR reports in
    /usr/adm/unix.log (long lines have been wrapped for readability):

    [A] HDERR ST:51 EF:40 CL:E3 CH:3 SN:7 SC:1 SDH:22 DMACNT:FFFF DCRREG:9A
        MCRREG:8F00 Thu Dec 27 11:11:00 1990
    [B] WD2010 ST=/Sekg/Err/ EF=/CRC/ cy=995. sc=7. hd=2. dr#=0. MCR2:0x0
        Thu Dec 27 11:11:04 1990
    [C] drv:0 part:2 blk:58635 rpts:1 Fri Jan 11 07:19:14 1991

    So my questions are:

    1: Why didn't the block number in the error report (58635) work? What
       (probably obvious) idea am I missing and how should I properly fix
       this problem?

The "blk:58635" is with respect to the THIRD HD partition "part:2" (counting
from 0). You need to convert that to a block number with respect to the
beginning of the HD for use with the bad-block mapper in s4diag.

I have a program "hdhelp" that performs the calculation of the "real" block
number given items [A] and/or [C] above from /usr/adm/unix.log along with the
"-t" partitioning report data from "iv". The program is still in an "ALPHA"
stage because I'm still "playing" with it to handle other error reports and
to report the bad block in several different forms including byte-offset.

And I have a thought "hdhelp" might be eventually adapted to "correct" the
problem (by mapping out bad block(s)) "online" ... but there are potential
nasties with this on a mounted file system and I haven't yet given any
thought concerning strategy(ies) for doing this.
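
For the curious, the partition-to-whole-disk conversion amounts to something
like the following. This is only a minimal sketch of the arithmetic, not code
lifted from hdhelp: the function names are made up for illustration, and it
assumes you have the size of every partition in 1K blocks, in partition
order, with partition 0 starting at the beginning of the disk.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of hdhelp): convert a block number that is
 * relative to partition `part` into a block number relative to the start
 * of the disk, by adding up the sizes of all preceding partitions.
 */
long abs_block(const long *part_size, int nparts, int part, long blk)
{
    long base = 0;
    int i;

    assert(part >= 0 && part < nparts);
    for (i = 0; i < part; i++)
        base += part_size[i];       /* skip over partitions 0..part-1 */
    return base + blk;
}

/* A 1K logical block covers two 512-byte sectors; return the first one. */
long first_sector(long abs_blk)
{
    return abs_blk * 2;
}

/* Byte offset of the block from the start of the disk (1024-byte blocks). */
long byte_offset(long abs_blk)
{
    return abs_blk * 1024L;
}
```

With invented sizes of 80 and 5000 blocks for partitions 0 and 1 (NOT Jeff's
real layout, which isn't in the posting), "part:2 blk:58635" would come out
as block 63715 from the start of the disk, covering sectors 127430 and 127431.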

To get the partition information needed, you can run "iv" su'd online thusly:

    # iv -t /dev/rfp000

or you can request the same information from one of s4diag's menus.

I've included a "shar" of the present version 0.1 "hdhelp" at the end of this
posting since it IS useful in its present form. NO DOCS are yet available but
it should be easy to follow the code and the comments. I've already used it
to calculate the bad blocks to be mapped-out on 5 systems, but the program
will change considerably before the version 1.0 "official" release.

One note: the program should be compiled and run on a system OTHER than the
one with the problem! :-) I believe it'll even compile and run with the C on
my C-64.

    2: Did I just (potentially) hose some disk files by entering the two
       good sectors into the BBT or did the contents of these sectors get
       copied to the alternate track by the diagnostic routine? If the data
       was not copied, is there a way that I could determine the files (if
       any) which were damaged?

Yep, you hosed 'em real good! I seriously doubt the s4diag bad-block mapper
copies or zaps anything, so the information in the original blocks should
still be intact. Given that you already know how to run "bf":

    So, I used Brant Cheikes' great "bf" program to determine that block
    number 58635 was allocated to inode #2583. ncheck then told me that
    this was currently assigned to my 1Mb-sized Cnews "history" file.

you could do the same thing to determine the file(s) to which the blocks you
did map out were assigned. In this case, you have to convert the partition 0
block number (which is what s4diag uses) to a partition 2 block number for bf
to do its thing. This calculation is the INVERSE of what "hdhelp" does. I
don't know if you have to un-bad-block-map the two "good" blocks you canned,
but you can try it both ways.

    3: Can I use the same diagnostic routine to recover the use of these
       two sectors by "Deleting" them from the BBT? If not, what is the
       "Delete" option for?

Yes, you "should" be able to recover the two erroneously mapped-out blocks
using the "Delete" option. One thing that "bothers" me with your posting is
that you didn't indicate whether you mapped out both sectors of a logical
block or whether you just mapped out by single sector. If you simply use
"Delete" to undo what you originally did, you should be OK. But keep in mind
that a 1K logical block on the 3B1 comprises two sectors (physical 512-byte
blocks).

Thad Floryan [ thad@cup.portal.com ]

---- Cut Here and feed the following to sh ----
#!/bin/sh
# This is a shell archive (produced by shar 3.49)
# To extract the files from this archive, save it to a file, remove
# everything above the "!/bin/sh" line above, and type "sh file_name".
#
# made 01/13/1991 08:17 UTC by thad@thadlabs
# Source directory /u/thad/temp
#
# existing files will NOT be overwritten unless -c is specified
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#    291 -rw-r--r-- Makefile
#   6512 -rw-r--r-- hdhelp.c
#
if touch 2>&1 | fgrep 'amc' > /dev/null
 then TOUCH=touch
 else TOUCH=true
fi
# ============= Makefile ==============
if test -f 'Makefile' -a X"$1" != X"-c"; then
	echo 'x - skipping Makefile (File already exists)'
else
echo 'x - extracting Makefile (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'Makefile' &&
X# 3B1 makefile for hdhelp
X#
XCC = cc
XCFLAGS = -O
XLDFLAGS = -s
XLIBS = /lib/crt0s.o /lib/shlib.ifile
XNAME = hdhelp
XOBJS = hdhelp.o
XDEST = /usr/local/bin
X
X$(NAME) : $(OBJS)
X	$(LD) $(LDFLAGS) -o $(NAME) $(OBJS) $(LIBS)
X
Xinstall : $(NAME)
X	mv $(NAME) $(DEST)/.
X
Xclean :
X	rm -f $(OBJS) core *~
SHAR_EOF
$TOUCH -am 0113001591 'Makefile' &&
chmod 0644 Makefile ||
echo 'restore of Makefile failed'
Wc_c="`wc -c < 'Makefile'`"
test 291 -eq "$Wc_c" ||
	echo 'Makefile: original size 291, current size' "$Wc_c"
fi
# ============= hdhelp.c ==============
if test -f 'hdhelp.c' -a X"$1" != X"-c"; then
	echo 'x - skipping hdhelp.c (File already exists)'
else
echo 'x - extracting hdhelp.c (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'hdhelp.c' &&
X/*  hdhelp
X *
X *  This program helps identify the bad block(s) reported in the file
X *  /usr/adm/unix.log and/or on the screen of the UNIXPC/3B1/PC7300.
X *
X *  Usage:
X *
X *      hdhelp [ -# ]
X *
X *  Where:  # = method number if both are not desired
X *
X *  The format of HD errors with kernels up to and including 3.51a is:
X *
X *  HDERR ST:11 EF:40 CL:4241 CH:4201 SN:420C SC:4202 SDH:4223 \
X *    DMACNT:FFFF DCRREG:93 MCRREG:8100 Tue Dec 27 02:23:51 1988
X *
X *  drv:0 part:2 blk:15510 rpts:1 Tue Dec 27 02:23:53 1988
X *
X *  The bad block can be calculated using two methods, each as a check
X *  on the other, depending on the available data.
X *
X *  The first method uses the ...
X */
X
X#include <stdio.h>
X
Xstatic char *version = "@(#) hdhelp 0.1 Thad Floryan 17-Oct-1990";
X
Xmain(argc, argv)
X    int argc;
X    char *argv[];
X{
X    extern int scanf();
X
X    int method1 = 0;
X    int method2 = 0;
X    int choice;
X    int CL;                     /* Cylinder LOW: only lower byte is significant */
X    int CH;                     /* Cylinder HIGH: only lower byte is significant */
X    int SN;                     /* Sector Number: only lower byte is significant */
X    int SDH;                    /* Head Number: only lower nybble is significant */
X    int num_heads;              /* number of HD heads */
X    int part_num;               /* current partition number */
X    int block_num;              /* HD block number */
X    int track;                  /* HD track number */
X    int part_blocks[17];        /* partition data from s4test DIAG */
X    int part_index;             /* subscript for part_blocks[] */
X    int block1 = 0;             /* method 1 results */
X    int sector1 = 0;            /* method 1 results */
X    int block2 = 0;             /* method 2 results */
X    int sector2 = 0;            /* method 2 results */
X    int blocks_per_track = 8;   /* UNIXPC has eight 1024-byte blocks */
X                                /* same as sixteen 512-byte sectors */
X                                /* with one spare per track */
X    int sectors_per_block = 2;  /* UNIXPC with 1K file system (std) */
X
X    if (argc == 1)
X    {
X        method1 = method2 = 1;
X    }
X    else if (argc == 2)
X    {
X        choice = -atoi(argv[1]);
X        if (choice == 1)
X        {
X            method1 = 1;
X        }
X        else if (choice == 2)
X        {
X            method2 = 1;
X        }
X        else
X        {
X            DoUsage(argv[0]);
X        }
X    }
X    else
X    {
X        DoUsage(argv[0]);
X    }
X
X    printf(
X"You will be asked to supply several values from the HD error report found\n");
X    printf(
X"in /usr/adm/unix.log and/or from the s4test DIAG report; enter each value\n");
X    printf(
X"followed by a RETURN. If the data available is only that which appears\n");
X    printf(
X"on your UNIXPC's screen, select method 1. Be SURE to read this program's\n");
X    printf(
X"accompanying documentation! You use this program at your own risk. The\n");
X    printf(
X"program's author believes this program to be correct, but, in ALL cases,\n");
X    printf(
X"you, the user, are responsible for the (mis)use, (mis)interpretation, and\n");
X    printf(
X"(mis)application of this program's calculations. Be forewarned!\n");
X
X    PromptDec("\n\tNumber of HD heads? ", &num_heads);
X
X    if (method1 != 0)
X    {
X        printf("\nMETHOD 1 DATA INPUT:\n\n");
X
X        printf(
X"The values for the next 4 inputs can be found in the /usr/adm/unix.log\n");
X        printf(
X"on the long line which begins \"HDERR ST: ...\"\n\n");
X
X        PromptHex("\tvalue of CL:", &CL);
X        PromptHex("\tvalue of CH:", &CH);
X        PromptHex("\tvalue of SN:", &SN);
X        PromptHex("\tvalue of SDH:", &SDH);
X
X        block1 = (CH & 0xFF) * 256 * num_heads * blocks_per_track
X               + (CL & 0xFF) * num_heads * blocks_per_track
X               + (SDH & 0x0F) * blocks_per_track
X               + ((SN & 0xFF) >> 1);
X
X        sector1 = (CH & 0xFF) * 256 * num_heads * blocks_per_track
X                + (CL & 0xFF) * num_heads * blocks_per_track
X                + (SDH & 0x0F) * blocks_per_track;
X        sector1 *= sectors_per_block;
X        sector1 += (SN & 0xFF);
X    }
X
X    if (method2 != 0)
X    {
X        printf("\nMETHOD 2 DATA INPUT:\n\n");
X
X        printf(
X"The values for the next 2 inputs can be found in the /usr/adm/unix.log\n");
X        printf(
X"on the line which looks like \"drv:0 part:2 blk:25916 rpts:1 ...\"\n");
X        printf(
X"The prompt calculations are assuming %d heads as previously entered.\n\n",
X            num_heads);
X
X        PromptDec("\tpart:", &part_num);
X        PromptDec("\t blk:", &block_num);
X
X        printf(
X"\nThe values for the next %d inputs are from the s4test DIAG disk report\n\n",
X            part_num + 1);
X
X/*
X *  Ask for one more partition than needed just so no-one feels
X *  queasy about not entering everything on the s4test report.
X *  Believe me, this is important user psychology.
X */
X        for (part_index=track=0; part_index <= part_num; part_index++)
X        {
X            printf("\tPartition %d: start Track=%d, ",
X                part_index, track);
X
X            PromptDec("size (in Blocks)=",
X                &part_blocks[part_index]);
X
X            track += (part_blocks[part_index] / blocks_per_track);
X
X            if (part_index < part_num)
X            {
X                block2 += part_blocks[part_index];
X            }
X        }
X        block2 += block_num;
X        sector2 = block2 * sectors_per_block;
X    }
X
X    if (method1 != 0)
X    {
X        printf("\nMETHOD 1 RESULTS:\n\n");
X        printf("For a HD with %d heads and error report per ",
X            num_heads);
X        printf("\"CL:%04X CH:%04X SN:%04X SDH:%04X\"\n\n",
X            CL, CH, SN, SDH);
X        printf("\tThe partition 0 block number is %d\n", block1);
X        printf("\tThe partition 0 sector number is %d\n", sector1);
X    }
X
X    if (method2 != 0)
X    {
X        printf("\nMETHOD 2 RESULTS:\n\n");
X        printf(
X"For a HD error on \"part:%d blk:%d\" and partitioned per:\n",
X            part_num, block_num);
X        for (part_index=track=0; part_index <= part_num; part_index++)
X        {
X            printf(
X"\tPartition %d: start Track=%d, size (in Blocks)=%d\n",
X                part_index, track, part_blocks[part_index]);
X            track += (part_blocks[part_index] / blocks_per_track);
X        }
X        printf("\n\tThe partition 0 block number is %d\n", block2);
X        printf("\tThe partition 0 sector numbers are %d and %d\n",
X            sector2, sector2 + 1);
X    }
X
X    if (method1 != 0 && method2 != 0)
X    {
X        if ((block1 == block2) &&
X            ((sector1 == sector2) || (sector1 == sector2 + 1)))
X        {
X            printf(
X"\nThe two methods concur, so you can proceed per the documentation.\n");
X        }
X        else
X        {
X            printf(
X"\nThe values for the blocks disagree; please check your data input.\n");
X        }
X    }
X}
X
X
XPromptDec(msg, val)
X    char *msg;
X    int *val;
X{
X    extern int strlen();
X
X    char inbuf[81];
X
X    printf(msg);
X    fgets(inbuf, 80, stdin);
X    inbuf[strlen(inbuf) - 1] = '\0';    /* null out newline */
X    sscanf(inbuf, "%d", val);
X}
X
X
XPromptHex(msg, val)
X    char *msg;
X    int *val;
X{
X    extern int strlen();
X
X    char inbuf[81];
X
X    printf(msg);
X    fgets(inbuf, 80, stdin);
X    inbuf[strlen(inbuf) - 1] = '\0';    /* null out newline */
X    sscanf(inbuf, "%x", val);
X}
X
X
XDoUsage(pname)
X    char *pname;
X{
X    printf("usage: %s [ -# ]\n", pname);
X    printf("where: # is either 1 or 2; see program docs\n%s\n", version+5);
X    exit(1);
X}
SHAR_EOF
$TOUCH -am 0113001591 'hdhelp.c' &&
chmod 0644 hdhelp.c ||
echo 'restore of hdhelp.c failed'
Wc_c="`wc -c < 'hdhelp.c'`"
test 6512 -eq "$Wc_c" ||
	echo 'hdhelp.c: original size 6512, current size' "$Wc_c"
fi
exit 0
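
Two pieces of arithmetic from the posting above can be checked by hand, and
the following rough sketch shows both. It is NOT part of the archive: the
struct and function names are invented for illustration, the masks and the
8-blocks-per-track figure are simply copied from hdhelp's method 1, and the
partition sizes used in the note afterwards are made up.

```c
#include <assert.h>

/* Fields of interest from an "HDERR ..." line, after masking. */
struct chs {
    int cyl;        /* (low byte of CH << 8) | low byte of CL */
    int head;       /* low nybble of SDH */
    int sector;     /* low byte of SN: a 512-byte sector within the track */
};

/* Hypothetical helper: pick the cylinder/head/sector out of the registers. */
struct chs decode_hderr(int cl, int ch, int sn, int sdh)
{
    struct chs r;

    r.cyl    = ((ch & 0xFF) << 8) | (cl & 0xFF);
    r.head   = sdh & 0x0F;
    r.sector = sn & 0xFF;
    return r;
}

/* Same arithmetic as hdhelp's method 1: eight 1K blocks per track. */
long chs_to_block(struct chs r, int num_heads)
{
    long track = (long)r.cyl * num_heads + r.head;

    return track * 8 + (r.sector >> 1);
}

/*
 * The INVERSE of hdhelp's method 2: given a block number relative to the
 * start of the disk, return the index of the partition holding it and store
 * the partition-relative block in *rel.  Returns -1 if the block lies
 * beyond the partitions supplied.
 */
int rel_block(const long *part_size, int nparts, long abs_blk, long *rel)
{
    int i;

    assert(abs_blk >= 0);
    for (i = 0; i < nparts; i++) {
        if (abs_blk < part_size[i]) {
            *rel = abs_blk;
            return i;
        }
        abs_blk -= part_size[i];    /* step past this partition */
    }
    return -1;
}
```

Feeding in the values from report [A] (CL:E3 CH:3 SN:7 SDH:22) yields
cylinder 995, head 2, sector 7, which agrees with the "cy=995. sc=7. hd=2."
of report [B]. For rel_block(), invented sizes of 80, 5000 and 60000 blocks
turn whole-disk block 63715 back into block 58635 of partition 2, which is
the form "bf" wants.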