Xref: utzoo comp.sys.att:8722 unix-pc.general:4780
Path: utzoo!utgpu!watserv1!watmath!uunet!cs.utexas.edu!wuarchive!mit-eddie!uw-beaver!sumax!polari!rwing!pat
From: pat@rwing.UUCP (Pat Myrto)
Newsgroups: comp.sys.att,unix-pc.general
Subject: Re: Fixdisk problems?
Summary: fixdisk kernel panic
Message-ID: <1054@rwing.UUCP>
Date: 7 Feb 90 16:40:36 GMT
References: <111@spirit.UUCP>
Distribution: na
Organization: Very Little Organization, Seattle WA
Lines: 95

In article <111@spirit.UUCP>, john@spirit.UUCP (John F. Godfrey) writes:
> ... [ edited to reduce length ] ...
> Early last week I installed Fixdisk 2.0 on spirit ... [with a] 67mb
> ST-4096 and DOS-73... Shortly after installation I received the panic
> message which will follow... [after reboot] ... it paniced again.
>
> Here is the panic message:
> ----------------------------------------------------------------------
> #WD1010 ST=/Sekg/Err/ EF=/Id?/ cy=710. sc=14. hd=7. dr#=0. MCR2:0x0
> #HDERR ST:51 EF:10 CL:C6 CH:2 SN:E SC:2 SDH:27 DMACNT:FFFF DCRREG:9F
> MCCREG:8300
>
> panic: Hard disk timeout
> ----------------------------------------------------------------------

It's hard to say - I have seen that sort of panic before, but only once.
It sounds like the drive wasn't seeking - like the seek mech was jammed,
or something. I had it happen with a ST 251, and rebooting didn't help
till the power was cycled - from the sounds it made, that sort of
"kicked" it loose. It is possible your problems are of a similar nature.
Even though changing back to the old kernel appeared to fix things, it's
still possible that it was a coincidence, and that the operations,
reboot cycles, etc. done when you restored the old kernel were what
restored sanity.

I have also installed the new fixdisk, and it had been running fine for
over a week, till today, when I got a "kernel parity" panic. I didn't
copy the message down, but it mentioned a disk parity error (though
nothing was in unix.log).
I am convinced that occasionally things such as this do happen. If it
happens again, with the same problem, then I will be concerned.
Obviously things are running fine now, as the involved system is the one
I am typing this prose on. Once I had an entry appear in unix.log where
it couldn't read head 0, sector 0, cylinder 0, and bailed out with a
"drive not ready" error - if for real, a very grave symptom. However,
this was months ago, and after rebooting, it hasn't happened since.

I did selectively install the fixdisk, instead of using the provided
Install script (because some stuff in the FIXDISK pkg I don't use
anymore, and because I tend to be leery of Install scripts in general,
especially ones that do such sweeping things as the FIXDISK one must
do).

Following is what *I* would do, if I were in the same situation. I
probably am going into excess detail, but in this case that might be
preferable to assuming too much.

The procedure I used for installing the FIXDISK worked for me, and this
is being written in good faith, but since I have no control over how
this will be read or interpreted, *YOU ARE ON YOUR OWN*. NO CLAIMS ARE
MADE AS TO THIS BEING CORRECT OR BEING FREE OF LOGICAL OR TYPOGRAPHICAL
ERRORS, OR BEING WITHOUT CRITICAL OMISSIONS.

Before writing off the FIXDISK2.0, I would suggest re-trying the FIXDISK
(a different copy of it, if it was a downloaded copy), and installing it
BY HAND, rather than using the Install script. This allows one to
selectively install fixes, and to do it in stages, as I suggest below,
starting with the kernel, which provides most of the major fixes, other
than the uucico (uucico not being relevant if HDB is installed), and the
fix for the occasional corrupted /etc/utmp file.

I suggest you try unarchiving the fixdisk into a work subdir. It's a
cpio archive, and assuming FIXDISK2.0+IN is in the parent subdir, the
command ``cpio -iBcdm <../FIXDISK2.0+IN'', run as root in an empty
subdir, will extract the contents, preserving the original dates, perms,
and ownership of the files. If it's on the floppies, replace the
"../FIXDISK2.0+IN" with "/dev/rfp021".

In the subdir 'kernel', unpack the kernel file (``unpack UNIX3.51m'')
and then copy the new kernel to /UNIX3.51m. Verify the permissions are
at least 754, owner/group root/sys (depending on how things are set up,
you may need to have world read perms on the kernel). Follow with
``mv /unix /unix.old'' (to preserve the old kernel, in case the UNIX3.5?
link isn't there) and then do ``ln /UNIX3.51m /unix''.

Once the above steps are done and checked for correctness, do a normal
shutdown and reboot. If the system comes up OK, and gets past the time
interval where you originally experienced the problems, then I would try
replacing /etc/lddrv/wind.o, /etc/init, /bin/login, /bin/getty, etc.,
MANUALLY, BY HAND, with the files provided in the kernel and utmp
subdirs, preserving the original versions as /bin/login.old,
/etc/lddrv/wind.o.old, etc. You can inspect the Install script for the
proper permissions and owner/group to use on each file (most will be
owner=bin, group=bin). Be sure, after the new init is copied in, to rm
/bin/telinit and then do ``ln /etc/init /bin/telinit'' (some stuff does
look for /bin/telinit, possibly even during the reboot sequence). After
verifying everything is right, again do the shutdown and reboot.

If the panics happen again, I have no suggestions. Perhaps someone can
answer - does 3.51 require a new format on a drive that had previously
been formatted with, say, 3.0 or 3.5?

As I said, your mileage may vary, but good luck - just proceed slowly
and carefully.
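
For reference, the kernel stage described above can be collected into
one short sketch. It only PRINTS the commands so they can be read over
before anything is typed as root; it does not run them. The function
name plan_kernel_swap and the default work directory /tmp/fixdisk are
made up for illustration and are not part of the FIXDISK package, and
the chown/chgrp/chmod line is just one way of getting the 754 root/sys
settings mentioned above.

```shell
#!/bin/sh
# Sketch only: print, but do not execute, the stage-one kernel swap.
# plan_kernel_swap and /tmp/fixdisk are hypothetical names chosen here.
plan_kernel_swap() {
    work=${1:-/tmp/fixdisk}          # assumed empty work subdir
    archive=${2:-../FIXDISK2.0+IN}   # use /dev/rfp021 for the floppies
    cat <<EOF
mkdir $work && cd $work
cpio -iBcdm < $archive
cd kernel && unpack UNIX3.51m
cp UNIX3.51m /UNIX3.51m
chown root /UNIX3.51m; chgrp sys /UNIX3.51m; chmod 754 /UNIX3.51m
mv /unix /unix.old
ln /UNIX3.51m /unix
EOF
}

# Show the plan with the default (assumed) locations.
plan_kernel_swap
```

Calling it as ``plan_kernel_swap /some/workdir /dev/rfp021'' prints the
same steps for the floppy case. Check each printed line against your own
setup, then shut down and reboot only after the link to /unix is
verified.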

--
pat@rwing (Pat Myrto), Seattle, WA
...!uunet!pilchuck!rwing!pat
...!uw-beaver!uw-entropy!dataio!/
WISDOM: "Travelling unarmed is like boating without a life jacket"