Xref: utzoo comp.sys.att:5611 unix-pc.general:2299
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!osu-cis!n8emr!uncle!jbm
From: jbm@uncle.UUCP (John B. Milton)
Newsgroups: comp.sys.att,unix-pc.general
Subject: Re: Summary: Hard disk errors on a 3b1; HwNote13
Keywords: HDERR 3b1 disk errors Seagate HwNote
Message-ID: <485@uncle.UUCP>
Date: 23 Feb 89 04:58:26 GMT
References: <388@ntvax.UUCP> <465@manta.pha.pa.us> <478@uncle.UUCP> <468@manta.pha.pa.us>
Reply-To: jbm@uncle.UUCP (John B. Milton)
Organization: U.N.C.L.E.
Lines: 47

In article <468@manta.pha.pa.us> brant@manta.pha.pa.us (Brant Cheikes) writes:
>Let me begin with a sincere and appreciative public THANK YOU! to John

You're welcome.

...
>>Ahh! Wouldn't you know it! I've got news stomping on my soft blocks!

Excuse my attempt at levity. Yes, there is some concern here. In my case I
have not gotten any more hits on these spots. A good way to check whether a
certain HDERR is hard (always bad), soft (sometimes bad), or transient
(usually not related to the disk at all) is to:

    cp /dev/rfp000 /dev/null

Ignore the "bad copy to /dev/null", and check /usr/adm/unix.log to see if
you have any new messages. Track down the file, and:

    ln file /usr/adm/bad+junk

Just to make sure the bad spot doesn't get loose. Repeat this at different
times of the day. Try to pick times when your machine is under extreme
conditions: 5-6 p.m. for lowest line voltage, 3 a.m. for highest, afternoon
(or whenever) for highest temperature, etc. You might even set up a
temporary cron line to do this. You could also kick a second one off 5
minutes after the first to see if your errors are seek-related. Don't do
too much of this, as it can put a lot of wear on the moving parts of your
head assembly!

REMEMBER! When smgr finds that /usr/adm/unix.log has exceeded 10k in size,
it quietly deletes it!
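
A minimal sketch of the scan-and-check step above, for anyone who wants to
run it from a temporary cron line. The function names count_hderr and
scan_once are made up for illustration, and the sketch assumes a Bourne
shell with function support and that the log lines of interest contain the
string HDERR. Because smgr may delete the log during a scan, an "after"
count lower than "before" only means the log was removed.

```sh
# count_hderr FILE: print how many lines in FILE mention HDERR
# (prints 0 if FILE is missing or unreadable). Hypothetical helper.
count_hderr() {
    n=0
    if [ -f "$1" ]; then
        n=`grep -c HDERR "$1" 2>/dev/null` || true
    fi
    echo "${n:-0}"
}

# scan_once DEVICE LOG: read the whole DEVICE once and report whether
# LOG gained HDERR lines. Returns 0 only if new HDERR lines appeared.
scan_once() {
    dev=$1
    log=$2
    before=`count_hderr "$log"`
    # The "bad copy to /dev/null" complaint is to be ignored, so cp's
    # error output and exit status are discarded here.
    cp "$dev" /dev/null 2>/dev/null || true
    after=`count_hderr "$log"`
    echo "`date`: HDERR lines before=$before after=$after"
    [ "$after" -gt "$before" ]
}
```

Called as "scan_once /dev/rfp000 /usr/adm/unix.log", it prints one
timestamped line per run, so appending the output to a file gives a record
of which times of day produce new hits.
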
Shame on you if you don't have something like the following run out of
cron every night:

    cd /usr/adm
    if [ -f unix.log ]; then
      cat unix.log >>UNIX.log
      rm unix.log
    fi

About once a month, I go through this file and delete all the FDERR lines
from floppy formatting. After you have collected enough HDERR lines, you
can get all the suspect files in one place and flog them to get a feel for
how "hard" a given bad spot is. If you get a continuous stream of transient
(one-hit) spots when scanning the whole disk, it is probably the
electronics, and not the hard disk surface.

John
--
John Bly Milton IV, jbm@uncle.UUCP, n8emr!uncle!jbm@osu-cis.cis.ohio-state.edu
(614) h:294-4823, w:764-2933;  Got any good 74LS503 circuits?