Path: utzoo!attcan!uunet!husc6!cmcl2!rutgers!att!codas!flnexus!pcsi!peora!rtmvax!tarpit!rd
From: rd@tarpit.UUCP (Bob Thrush)
Newsgroups: comp.unix.microport
Subject: Re: How does Microport System V/AT handle bad blocks?
Message-ID: <464@tarpit.UUCP>
Date: 21 Dec 88 01:14:17 GMT
References: <460@tarpit.UUCP> <326@focsys.UUCP>
Reply-To: rd@tarpit.UUCP (Bob Thrush)
Organization: Automation Intelligence,Inc; Orlando,FL
Lines: 66

In article <326@focsys.UUCP> larry@focsys.UUCP (Larry Williamson) writes:
>In article <460@tarpit.UUCP> rd@tarpit.UUCP (Bob Thrush) writes:
>>About 3 months ago, the 2nd drive on this System V/AT 2.3.1 system died.
>> [...] I have been noticing intermittent "HD I/O Errors ..."
>>[...] Exactly what do these messages mean?
>
>This means, you've got trouble. [...]
>We upgraded to 2.4 and errors have disappeared completely. We also replaced
>the disk, I couldn't bring myself to trust it.
>
>I'm not sure why, but it seemed that the disk errors grew at an exponential
>rate. I would therefore suggest that you *very quickly*, get your 2.4 upgrade
>and install it. I would also suggest that you verify your backups, you might
>be surprised by what is on (or not on) those tapes!

Larry, thanks for the advice. I've had 2.4 since it was announced.
However, I have heard (in this newsgroup) so many reports of problems
with 2.4, i.e., keyboard lockups, different curses problems (that I
found workarounds to in 2.3.1), that I have been reluctant to trade in
the devil I (sort of) know for 2.4. Are these problem reports regarding
2.4 not as serious as a casual reader would assume? Has Microport made
any comment regarding the 2.4 problem reports?

The 2nd disk (the one I'm having trouble with) is mostly used as the
news spool directory, so it is definitely seeing very different activity
than it did before the onset of the problems.
Each time the problem shows up, I find that each subsequent fsck finds
more problems, usually associated with duplicates in the free list. I
wind up mkfs'ing the news file system to correct(?) the problem. I am
usually able to restore most of the news spool directory from a backup
tape made when I first notice a problem (I don't back up news routinely).

I have noticed that one cpio archive was hosed part way in. When
restoring, cpio reported something like "the archive is not in cpio
format". I investigated this further on a Tektronix workstation that
was able to read the "cpio -ocv" format and found 2 places where the
cpio header contained (probably) the correct file size but the data
that followed was short by exactly 8192 bytes. I edited the headers
(subtracted 8192 from the size) and was able to successfully restore
from the tape. Fortunately, the two truncated articles were not in
newsgroups that our site regularly reads.

I'm tempted to rebuild my 2.3.1 kernel with the hard disk driver from
2.4 to narrow down the problem. Any comments from the net or Microport
regarding this possibility?

Since I'm leaving ASAP for the Xmas holiday, I won't be responding soon
to this group; however, I will follow up when I return.

BTW, I got a complete rundown of the meaning of the hard disk I/O errors
from Randy Jarrett, who copied a posting <358@uport.UUCP> by Marc de
Groot (then of Microport). When I return from the holidays, I'll repost
that if there is interest. Thanks, Randy (and Marc).

I'm still interested in knowing how Microport System V/AT handles bad
blocks.

>
>Good Luck,
>   Larry
>
>--
>Larry Williamson -- Focus Systems -- Waterloo, Ontario
>  watmath!focsys!larry  (519) 746-4918

--
Bob Thrush                 UUCP: {rtmvax,ucf-cs}!tarpit!rd
Automation Intelligence, 1200 W. Colonial Drive, Orlando, Florida 32804
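
The cpio header repair described above (subtracting 8192 from the
recorded file size) can be sketched roughly as follows. This is only an
illustration, not the procedure actually used on the Tektronix
workstation. It assumes the archive uses the 76-byte portable ASCII
header that "cpio -c" conventionally writes, in which every field is an
octal ASCII number and the 11-digit file size is the last field. The
names make_header and shrink_filesize are hypothetical helpers invented
for this sketch.

```python
# Sketch: patch the file-size field of a portable ASCII ("cpio -c") header.
# Assumed layout (all fields octal ASCII): magic(6) dev(6) ino(6) mode(6)
# uid(6) gid(6) nlink(6) rdev(6) mtime(11) namesize(6) filesize(11) = 76 bytes.

HEADER_LEN = 76
MAGIC = b"070707"
FILESIZE_OFF = 65   # 6 * 8 + 11 + 6
FILESIZE_LEN = 11


def make_header(name_size, file_size):
    """Build a dummy header for demonstration (hypothetical helper).

    Every field except namesize and filesize is simply zero-filled.
    """
    return (MAGIC
            + b"000000" * 7                      # dev .. rdev
            + b"0" * 11                          # mtime
            + format(name_size, "06o").encode("ascii")
            + format(file_size, "011o").encode("ascii"))


def shrink_filesize(header, shortfall=8192):
    """Return a copy of header whose file size is reduced by shortfall."""
    if len(header) != HEADER_LEN or header[:6] != MAGIC:
        raise ValueError("not a portable ASCII cpio header")
    old_size = int(header[FILESIZE_OFF:FILESIZE_OFF + FILESIZE_LEN], 8)
    if old_size < shortfall:
        raise ValueError("recorded size is smaller than the shortfall")
    new_field = format(old_size - shortfall, "011o").encode("ascii")
    # filesize is the final field, so everything before it is kept as-is
    return header[:FILESIZE_OFF] + new_field


if __name__ == "__main__":
    hdr = make_header(name_size=6, file_size=20000)
    fixed = shrink_filesize(hdr)
    print(int(hdr[FILESIZE_OFF:], 8), "->", int(fixed[FILESIZE_OFF:], 8))
```

In practice the patched header would be written back at the same offset
in a copy of the archive, so that the next header begins where cpio
expects it; locating the damaged entries is left out of the sketch.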