Path: utzoo!attcan!uunet!cs.utexas.edu!rutgers!att!mcdchg!ddsw1!karl
From: karl@ddsw1.MCS.COM (Karl Denninger)
Newsgroups: comp.unix.xenix
Subject: Re: 2.3.1 text corruption
Summary: It's not a disk problem, but it probably IS hardware-related.
Message-ID: <3576@ddsw1.MCS.COM>
Date: 7 Jun 89 17:50:21 GMT
References: <26353@lll-winken.LLNL.GOV> <133@unifax.UUCP> <1124@jpusa1.UUCP>
Reply-To: karl@ddsw1.MCS.COM (Karl Denninger)
Organization: Macro Computer Solutions, Inc., Mundelein, IL
Lines: 50

In article <1124@jpusa1.UUCP> stu@jpusa1.chi.il.us (Stu Heiss,6312,6334,) writes:
>In article <133@unifax.UUCP> sl@unifax.UUCP (Stuart Lynne) writes:
>-In article <26353@lll-winken.LLNL.GOV> carlson@lll-winken.LLNL.GOV (Joe Carlson) writes:
>-}2.3.1. Basically it appears that the in-core version of certain heavily
>-}used programs appears to get corrupted every once in a while. I believe that
>-}I have eliminated hardware trouble as the cause of this.
>-
>-I have also seen this problem on another system with a flakey swap area.
>-You might want to check that you don't have any bad blocks in your swap
>-area.
>
>I have also observed this but never considered the possibility of a disk
>problem. I do recall some discussion about bad track remapping not
>working for the swap area. Is this related or does anyone from sco have
>any further info?

I have checked into this, and it's not the problem. If it were, I would
expect to see a disk error message preceding the problems -- that has
never occurred here.

We saw the problem too, but worse. Not only would I get weird crashes
from some programs, but also TRAP IN SYSTEM MODE panics!

Moving around a couple of boards seems to have fixed it. If you have
halfway flaky hardware, watch out -- you'll get all kinds of weird
problems, none of which your POST or diags will catch!

I believe that the tape controller was interfering with the disk
controller -- since moving the tape controller to a slot away from the
drive controllers we haven't seen the problem recur.

Check your hardware -- carefully. I'll keep the net posted if the
gremlins come back to 'ddsw1'. So far we're two days and counting
without a problem under heavy load.

All this started here when I added a second controller and third fixed
disk, and put the controller too close to the tape controller board (an
archive controller... guess it's noisy or something).

The problem that appeared to be SCO not remapping bad sectors in the
swap area turned out to be a SECOND bad sector in the swap area! We
mapped that one out too, and now all is OK in that regard -- no more
fixed disk errors.

Btw: The second controller support works beautifully, and the system
appears to multithread I/O requests with two boards in there (i.e., both
disk access lights are on at the same time!!) Nice job SCO!

--
Karl Denninger (karl@ddsw1.MCS.COM, !ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.    "Quality Solutions at a Fair Price"