Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site noao.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!hao!noao!grandi
From: grandi@noao.UUCP (Steve Grandi)
Newsgroups: net.bugs.4bsd
Subject: Re: mchk 2 --- tbuf error on 750 running 4.2 BSD
Message-ID: <427@carina.noao.UUCP>
Date: Tue, 30-Jul-85 12:56:05 EDT
Article-I.D.: carina.427
Posted: Tue Jul 30 12:56:05 1985
Date-Received: Thu, 1-Aug-85 21:21:40 EDT
References: <83@zeta.UUCP> <654@gatech.CSNET> <2496@sun.uucp>
Distribution: net
Organization: Natl. Optical Astronomy Observatories, Tucson, AZ USA
Lines: 41

> Okay, I will.  I already mailed John, but perhaps this could be rehashed
> one more time.  The problem does lie in the L0003 board, but the solution
> is easy.  VMS has microcode to alleviate these parity problems, and
> using the /boot program which reads microcode off the disk, the problem
> can be easily solved.  Mike Karels wrote up a patch and we have been running

Unfortunately, loading the proper microcode is not the complete solution.
Witness the following console output--

Jul 4 02:10
machine check 2: cp tbuf par fault
	va 802246f4 errpc 8001433b mdr 505 smr 8 rdtimo 0 tbgpar 3 cacherr 1
	buserr 8 mcesr c pc 80014336 psl c00000 mcsr 80318
panic: mchk
trap type 2, code = 0, pc = 80000fa2
panic: Reserved operand
trap type 2, code = 0, pc = 80000fa2
panic: Reserved operand
trap type 2, code = 0, pc = 80000fa2
panic: Reserved operand
trap type 2, code = 0, pc = 80000fa2
panic: Reserved operand
trap type 2, code = 0, pc = 80000fa2
panic: Reserved operand
4.2 BSD UNIX #5: Mon Jun 24 17:12:19 MST 1985
real mem  = 5238784
avail mem = 4198400
using 231 buffers containing 524288 bytes of memory
etc.

Maybe the combination of microcode rev. 98 (which we are already using) and
the rev. 7 L0003 board (which will be installed Someday, Real Soon Now) will
cure the problem and eliminate these irritating crashes.  But I doubt it.

Now the real question: Does anyone know why the system sometimes goes into
the mchk/Reserved operand panic loop shown above instead of trying its normal
recovery?  This happens on about half of our tbuf parity faults.
--
Steve Grandi, National Optical Astronomy Observatories, Tucson, AZ, 602-325-9228
{arizona,decvax,hao,ihnp4,seismo}!noao!grandi  noao!grandi@lbl-csam.ARPA