Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!lll-winken!ames!sgi!rpw3@rigden.wpd.sgi.com
From: rpw3@rigden.wpd.sgi.com (Robert P. Warnock)
Newsgroups: comp.arch
Subject: Re: The Killer Micro From Hell [actually, reliability]
Keywords: Tandem Cyclone
Message-ID: <48540@sgi.sgi.com>
Date: 18 Jan 90 04:49:01 GMT
References: <34030@mips.mips.COM> <4322@nttmhs.ntt.JP> <39807@ames.arc.nasa.gov> <3101@umn-d-ub.D.UMN.EDU> <28674@amdcad.AMD.COM> <7566@pt.cs.cmu.edu> <34469@mips.mips.COM> <40694@ames.arc.nasa.gov>
Sender: rpw3@rigden.wpd.sgi.com
Reply-To: rpw3@rigden.UUCP (Robert P. Warnock)
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 35

In article <40694@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
+---------------
| In article <34469@mips.mips.COM> mash@mips.COM (John Mashey) writes:
| > How about parity on ALU operations?
| ...really fun to be in a central service bureau the day you discover that
| your F.P. operations have been broken for the last three days :-)
+---------------

Real live case that "mere" parity on the ALU ops wouldn't have caught:

Circa 1970 at the Emory University Chemistry Department we had a DEC PDP-10
(KA10) which started giving wrong answers on "a few" programs (actually, only
on one or two specific input data sets on each of one or two programs).

Turned out there was a transistor going leaky on the clear line of the latch
(part of the instruction register) that stored which general register the
result of a floating-point op got written back to. Some input data sets made
enough "noise" in the floating-point hardware that "occasionally" a
floating-point instruction wrote its result to AC0 instead of whichever AC it
was supposed to. AC0 was FORTRAN's subroutine value return reg, so generally
the stomping had no "obvious" disastrous effects -- no wild array refs, no
wild jumps. (And of course the kernel uses no fl-pt.)
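
A toy sketch of why a parity check on the ALU result would pass anyway. This is not the KA10's actual logic: the function names, the `latch_glitch` flag, and the use of an integer add as a stand-in for the floating-point op are all assumptions made for illustration. The only machine details used are the 36-bit word and the 16 accumulators.

```python
# Hypothetical model: the ALU result carries a parity bit, but the latch
# holding the destination-register number can be spuriously cleared.

WORD_MASK = 2**36 - 1  # PDP-10 words are 36 bits wide


def parity(word: int) -> int:
    """Return the parity (1 if an odd number of bits are set) of a 36-bit word."""
    return bin(word & WORD_MASK).count("1") & 1


def execute_add(regs, dest, a, b, latch_glitch=False):
    """Add a and b, write the result to an accumulator, check its parity.

    latch_glitch models the leaky transistor clearing the destination latch,
    which redirects the write to AC0. Returns True if the parity check passes.
    """
    result = (a + b) & WORD_MASK
    p = parity(result)              # parity generated at the ALU output
    if latch_glitch:
        dest = 0                    # latch cleared: the write lands in AC0
    regs[dest] = result
    return parity(regs[dest]) == p  # parity checked on the stored data


regs = [0] * 16                     # sixteen accumulators, AC0..AC15
ok = execute_add(regs, dest=5, a=3, b=4, latch_glitch=True)
```

The data is intact, so the check passes, yet the result sits in AC0 and AC5 is never written. Parity here covers the value, not the choice of where it goes.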

*HARD* to find; trivial to fix. ALU parity adds some confidence, but not
certainty. And no help in this case.


-Rob

-----
Rob Warnock, MS-9U/510		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311