Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!ames!ames.arc.nasa.gov!lamaster
From: lamaster@ames.arc.nasa.gov (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Re: Reliability
Keywords: parity checkers detection
Message-ID: <40767@ames.arc.nasa.gov>
Date: 17 Jan 90 20:31:45 GMT
References: <34030@mips.mips.COM> <4322@nttmhs.ntt.JP> <39807@ames.arc.nasa.gov> <3101@umn-d-ub.D.UMN.EDU> <28674@amdcad.AMD.COM> <7566@pt.cs.cmu.edu> <34469@mips.mips.COM> <7608@pt.cs.cmu.edu> <15679@haddock.ima.isc.com>
Sender: usenet@ames.arc.nasa.gov
Organization: NASA - Ames Research Center
Lines: 22

In article <15679@haddock.ima.isc.com> suitti@anchovy.UUCP (Stephen Uitti) writes:
>In article <7608@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes:

>clock.  If you want reliability, why not periodically run
>diagnostics?  A 'cron' job could run at 3 AM and report any

In fact, it is not unheard of to run ALU diagnostics in "idle".  When you
go beyond that, to memory, channels, disks, etc., you can easily start
hurting the performance of other things going on (by flushing the cache(s),
by affecting other processors and processes, by thrashing the disks, or by
using network bandwidth...).  But it is a good idea to check the ALU when
the system is idle, as long as you can do it without hurting performance.

The systems where I saw it done in the past did not have caches, so a few
stray memory references were not a big deal.  I am not sure if you could
write an effective ALU diagnostic that didn't have the effect of flushing
the cache after a few million instructions...

Does anyone know if any Unix kernels have this capability?

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117
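
For illustration only, here is a rough sketch of the kind of idle-time ALU
check discussed above.  It is not taken from any Unix kernel; the names
alu_selftest and next_operand, and the particular identities checked, are
assumptions made up for this example.  The idea is to keep the working set
down to a few words, so that the check itself causes almost no cache traffic.

```c
#include <stdint.h>

/* One xorshift32 step: produces new operands with register arithmetic only. */
static uint32_t next_operand(uint32_t x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

/*
 * Hypothetical idle-time ALU check (illustrative, not from any real kernel).
 * Cross-checks a few arithmetic and logical identities on pseudo-random
 * operands and returns the number of identities that failed.
 *
 * The operands are volatile so the compiler cannot prove the identities at
 * compile time and delete the checks.  The cost is a handful of references
 * to the same two stack words, which should stay resident in any cache.
 */
unsigned alu_selftest(uint32_t seed, unsigned rounds)
{
    volatile uint32_t a, b;
    unsigned failures = 0;
    uint32_t x = seed ? seed : 1u;      /* xorshift must not start at zero */

    for (unsigned i = 0; i < rounds; i++) {
        x = next_operand(x);
        a = x;
        x = next_operand(x);
        b = x;

        /* add, then subtract, must give back the first operand */
        if ((uint32_t)((uint32_t)(a + b) - b) != a)
            failures++;
        /* xor applied twice must cancel */
        if (((a ^ b) ^ b) != a)
            failures++;
        /* (a AND b) + (a OR b) equals a + b, modulo 2^32 */
        if ((uint32_t)((a & b) + (a | b)) != (uint32_t)(a + b))
            failures++;
        /* shift left by one equals doubling */
        if ((uint32_t)(a << 1) != (uint32_t)(a + a))
            failures++;
        /* multiply by three equals repeated addition */
        if ((uint32_t)(a * 3u) != (uint32_t)(a + a + a))
            failures++;
        /* two's complement negate: ~a + 1 equals 0 - a */
        if ((uint32_t)(~a + 1u) != (uint32_t)(0u - a))
            failures++;
    }
    return failures;
}
```

An idle loop might call something like alu_selftest(seed, 1000) with a
changing seed and log any nonzero result.  A real diagnostic would need to
be far more thorough, and would probably be written in assembler so that
the instructions actually executed are under the author's control rather
than the compiler's.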