Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!sdd.hp.com!hp-pcd!hpfcso!hpfcmgw!glen
From: glen@hpfcmgw.HP.COM (Glen Robinson)
Newsgroups: comp.sys.hp
Subject: Re: 9000/370 Problems...
Message-ID: <1080162@hpfcmgw.HP.COM>
Date: 26 Jul 90 17:49:15 GMT
References: <13484@udenva.cair.du.edu>
Organization: HP Fort Collins, CO
Lines: 43

The H-P field guys keep insisting that it is a power problem because that is the most likely cause of parity panics on a 370.

A parity error can ONLY be detected on a read from memory (that is the only time parity is checked, and the check is done by hardware, not software). Therefore the problem can only have happened during the previous write to the location or at some time after that write.

Problems that occur during writes to a memory location can quite reliably be found with memory diagnostics, including the boot-time memory diagnostic.

Problems occurring after a cell is written are usually caused by one of two things:

  1. A cell changes due to a soft error caused, for instance, by an alpha particle hit.

  2. A cell changes due to a voltage transient on logic ground. In this scenario the specific cells affected by such a transient are those that are 'weakest' at the time. While many cells might be changed, you will only know about the first one that a read attempt is made on (i.e., the one that generates the parity panic).

Note that in the two cases above the location of failure will probably be random.

The design is extremely robust in handling spikes or large transients across AC neutral and AC phase; however, in order to pass VDE, class B, et al., the designers separated AC neutral, Safety Ground and Logic Ground. In normal user AC power situations this is no problem.
However, when the user has problems such as floating grounds, or peripherals on one phase and the computer on another phase (or whatever), a measurement of the rms voltage between AC neutral and Safety Ground will indicate the problem. The Model 370 will NOT tolerate a voltage greater than 1 volt rms between these two lines. Often a power line monitor is required in order to catch transients across these two lines, which sometimes occur as the result of an external event (elevator motor, or ..).

To put all of this into perspective: there are a lot of Model 370s out there (in the tens of thousands). You can count the sites that have experienced recurring parity problems on one hand. In every previous case we have found that curing input power problems solved the parity problems.

The normal comments about this not being an official position of H-P etc. apply.

Glen Robinson
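
The point about parity being checked only on reads can be illustrated with a toy model. What follows is only a sketch in Python, not a description of the actual Model 370 hardware: the class ParityMemory, its disturb() method, the one-parity-bit-per-byte layout and the choice of even parity are all assumptions made for illustration. It shows that a cell corrupted after a good write stays unnoticed until it is read, and that when several cells are disturbed only the first one read gets reported.

```python
class ParityMemory:
    """Toy byte-wide memory with one even-parity bit per cell (assumed layout)."""

    def __init__(self, size):
        self.data = [0] * size
        self.parity = [0] * size

    @staticmethod
    def _parity(value):
        # 1 if the byte has an odd number of set bits, else 0.
        return bin(value & 0xFF).count("1") % 2

    def write(self, addr, value):
        # Parity is generated on a write; nothing is checked here.
        self.data[addr] = value & 0xFF
        self.parity[addr] = self._parity(value)

    def disturb(self, addr, bit):
        # Hypothetical stand-in for an alpha hit or a ground transient:
        # flips one stored data bit and leaves the parity bit alone.
        self.data[addr] ^= 1 << bit

    def read(self, addr):
        # The only place where parity is checked.
        if self._parity(self.data[addr]) != self.parity[addr]:
            raise RuntimeError(f"parity error at address {addr}")
        return self.data[addr]


mem = ParityMemory(8)
for addr in range(8):
    mem.write(addr, 0x5A)

# Two cells change after they were written correctly.
mem.disturb(3, 0)
mem.disturb(6, 4)

# Nothing is reported until a read touches a bad cell, and a sequential
# scan stops at the first bad cell it meets.
first_failure = None
for addr in range(8):
    try:
        mem.read(addr)
    except RuntimeError:
        first_failure = addr
        break
```

In this sketch the scan stops at the first disturbed cell it reads, and the second one goes unreported, which matches the observation that you only learn about the first bad location a read hits.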