Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!seismo!mcvax!ukc!dcl-cs!strath-cs!jim
From: jim@strath-cs.UUCP
Newsgroups: comp.unix.wizards
Subject: Re: SI 9900 hangs
Message-ID: <654@stracs.cs.strath.ac.uk>
Date: Fri, 31-Jul-87 07:34:51 EDT
Article-I.D.: stracs.654
Posted: Fri Jul 31 07:34:51 1987
Date-Received: Wed, 19-Aug-87 07:06:24 EDT
References: <8269@brl-adm.ARPA> <3321@cit-vax.Caltech.Edu>
Reply-To: jim@cs.strath.ac.uk
Organization: Comp. Sci. Dept., Strathclyde Univ., Scotland.
Lines: 33

In article <3321@cit-vax.Caltech.Edu> mangler@cit-vax.Caltech.Edu (System Mangler) writes:
>In article <8269@brl-adm.ARPA>, eichelbe@nadc.arpa (J. Eichelberger) writes:
>> Every so often we get a system hang. All activity on the SI controller
>> stops. Hitting the reset on the toggle switch restarts everything. Most
>> of the time we don't see any error messages. If we do see one, it's
>> hp0: not ready
>
>The SI local office says this is a common problem, typically caused
>by marginal power supply output. I've not had the chance to verify
>this (there isn't enough downtime on this machine for my purposes).

We had this problem quite often just after we installed a 9900. The SI
engineer tweaked the voltage on the 9900 power supply and the hangs
stopped happening. Eighteen months later, we've had no repetitions.

The engineer told me that the 9900 is susceptible to mains glitches if
the DC voltage isn't *exactly* right. The scenario he explained was:

"A mains glitch causes the 5.05 V supply to drop for a moment and come
back again. The momentary loss of power stops the controller (it thinks
the mains power is lost) but is not long enough for the 9900 to
re-initialise itself properly. Flicking the reset switch on the
controller does this and everything picks up from where it left off."

[I suppose this may be too optimistic: the transfer in progress at the
time of the hang could be totally screwed if the controller's buffer
gets mangled by the loss of volts and the controller firmware still
considers the buffer contents valid. We didn't see any file or
filesystem corruption when we reset the controller after a hang, though.]

I've no idea why having the DC supply exactly right cures this, but
then I know next to nothing about hardware or electronics.

		Jim