Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!brutus.cs.uiuc.edu!jarthur!elroy.jpl.nasa.gov!ames!sgi!brendan@illyria.wpd.sgi.com
From: brendan@illyria.wpd.sgi.com (Brendan Eich)
Newsgroups: comp.sys.sgi
Subject: Re: Intermittent Login Problems
Message-ID: <52084@sgi.sgi.com>
Date: 28 Feb 90 08:34:41 GMT
References: <52878@bu.edu.bu.edu> <1990Feb27.171242.7976@hellgate.utah.edu>
Sender: brendan@illyria.wpd.sgi.com
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 57

In article <1990Feb27.171242.7976@hellgate.utah.edu>, brian@cs.utah.edu (Brian Sturgill) writes:
> > [. . .] The behavior seems random. The only unusual message that I could
> > find in SYSLOG was:
> >
> > Feb 21 14:41:22 panda grcond[10521]: In limbo
> > Feb 21 14:42:07 panda grcond[10521]: Tried and failed 3 times to download
> > graphics subsystem
> >
> > I asked our usual service person and the SGI hotline people and nobody had
> > seen this message before.
>
> The main idea I get is that it is odd that SGI does not know about this
> problem. ALL of our 4D/20's, and our 240GTX have this problem.

Do you get the "Tried and failed 3 times to download graphics subsystem"
message on all of your 4D/20's, or only some? On your 240GTX? The reason
I ask is because very different versions of the grcond program are shipped
for different models, according to their graphics hardware, and >only the
240GTX version contains the "Tried and failed" message<. Has someone
inadvertently copied the 240GTX's /etc/gl/grcond to a 4D/20? Or does the
message you quote in fact occur only on your 240GTX?

> Looking at our SYSLOGs shows that this occurs 4.51 times per machine per day.
> Often just before the limbo message we get:
>
> ... grcond[5015]: Child process /etc/gl/pandora exited with status 0

This SYSLOG entry was intended to be informational (LOG_INFO) only, and does
not necessarily indicate a problem.
Logging successful exit status does not seem useful; perhaps this unduly
alarming message should be eliminated.

> I do not know if the exact same mechanism is responsible, but we also
> had the graphics servers crash so frequently (leaving a very large /core) that
> I installed /core as a symlink to /dev/null.

By "graphics servers" do you mean /bin/news_server? Was there any SYSLOG
message from news_server (rather than from grcond) at the time of the
coredump?

> It seems odd that it is not occurring regularly at SGI on their machines.
> (Perhaps they have not upgraded to 3.2 yet?)

We're running 3.2, 3.2.1, 3.2.2, and what will become 3.3 in engineering,
on hundreds of Iris 4Ds. Generally, engineers install and run a release
long before any customers see it. The only troubles I've had with
news_server, grcond, and microcode have been during development, when I
used mismatched versions. I've heard of, but not seen, GT/GTX microcode
problems that occasionally result in SYSLOG messages and graphics crashes.
I've had no such problems with my 4D/20 in more than a year; I've been
running 3.2 for about six months.

> Brian

Brendan Eich
Silicon Graphics, Inc.
brendan@sgi.com