Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!ncrlnk!wright!thor.wright.edu
From: dcourte@thor.wright.edu (Dale Courte,040P Lib. Annex,873-4030,)
Newsgroups: comp.sys.encore
Subject: Re: telnet/rlogin "bouncing"
Message-ID: <1067@thor.wright.EDU>
Date: 12 Feb 90 14:38:01 GMT
References: <763@sirius.ucs.adelaide.edu.au>
Sender: news@wright.EDU
Reply-To: dcourte@thor.wright.edu
Lines: 45

From article <763@sirius.ucs.adelaide.edu.au>, by francis@chook.ua.oz (Francis Vaughan):
> We have had a lot of trouble with processes hanging around, or not
> correctly dying. A lot of the new students (or worse those that had been
> using VMS before) would type control-Z to stop compilations and other
> things (Remember ^Z is EOF on VMS). They would then just hit break on the
> annex line and think they were logged out. A kill command on the annex (or
> a timeout) would send SIGHUP to all the processes. Instead of quietly dying
> we found some (in particular programs written in Pascal) would go nuts.
> They would go into an infinite loop and start to allocate lots of memory.
> Eventually we were forced to write a daemon to kill these off. The best
> explanation we could think of was that the signal handler under UMAX was
> broken. We know it is one part that Encore had rewritten.

I have seen a lot of this also. We had a particular problem with Franz
Lisp. As above, stopped jobs were not killed when the user logged off;
instead they were restarted in some sort of hard loop. My load average
would get up close to 10, and I'd look and find five or six spinning
copies of Lisp.

Through a large amount of experimentation, I found that this did not
happen when using the Korn shell. Since ksh is a nice shell anyway,
including job control and command line recall, I decided we would just
use ksh as our default login shell. We converted over, and the problem
with Franz Lisp disappeared.

However, a similar problem developed with, believe it or not, mail!
Every day when I logged in I had to kill one or two mail processes which
had racked up hundreds of minutes of CPU time in a hard loop. Bizarre.
What could mail be doing?

My conclusion was also that the signal handler was broken, and when I
made a service call to Encore, the person I talked to seemed to agree.
They had been able to duplicate the Franz Lisp problem.

The workaround I have in place now is a ksh ulimit command in
/etc/profile (which is executed when ksh users log in), limiting the CPU
time for a process to 5 minutes. Users who need to run long jobs can
reset this limit. So the mail processes spin for five minutes, then die.
Users who override this limit tend to be more sophisticated Unix people
and don't leave jobs hanging around, so this solution has worked quite
well.

The C shell has similar CPU limiting commands, but no single file
equivalent to /etc/profile that I know of.

-Dale Courte, University Computing Services' Unix Systems Administrator
 email: dcourte (dcourte@eve.wright.edu)
 phone: 873-4030
 office: 040P Lib. Annex
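
For readers who want to try the ulimit workaround described in the post, here is a minimal sketch of what such an /etc/profile fragment might look like. The post does not show the actual file, so the variable name CPU_LIMIT_SECONDS and the use of the -S (soft limit) flag are assumptions; -S is chosen here because the post says users can reset the limit themselves, and in many current shells a limit set without -S also lowers the hard limit, which ordinary users cannot raise again.

```shell
# Hypothetical /etc/profile fragment; the original file is not shown in the post.
# CPU_LIMIT_SECONDS is an illustrative name, not taken from the source.
CPU_LIMIT_SECONDS=300   # 5 minutes of CPU time per process

# Set only the soft limit (assumption) so that users who need to run
# long jobs can raise it again in their own session.
ulimit -S -t "$CPU_LIMIT_SECONDS"
```

A process started from this shell that exceeds the soft CPU limit is sent SIGXCPU, which terminates it by default. A user who needs a longer job could raise the soft limit again from the command line, up to whatever hard limit is in force, before starting it.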