Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!aplcen!haven!umd5!hans
From: hans@umd5.umd.edu (Hans Breitenlohner)
Newsgroups: comp.unix.ultrix
Subject: Re: hanging jobs
Message-ID: <5773@umd5.umd.edu>
Date: 12 Dec 89 21:28:12 GMT
References:
Reply-To: hans@umd5.umd.edu (Hans Breitenlohner)
Distribution: comp
Organization: University of Maryland, College Park
Lines: 140

In article saus@media-lab.media.mit.edu (Mark Sausville) writes:
>
>I haven't seen anyone griping about this but it's getting bad enough
>that I thought I'd ask:
>
>Ultrix 3.1 on VAX 6320
>
>Certain processes seem to hang around doing something long after the
>users who had initiated them have gone home.  Emacs (gnu 18.54) and
>mail (/usr/ucb/mail - ultrix) are two of the more prominent offending
>programs.
>
>One notices these processes by running top(1).  They typically show up
>as having used minutes of cpu time which serves to distinguish them
>from the live programs which typically utilize seconds of cpu time.
>
>Often, these jobs sit there eating lots of CPU doing who knows what.
>My guess is that they are polling hard for input.
>
>When users are queried about these processes they usually say, "Huh,
>what emacs (mail) job?"
>
>Anybody else seeing something like this?
>
>		Mark.
>
>Mark Sausville                       MIT Media Laboratory
>Computer Systems Administrator       Room E15-354
>617-253-0325                         20 Ames Street
>saus@media-lab.media.mit.edu         Cambridge, MA 02139

Yes, I have seen several cases of this.

1.  /bin/sh -- if you close a telnet connection while it hangs on a read,
    it will loop.  Our solution was to have the offending shell script
    use /bin/sh5 instead.

2.  /usr/ucb/mail -- if you close a telnet connection while it is doing
    something other than waiting for a command, it will spin in a loop
    of the following form:

	do {
		...
		getc(ibuf)
		...
	} while (ferror(ibuf) && ibuf == stdin);

    on the mistaken assumption that errors on stdin are transient, and
    that eventually an EOF will be returned.
    This bug is particularly insidious, as it is most likely to show up
    when your system gets very busy.
    Berkeley has fixed the problem, but the /usr/ucb/mail in Ultrix is
    based on 6-year-old Berkeley sources.
    Below are changes (to Ultrix 3.0 sources) which fixed this problem
    for us.  Your mileage may vary.
    If you don't have sources you will have to invoke adb to fix this
    one, which will be a challenge since the executable is stripped.

3.  I have also seen emacs processes, but never pursued that problem.

Hope this helps.
Hans

*** fio.c.old	Wed Oct  4 13:19:48 1989
--- fio.c	Thu Oct  5 18:59:13 1989
***************
*** 175,180
  	return(c+1);
  }
  
  /*
   * Quickly read a line from the specified input into the line
   * buffer; return characters read.
--- 175,181 -----
  	return(c+1);
  }
  
+ #if 0
  /*
   * Quickly read a line from the specified input into the line
   * buffer; return characters read.
***************
*** 206,211
  	*cp = 0;
  	return(cp - linebuf + 1);
  }
  
  /*
   * Read up a line from the specified input into the line
--- 207,213 -----
  	*cp = 0;
  	return(cp - linebuf + 1);
  }
+ #endif
  
  /*
   * Read up a line from the specified input into the line
***************
*** 220,226
  	register char *cp;
  	register int c;
  
- 	do {				/*read while no errs & stdin ==file*/
  		clearerr(ibuf);		/*reset err indication on input*/
  		c = getc(ibuf);
  		for (cp = linebuf; c != '\n' && c != EOF; c = getc(ibuf)) {
--- 222,227 -----
  	register char *cp;
  	register int c;
  
  		clearerr(ibuf);		/*reset err indication on input*/
  		c = getc(ibuf);
  		for (cp = linebuf; c != '\n' && c != EOF; c = getc(ibuf)) {
***************
*** 229,235
  			if (cp - linebuf < LINESIZE-2)
  				*cp++ = c;
  		}
- 	} while (ferror(ibuf) && ibuf == stdin);
  	*cp = 0;			/*terminates line*/
  	if (c == EOF && cp == linebuf)	/*if @ beginning of line & char=EOF*/
  		return(0);		/* then return 0*/
--- 230,235 -----
  			if (cp - linebuf < LINESIZE-2)
  				*cp++ = c;
  		}
  	*cp = 0;			/*terminates line*/
  	if (c == EOF && cp == linebuf)	/*if @ beginning of line & char=EOF*/
  		return(0);		/* then return 0*/
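
For readers without the Ultrix sources, the effect of the change to the
line-reading routine can be sketched as a standalone function.  This is
only an illustration under assumptions, not the actual fio.c code: the
name read_line_once and the LINESIZE value of 256 are made up here, and
the return convention (0 at end of input, otherwise the line length plus
one) is inferred from the fragments visible in the diff.  The point is
that a read error is treated like EOF, so a dead connection ends the
read instead of being retried forever.

```c
#include <assert.h>
#include <stdio.h>

#define LINESIZE 256    /* assumed buffer size, not taken from the Ultrix sources */

/*
 * Hypothetical stand-in for the patched line reader.
 * Reads one line from ibuf into linebuf, which must hold LINESIZE bytes.
 * Returns 0 at end of input, otherwise the stored line length plus one.
 * A read error ends the line exactly like EOF; there is no retry loop.
 */
int read_line_once(FILE *ibuf, char *linebuf)
{
    char *cp = linebuf;
    int c;

    assert(ibuf != NULL && linebuf != NULL);

    clearerr(ibuf);                     /* reset any stale error indication */
    while ((c = getc(ibuf)) != '\n' && c != EOF) {
        if (cp - linebuf < LINESIZE - 2)
            *cp++ = (char)c;            /* characters past the limit are dropped */
    }
    *cp = '\0';                         /* terminate the line */

    if (c == EOF && cp == linebuf)
        return 0;                       /* EOF or error with nothing read */
    return (int)(cp - linebuf) + 1;
}
```

A caller would loop on read_line_once(stdin, buf) until it returns 0,
and could then check ferror(stdin) to tell a dropped connection from a
normal end of input.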