Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!brl-adm!umd5!decuac!felix!chuck
From: mouse@uunet.UU.NET (der Mouse)
Newsgroups: comp.unix.ultrix
Subject: Re: killpg(2) fails (sometimes)
Message-ID: <13290@felix.UUCP>
Date: Wed, 18-Nov-87 12:12:54 EST
Article-I.D.: felix.13290
Posted: Wed Nov 18 12:12:54 1987
Date-Received: Sat, 21-Nov-87 08:29:54 EST
References: <10364@felix.UUCP>
Sender: chuck@felix.UUCP
Reply-To: mouse@uunet.UU.NET (der Mouse)
Lines: 57
Approved: zemon@felix.UUCP
Reply-Path:

In article <10364@felix.UUCP>, gordon@prls.UUCP (Gordon Vickers) writes:
> I am currently running 1.2 though I've had this problem since 1.0.  I
> have a very simple interface to the killpg(2) call but it fails if
> the parent process was started in rc.local.
> [...]
> The killgp(2) call has worked flawlessly everytime except for those
> occations when xxx was started from /etc/rc.local.

This is not peculiar to Ultrix.  I have been checking mtXinu 4.3+NFS
source on this subject; I expect that this code is nearly identical in
all Berkeley derivatives.  The pronouncements below about what is and
isn't done are thus based on the mtXinu 4.3+NFS code.

Where do you get the process group you send the signal to?  Do you
simply assume it is identical to the process ID of the parent process
or do you use getpgrp()?  If the former, bad boy - fix it, 'cause
otherwise it'll bite you eventually (like when you use this technique
on a process started as part of a pipeline).  If the latter, note the
returned process group (it's 0 for rc.local processes, right?).

Note also that /bin/sh, which interprets rc.local, does not grok
process groups.  All children of a sh inherit the sh's process group,
which was given to it by its parent.  In the case of rc.local, this
runs all the way back to init.  Init's process group is never set and
is therefore zero.
Sending a signal to process group 0 with killpg(2) is interpreted as a
request to send to the process group of the sending process.  Thus,
you wind up sending the signal to the process supposedly doing the
sending instead of to the process you wanted to send it to.

So how do you reliably kill the whole tree?

Recommendation one:  At startup, have the parent process check to see
whether it was started from rc.local and, if so, set its process group
to match its process ID.  To check for rc.local, you can check for a
process group ID of zero or for a missing control terminal (try to
open /dev/tty), both, or something else if it occurs to you.

Recommendation two:  Have your killing program know a little about the
internals of the kernel, a la ps, and have it run through the process
tree finding and killing the child processes.

Recommendation three:  Have the child processes periodically check to
see whether the parent process has died (do getppid() at startup and
periodically try sending signal 0), and make them go away if so.

Recommendation four:  Have the parent process trap the signal and have
it kill all the children before dying.

If none of the above are satisfactory, send me mail explaining in
greater detail and I'll see if I can come up with any better ideas.

					der Mouse

				(mouse@mcgill-power:56 you
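
Recommendations one and three might look roughly like the sketch
below.  It assumes a present-day POSIX system, so it uses
setpgid(0, 0) where 4.3BSD would use setpgrp(0, getpid()), and it
detects an rc.local-style start by the missing control terminal
rather than by a zero process group.  All three function names are
hypothetical.

```c
#define _XOPEN_SOURCE 700   /* assumption: POSIX.1-2008 declarations wanted */
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Returns 1 if /dev/tty can be opened (we have a control terminal),
 * 0 otherwise. */
int has_control_terminal(void)
{
    int fd = open("/dev/tty", O_RDONLY);

    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}

/* Recommendation one: at startup, if we look as though we were
 * started without a terminal, become the leader of our own process
 * group.  Returns 0 on success or when nothing needed doing, -1 on
 * error. */
int claim_own_pgrp_if_detached(void)
{
    if (getpgrp() == getpid())
        return 0;               /* already a group leader */
    if (has_control_terminal())
        return 0;               /* leave shell job control alone */
    return setpgid(0, 0);       /* new group, ID equal to our pid */
}

/* Recommendation three: a child saves getppid() at startup and calls
 * this periodically.  Signal 0 delivers nothing; it only checks that
 * the target exists.  EPERM means the process exists but belongs to
 * someone else, so it still counts as alive. */
int parent_is_alive(pid_t saved_ppid)
{
    if (kill(saved_ppid, 0) == 0)
        return 1;
    return errno == EPERM;
}
```

The parent would call claim_own_pgrp_if_detached() once, before
forking any children, so that the children inherit the new group.
Each child would exit as soon as parent_is_alive() returns 0.
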