Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!rice!sun-spots-request
From: mcvax!cs.hw.ac.uk!davidf@uunet.uu.net (David.J.Ferbrache)
Newsgroups: comp.sys.sun
Subject: Reaping zombie processes
Message-ID: <412@odin.cs.hw.ac.uk>
Date: 3 Mar 89 21:21:49 GMT
References: <8902071950.AA18464@helios>
Sender: usenet@rice.edu
Organization: Computer Science, Heriot-Watt U., Scotland
Lines: 42
Approved: Sun-Spots@rice.edu
Original-Date: 18 Feb 89 12:19:23 GMT
X-Sun-Spots-Digest: Volume 7, Issue 179, message 2 of 11

root%helios.UCSC.EDU@ucscc.ucsc.edu (De Clarke (Systems Mgr)):
> 1) Has anyone managed to kill off user processes that go ? Mine
> hang about for anywhere from a day to three days before they finally drop
> off. Since they do drop off in the end (I guess the lbolt that they're
> waiting on finally hits) there must be a way to force them out by making
> the lbolt arrive sooner, right? I asked Sun and they said basically
> "reboot." (Great answer >: -( ) So thought I'd ask the public.

As far as I know (as the moderator said) there is no safe way of
terminating such a process. There is, however, a slightly dodgy method
(it involves manipulation of the kernel proc structures).

Basically the method involves running a program (called reaper on our
systems) which scans the process table for zombie entries (examining the
stat field in the process structure entry for SZOMB states). When a zombie
process is found, the program rewrites the process structure for that
process, modifying the PPID field to contain the process id of the reaper
program. The reaper program can then wait on its newly adopted child
process, which should effect an orderly cleanup of process resources.
This method works in BSD 4.2 releases which do not contain the rewritten
process table code.

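For anyone who wants the scan-and-adopt step spelled out, here is a rough
sketch. It is not the reaper source, only a toy model in user space: the
names mock_proc, MOCK_SZOMB and adopt_zombies are made up for illustration,
and the table is an ordinary array rather than the kernel's proc table. The
real program has to rewrite the kernel's own structures and then call wait
on the adopted children, neither of which is shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the stat and PPID fields mentioned in the
 * post. The real kernel proc structure has many more fields, and these
 * state values are illustrative only. */
enum mock_stat { MOCK_SRUN = 1, MOCK_SZOMB = 2 };

struct mock_proc {
    int pid;              /* process id */
    int ppid;             /* parent process id */
    enum mock_stat stat;  /* process state */
};

/* Scan the table for zombie entries and rewrite each one's ppid so that
 * the reaper becomes its parent. Returns the number of entries adopted.
 * Zombies that already belong to the reaper are left alone. */
int adopt_zombies(struct mock_proc *table, size_t n, int reaper_pid)
{
    int adopted = 0;
    size_t i;

    assert(table != NULL);
    for (i = 0; i < n; i++) {
        if (table[i].stat == MOCK_SZOMB && table[i].ppid != reaper_pid) {
            table[i].ppid = reaper_pid;
            adopted++;
        }
    }
    return adopted;
}
```

In this model a second pass over the same table adopts nothing, since every
zombie already names the reaper as its parent.
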
The old code located processes by a hash of the pid and recorded
child-parent relations only in the ppid field. The new process table code
replaces this with a tree structure of pointers lacing together the child
and parent processes. In such a more complex environment, rewriting the
pointer fields is potentially far more dangerous.

The reaper program (which runs on Sun 3 OS 3.5, Orion HLH and BSD 4.2)
seems to work for all processes which are actually in the zombie state,
but not for processes which are blocked waiting on a close of a file in
the _exit routines.

Anyway, if anybody is remotely interested I will mail them the source.

Dave Ferbrache                    Personal mail to:
Dept of computer science          Internet
Heriot-Watt University            Janet
79 Grassmarket                    UUCP     ..!mcvax!hwcs!davidf
Edinburgh,UK. EH1 2HJ             Tel      (UK) 031-225-6465 ext 553