Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!bionet!csd4.milw.wisc.edu!lll-winken!uunet!crdgw1!sungod!davidsen
From: davidsen@sungod.crd.ge.com (William Davidsen)
Newsgroups: comp.unix.questions
Subject: Re: Summary - How to tell if a process is active
Keywords: process kill ps
Message-ID: <938@crdgw1.crd.ge.com>
Date: 21 Jun 89 18:52:31 GMT
References: <2848@infocenter.UUCP> <763@ctisbv.UUCP>
Sender: news@crdgw1.crd.ge.com
Reply-To: davidsen@crdos1.UUCP (bill davidsen)
Organization: General Electric Corp. R&D, Schenectady, NY
Lines: 51

In article <763@ctisbv.UUCP> pim@ctisbv.UUCP (Pim Zandbergen) writes:

| But as our application is mainly turnkey based, I have seen more
| than once that checking the pid only is not enough. Our customers
| turn on the machine, and go right away into the application.
| At that time a resource is being claimed. Then there is a system crash,
| the system is rebooted, and the application is restarted,
| AND IS RUNNING WITH THE EXACT SAME PID! Hence, when it finds
| the lockfile, it checks for its pid and finds out it exists,
| and fails to claim the resource. The second time the application
| is started it will continue without failure.
|
| So I am looking for some way to put some extra information into
| the lockfile to find out if the machine has been rebooted
| since the resource claim. What is the most obvious and portable
| way to do this?

  If I understand what you're trying to do, you can't solve the problem
in the application.
My first thought was:

	while NOT got_resource
	  if open_file == OKAY
	    read PID from file
	    if PID == my_PID
	      got_resource
	    else
	      signal zero to PID
	      if no_process
	        got_resource
	      else
	        { your favorite wait logic here, or terminate }
	      fi
	    fi
	  else
	    got_resource
	  fi
	wend
	create lockfile
	write my_PID

  In addition to the possible race conditions present with lockfile use
in general, this doesn't catch the case where the system is restarted
and the stale PID is that of a valid process which doesn't have the
resource. In that case you won't detect the problem in the process
trying to get the resource.

  My suggestion is to fix your startup logic to remove the stale
lockfiles in the first place. Then the whole problem goes away. Sorry
I don't have a better idea. My startup has a list of things to "rm -f"
before going multiuser.

	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
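
For readers who want something runnable, the pseudocode in this post can be sketched in Python roughly as follows. This is an illustration under assumptions, not the poster's code: the names pid_is_alive and try_claim are hypothetical, the lockfile is assumed to hold a single decimal PID, an unreadable or non-positive PID is assumed to mean "stale", and os.kill(pid, 0) is assumed to run on a Unix system, where signal 0 only checks that the process exists. The wait-or-terminate branch is left to the caller by returning False.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Probe a PID with signal 0: nothing is delivered, only existence is checked."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def try_claim(lock_path: str) -> bool:
    """One pass of the loop in the post; returns True if we now hold the lock.

    Hypothetical helper. It has the same weaknesses the post describes: it is
    racy, and it cannot tell a live holder from an unrelated process that
    happens to have reused the stale PID after a reboot.
    """
    my_pid = os.getpid()
    try:
        with open(lock_path) as f:
            holder = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        holder = None  # assumption: missing or garbled lockfile means "free"
    if holder is not None and holder <= 0:
        holder = None  # assumption: non-positive PIDs are never valid holders

    if holder is not None and holder != my_pid and pid_is_alive(holder):
        return False  # someone else appears to hold it: wait or terminate

    # Lockfile absent, stale, or already ours: (re)create it with our PID.
    with open(lock_path, "w") as f:
        f.write(f"{my_pid}\n")
    return True
```

A caller would retry or give up when try_claim returns False. Since the check cannot detect a reboot, clearing the lockfiles at startup, as suggested above, is still needed.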