Path: utzoo!attcan!uunet!munnari!otc!malcolm
From: malcolm@otc.oz (Malcolm Purvis )
Newsgroups: comp.unix.wizards
Subject: Re: bug in pclose(3)
Message-ID: <523@otc.oz>
Date: 21 Dec 88 23:59:33 GMT
References: <261@ecijmm.UUCP>
Reply-To: malcolm@otc.UUCP (Malcolm Purvis (vacation student))
Organization: OTC Development Unit, Australia
Lines: 60

In article <261@ecijmm.UUCP> jmm@ecijmm.UUCP (John Macdonald) writes:

	[Stuff about pclose(3) hanging when there is more than one child...]

>Is this behaviour common to all System 5 variations?  To BSD
>derivatives?  SunOS?  AIX?  Your favourite here?

	It's all of them as far as I know.  See below.
>
>Is there even a good general solution?  I can see only one good way
>to handle all of the variations of some routine wanting to wait for
>a specific child and getting the termination info for a different child
>instead (which will eventually be waited for - perhaps by a totally
>different routine).  That would be to provide some new library routines:
>waitfor( child, &status ) and postwait( child, status ).  Waitfor would
>wait for a specified child (and save information internally on any other
>children that terminate in the meantime).  Postwait would allow a routine
>that had done a wait call and gotten the termination for a child that
>it didn't know to pass that info into the mechanism for saving used by
>waitfor.  These routines could be used internally by pclose, system, and
>any other library routines that have waiting for a specific child as a
>part of their semantics, as well as being provided to the user as a new
>pair of library routines for building additional capabilities that include
>forked children as a part of their implementation.
>-- 
>--
>John Macdonald
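
	For concreteness, the waitfor()/postwait() pair proposed above might
be sketched in C as follows.  This is only an illustration under assumed
details: the fixed-size table, the MAX_SAVED limit and the return
conventions are invented here and are not part of John's proposal.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_SAVED 32            /* arbitrary table size for this sketch */

/* Terminations collected on behalf of someone else. */
static struct { pid_t pid; int status; int used; } saved[MAX_SAVED];

/* Hand over a termination that the caller collected but does not own.
 * Returns 0 on success, -1 if the table is full. */
int postwait(pid_t child, int status)
{
    for (int i = 0; i < MAX_SAVED; i++) {
        if (!saved[i].used) {
            saved[i].pid = child;
            saved[i].status = status;
            saved[i].used = 1;
            return 0;
        }
    }
    return -1;
}

/* Wait for one specific child, saving any others that die first.
 * Returns the pid on success, -1 on error. */
pid_t waitfor(pid_t child, int *status)
{
    assert(status != NULL);

    /* Someone may already have collected this child for us. */
    for (int i = 0; i < MAX_SAVED; i++) {
        if (saved[i].used && saved[i].pid == child) {
            *status = saved[i].status;
            saved[i].used = 0;
            return child;
        }
    }

    for (;;) {
        int st;
        pid_t pid = wait(&st);

        if (pid < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal; retry */
            return -1;              /* no children left, or real error */
        }
        if (pid == child) {
            *status = st;
            return child;
        }
        if (postwait(pid, st) < 0)  /* not ours: keep it for its owner */
            return -1;
    }
}
```

	A pclose() or system() built on this would call waitfor() with the
pid it forked, and any code that calls wait(2) directly would pass
unknown pids to postwait().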

	I had to solve this problem recently for a C++ subprocess class and,
as John said, it stems from the inability to wait for a specific child to
die; wait(2) only returns the next one to die.  The solution I came up with
under BSD4.3 was to keep a structure for each child in a linked list and
use a SIGCHLD handler to do the actual waiting.  When the signal arrives,
the handler does the wait(2) and searches the list for the pid of the dead
child; when the entry is found, its return value is saved and it is marked
as dead and taken off the list.  The inside of the waitfor() call would
then look something like:

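		/* hold SIGCHLD off between the test and the pause */
		omask = sigblock(sigmask(SIGCHLD));
		while (!child.dead)
			sigpause(omask); /* wait for this process to die. */
		(void) sigsetmask(omask);

		*status = child.return_value;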

	This works wonderfully in C++ because the child process is
automatically waited for when its child structure goes out of scope, which
stops a child from being left without a data structure associated with it,
and each caller gets the right return value.  I don't know, however, how you
could do this in C.  The only problem with all this is that it falls over if
the child dies before it gets inserted into the list (e.g. the exec() fails
because the program isn't there), so care must be taken over the order of
list insertion and forking.

	I hope this helps.

			    Malcolm Purvis (vacation student)
			    |||| OTC ||

			ACSnet:  malcolm@otc.oz
			  UUCP:  {uunet,mcvax}!otc.oz!malcolm
			 Snail:  GPO Box 7000, Sydney 2001, Australia