Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: Catching termination of child process and system() call
Message-ID: <9882@dog.ee.lbl.gov>
Date: 14 Feb 91 07:21:11 GMT
References: <1991Jan24.023750.19569@tkou02.enet.dec.com> <14965@smoke.brl.mil> <1991Jan25.022950.10683@tkou02.enet.dec.com> <14977@smoke.brl.mil> <1356@geovision.UUCP>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Followup-To: nowhere
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 73
X-Local-Date: Wed, 13 Feb 91 23:21:11 PST

(This really belongs in a Unix newsgroup; however, I expect no further
followups, i.e., I think this will be the decisive answer.)

In various articles (see the references line) Doug Gwyn and Norman Diamond
argue over the type of the argument to wait(2).

In article <1356@geovision.UUCP> pt@geovision.gvc.com writes:
>Sorry to add to this 'did not- did too' level of discussion, but a
>"man 2 wait" on several machines shows [both].

Although I am a known BSDite (`BSD pervert' to some :-) ), I have to
side with Doug here.

The mess came about for historical reasons. In the days of Version 6
Unix, there was only one wait() system call; it took a pointer to int.
V6 begat V7 and PWB; PWB grew (via a long and convoluted path) into
System V while V7 grew into 32V and eventually to 4BSD. (There were
various cross-fertilizations along the way, but by and large the systems
split apart sometime between V6 and V7.)

As Doug has already noted, certain persons who shall remain nameless---
not to protect the guilty, but rather, simply, because I am not certain
who---changed both wait() and wait3() at about the same time as job
control (and wait3() itself) were added to the Berkeley kernel.
(Wait() and wait3() were in fact the same system call, distinguished by,
of all things, the condition codes in the VAX PSL. The whole setup was
a botch. Fortunately, all is now repaired.) Since wait3() could and did
return more information than did wait()%, it seemed convenient to make
a union describing the different return values. While all this went
on, no one changed the kernel: the union was carefully tailored to match
the actual kernel code, which still used `int's.

-----
% Ignore that masked ptrace() behind the curtain
-----

Because the kernel was unchanged, the fields in the union were byte
order dependent. When 4.3BSD was ported to the Tahoe, a big-endian
machine, our industrious kernel hackers added byte-order macros and
made use of them in defining the wait union. This made the same names
work on the two different machines. Unfortunately, the resulting union
definition was still not right: the byte order of any given machine
does not uniquely determine the bit order of that machine. With the
advent of POSIX our industrious kernel hackers finally gave up, sighed,
and replaced the union with accessor macros.

Meanwhile, on all those machines that still use the old Berkeley union,
it `just happens' (for the reasons given above) that `int's also work.
New machines that conform to POSIX standards will use `int's. Therefore,
all new software should use `int's. The new Berkeley will still work
with old software as well (there is some hackery in the accessor macros
to accomplish this).

The answer, then, is that to wait for a process whose id is `pid' you
should use:

    int w, status;

    if (check_other_wait_results(pid, &status))     /* if necessary */
        while ((w = wait(&status)) != pid) {
            if (w == -1) {
                if (errno == EINTR)     /* ugly but sometimes... */
                    continue;           /* ...necessary */
                break;  /* any other error (e.g., ECHILD): stop looping */
            }
            record_other_wait_result(w, status);    /* if necessary */
        }

(If the loop ends because wait() failed, `status' is not meaningful.)
The exit status of the process, if any, is then `status >> 8' and the
signal, if any, that caused the process to die is then `status & 0177'.
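For comparison, here is a minimal sketch of the same wait loop that
decodes the `int' status with the POSIX accessor macros from
<sys/wait.h> rather than with shifts and masks. The names child_result,
decode_status, wait_for_child and run_child_exiting_with are invented
for this illustration and appear nowhere in the discussion above; other
children reaped along the way are simply ignored here, where a real
program would probably want to record them.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result record, made up for this sketch. */
struct child_result {
    int exited;     /* nonzero if the child terminated via exit/_exit */
    int exit_code;  /* meaningful only when exited is nonzero */
    int signaled;   /* nonzero if a signal terminated the child */
    int termsig;    /* meaningful only when signaled is nonzero */
};

/* Decode a plain int status using the POSIX accessor macros. */
struct child_result decode_status(int status)
{
    struct child_result r = {0, 0, 0, 0};

    if (WIFEXITED(status)) {
        r.exited = 1;
        r.exit_code = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        r.signaled = 1;
        r.termsig = WTERMSIG(status);
    }
    return r;
}

/*
 * Wait until the child `pid' has been reaped.
 * Returns 0 with *status filled in, or -1 if wait() fails for a
 * reason other than EINTR (for example ECHILD).
 */
int wait_for_child(pid_t pid, int *status)
{
    pid_t w;

    assert(status != NULL);
    while ((w = wait(status)) != pid) {
        if (w == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: give up */
        }
        /* Some other child was reaped; this sketch discards it. */
    }
    return 0;
}

/* Fork a child that exits at once with `code', then reap and decode it. */
struct child_result run_child_exiting_with(int code)
{
    struct child_result r = {0, 0, 0, 0};
    int status = 0;
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);            /* child: terminate immediately */
    if (pid > 0 && wait_for_child(pid, &status) == 0)
        r = decode_status(status);
    return r;
}
```

Calling run_child_exiting_with(3) should yield a record with `exited'
set and `exit_code' equal to 3. On the traditional status layout the
macros amount to the shifts and masks given in the text, but the macros
do not depend on that layout.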
The process left a core dump (`image' or `traceback data' to non-Unix
folks) if `status & 0200' is nonzero.

This *will work* on systems that currently have the union. It will draw
warnings from lint, but then, lint does not know *every*thing.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov