Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!mcsun!sunic!mosh!rolff
From: rolff@mosh.UUCP (Anders Rolff)
Newsgroups: gnu.bash.bug
Subject: Re: jobs.c and nojobs.c (Enhancement request for SYSV machines)
Message-ID: <128@mosh.UUCP>
Date: 15 Jan 90 20:56:54 GMT
References:
Distribution: gnu
Lines: 80

kayvan@mrspoc.transact.com (Kayvan Sylvan) writes:

>Now, jobs.c implements a lot of nice features that seem to me to not
>rely on JOB_CONTROL per se. Remembering child processes (and giving
>the user access to the background processes) are things that bash can
>do regardless of whether or not those processes can be restarted.

I tried to implement something similar to your suggestions for SysV.2,
but failed due to what I think was a bug in D-NIX (a SysV clone).

The main problem with SysV is that it doesn't support signal queueing.
bash needs to synchronize access to certain tables used by both the
normal code and the SIGCHLD handler, and thus needs to block SIGCHLD
every now and then. That's no big deal in itself, but it adds a certain
amount of unreliability.

Well, what about the bug? This piece of code is taken from wait_for()
in jobs.c, bash-1.03. The SysV stuff is written by me. my_write() is
for debugging purposes only.

#ifdef SYSV
  if (child->running || job != NO_JOB)
    {
      my_write("wait_for: Waiting for children to die\n");
      /*
       * If there are pending SIGCLD signals don't go to sleep.
       * See, they will be delivered when SIGCLD is released and
       * pause() will sleep until another signal arrives (and
       * that could take a while...)
       *
       * Since we're dealing with SysV a race condition may occur.
       * Not much to do about it, keep your fingers crossed!
       */
      if (sigcld_has_arrived)
        release_sigcld();       /* flush_child() is called */
      else
        {
          /* If flush_child is called here, we are dead meat! */
          my_write("wait_for: We're pausing!\n");
          release_sigcld();
          pause();
        }
      my_write("wait_for: Dead and buried!\n");
      block_sigcld();
      goto wait_loop;
    }
#else
  if (child->running ||
      ((job != NO_JOB) && (JOBSTATE (job) == JRUNNING)))
    {
      sigpause (0);
      goto wait_loop;
    }
#endif

If an ordinary (synchronous) child is created, wait_for() calls
sigpause() to wait for the child to die. After a while a SIGCHLD
signal is received and flush_child() is called. This works perfectly
well in the BSD case (as usual...). My SysV code should do the same,
in my opinion.

What happened was that bash wrote "wait_for: We're pausing!" and
paused until I pressed ^C (SIGINT); only then was flush_child() called.
The obvious explanation was that SIGCLD was not being delivered, so I
wrote a little test program that ptrace'd bash and logged every SIGCLD
it received. The test program saw an endless stream of SIGCLDs. Death
and hatred to mankind!

I had a little conversation with DIAB, who are responsible for D-NIX,
but they never came up with a solution.

Disclaimer: I was working on this in November -89; some details may
be incorrect.

--Anders
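
For comparison, here is a minimal sketch of how the window between
release_sigcld() and pause() in the SYSV branch can be closed when
POSIX signals are available. It is not bash's code and it assumes
sigprocmask() and sigsuspend(), which the SysV.2 environment described
above does not provide. run_child_and_wait(), on_sigchld() and
sigchld_seen are hypothetical names made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t sigchld_seen = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    sigchld_seen = 1;
}

/* Hypothetical helper: fork a child that exits with `code`, wait for it
   without a release-then-pause window, and return its exit status
   (or -1 on error). */
int run_child_and_wait(int code)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int status = 0, result = -1;
    pid_t pid;

    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGCHLD before forking: if the child dies early, the signal
       stays pending instead of slipping in before we start waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0) {
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }

    sigchld_seen = 0;
    pid = fork();
    if (pid == 0)
        _exit(code);

    if (pid > 0) {
        /* sigsuspend() installs wait_mask and sleeps in one atomic step,
           so there is no gap like the one between release_sigcld() and
           pause(). On return the previous mask (SIGCHLD blocked) is
           back in place. */
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGCHLD);
        while (!sigchld_seen)
            sigsuspend(&wait_mask);

        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    /* Restore the caller's signal mask and SIGCHLD disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

The point of the sketch is only the ordering: block, fork, then unblock
and sleep in a single call. A call such as run_child_and_wait(7) should
come back with the child's exit status whether the child dies before or
after the parent reaches sigsuspend().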