Xref: utzoo comp.unix.questions:26919 comp.unix.internals:1026
Path: utzoo!attcan!uunet!samsung!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!cbnewsl!kc
From: kc@cbnewsl.att.com (keith.coulson)
Newsgroups: comp.unix.questions,comp.unix.internals
Subject: Re: alarm () going off too soon
Summary: race condition with signals
Message-ID: <1990Nov14.200720.445@cbnewsl.att.com>
Date: 14 Nov 90 20:07:20 GMT
References: <1990Nov14.163443.4991@cscnj>
Followup-To: comp.unix.questions
Distribution: na
Organization: AT&T Bell Laboratories
Lines: 81

In article <1990Nov14.163443.4991@cscnj>, kevin@cscnj (Kevin Walsh) writes:
> I am working with an application which runs on Amdahl's UTS 1.2, which is
> their port of UNIX system V. In this application there are numerous instances
> where a blocking read is initiated on a message queue using "msgrcv" and a
> time-out is implemented using the alarm () and signal () functions. The typical
> scenario is like this:
>
> msg_timeout ()
> {
>     timed_out = 1;
> }
>      .
>      .
>      .
>
> timed_out = 0;
> signal (SIGALRM, msg_timeout);
> alarm (5);
>
> msgrcv (q_id, buffer_addr, q_size, 0, 0);
>
> alarm(0);
> signal (SIGALRM, SIG_IGN);
>
> if (timed_out) {
>     /* handling for time-outs */
> }
>      .
>      .
>      .
>
> Most of the time, everything works as expected; if the timer expires the
> blocking read is interrupted and the alarm clock is cleared, otherwise an
> incoming message is read and the blocking read returns and again the alarm
> clock is cleared. In both cases the code checks if the time-out flag has been
> set by the function handling the SIGALRM signal.
>
> The problem is that sometimes (and not always in the same module), the alarm
> clock appears to expire at the instant it is being set. When this happens,
> the alarm expires before the blocking read is even initiated. The result is
> that the call to "msgrcv" blocks with no time-out -- until a message is received.

Looks like you are context switching after "alarm (5)" and your process does
not get back in until after the alarm expires. This is one of several race
conditions associated with System V signals.

I haven't tried it, but this might help ...

msg_timeout ()
{
    msgflg = IPC_NOWAIT;
}
     .
     .
     .

msgflg = 0;
signal (SIGALRM, msg_timeout);
alarm (5);

rc = msgrcv (q_id, buffer_addr, q_size, 0, msgflg);   /* use msgflg */

/* errno is only meaningful when msgrcv has failed */
if (rc < 0 && (errno == EINTR || errno == ENOMSG)) {
    /* handling for time-outs */
} else {
    alarm(0);
    signal (SIGALRM, SIG_IGN);
}
     .
     .
     .

This will cause msgrcv to use IPC_NOWAIT if the timer expires before msgrcv
is called, so it will not block if there is no message. It won't fix the
problem altogether (the alarm can still go off after msgflg has been read but
before msgrcv actually blocks), but it should reduce its occurrence a lot.

You could also try raising the process priority for the alarm and msgrcv calls.
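
For anyone who wants to try the idea, here is a rough, untested-on-UTS sketch
of the msgflg approach wrapped up as one function. Several things in it are my
own assumptions and not from the posts above: the names recv_with_timeout and
struct demo_msg are made up for illustration, the message layout is
hypothetical, and it uses sigaction() in place of signal() so that the handler
is installed without restart semantics on systems that have both. It has the
same remaining window as the fragment above, so treat it as a way to narrow
the race, not to close it.

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX/XSI declarations */

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <unistd.h>

/* Hypothetical message layout; the original post does not show one. */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/* Flag word handed to msgrcv(); the handler flips it to IPC_NOWAIT. */
static volatile sig_atomic_t msgflg;

static void msg_timeout(int sig)
{
    (void)sig;
    msgflg = IPC_NOWAIT;
}

/*
 * Wait up to `secs` seconds for any message on queue `q_id`.
 * Returns 1 if a message was received, 0 on time-out, -1 on any
 * other error (errno is left describing that error).
 */
int recv_with_timeout(int q_id, struct demo_msg *buf, unsigned secs)
{
    struct sigaction sa, old;
    ssize_t n;
    int saved;

    sa.sa_handler = msg_timeout;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: we want EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    msgflg = 0;
    alarm(secs);

    /* If the alarm already fired, msgflg is IPC_NOWAIT by now. */
    n = msgrcv(q_id, buf, sizeof buf->mtext, 0, msgflg);
    saved = errno;              /* keep errno across the cleanup calls */

    alarm(0);
    sigaction(SIGALRM, &old, NULL);

    if (n >= 0)
        return 1;
    if (saved == EINTR || saved == ENOMSG)
        return 0;               /* interrupted, or nothing queued */
    errno = saved;
    return -1;
}
```

A caller would do something like "if (recv_with_timeout (q_id, &m, 5) == 0)"
and put the time-out handling there. Saving errno before alarm(0) is just
caution, so that the cleanup calls cannot disturb the value being tested.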