Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!mcvax!ukc!strath-cs!nott-cs!ucms!dave
From: dave@ucms.UUCP (Dave Settle)
Newsgroups: comp.unix.questions
Subject: Re: Insufficient Resource Error on msgsnd Call
Keywords: msgsnd, msgrcv, Ingres, UNIX, System V, message queues, Sun, SunOS4.0
Message-ID: <191@ucms.ucms.uucp>
Date: 25 May 89 12:00:44 GMT
References: <1023@dinl.mmc.UUCP>
Reply-To: dave@ucms.UUCP (Dave Settle)
Organization: Universal (CMS) Ltd, Leicester, UK
Lines: 71

In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>We have been developing an application that uses System V message queues
>(perhaps that's the first mistake :-)) for interprocess communication.
>Everything has worked fine until we really wanted to stress test the
>application by sending it hundreds of messages at once. The application
>chugs away nicely until it hangs.
>
>
>First a model of the application. It consists of three processes (call them
>A, B, and C), and two message queues (call them 1 and 2). The processes
>and queues are organized as:
>
> +--------+                +--------+                +--------+
> |        |      +-----+   |        |      +-----+   |        |
> | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
> |        |      +-----+   |        |      +-----+   |        |
> +--------+                +--------+                +--------+
>
>Process A generates 200 messages in bursts of about 50 as fast as it
>can go (CPU bound) and puts them into Queue 1. Process B reads the
>messages from Queue 1, processes them while looking things up in
>an Ingres database (we are using Ingres 5.0). Process B sends even more
>messages to Queue 2 which is read by Process C.
>
>After Process A sends 150 messages (and Process B delivers more messages),
>Process B tries to write to Queue 2 and hangs (using IPC_WAIT on the msgsnd
>call). Queue 2 looks empty because Process C is blocking on it (using
>msgsnd with IPC_WAIT).
>Queue 1 appears full because when I try to write
>to it with a no-wait (using diagnostic software),
>it returns with an errno of 11 (the Sun 3 manual
>indicates this is caused by a fork with process limit exceeded or insufficient
>resources). Trying to write to Queue 2 produces the same error.

The error EAGAIN, to which you refer here, is used in a specific manner by
the 'msgsnd' call: it means 'no more space available to store your message'.

>Any suggestions of what could be happening? Is Ingres using resources
>common to Message Queues? Have I shown a misunderstanding of how message
>queues are to be used?
>

From what you have written, I suspect that the problem is that you have
exceeded the GLOBAL message space buffer size with writes to Q1. Process B
can therefore not write to Q2 (no more message space), so nobody can proceed.

You have confused me by stating both that you cannot write to Q2 and that
Process C is sitting trying to read from it, and also that you can cure the
problem for a bit by READING from it - isn't Process C trying to do just that?

But generally, the system has a serious flaw: if Q1 causes the system message
space to fill up, then Process B is stuck - it can't write messages to Q2
because the system is full, but on the other hand, it can't free any space
until it has disposed of the current message and read the next one from Q1.

I think that you can configure the system (look in /etc/master) so that an
individual queue fills up before the whole system does. I've never done
this, though, so I might be wrong.

You might also consider using 'crash' to examine the state of the kernel
message queue structures: it's very useful for things like this.

Cheers,
Dave

-- 
Dave Settle, Universal (CMS) Ltd, Thames Tower, Burleys Way, Leicester, UK.
dave@ucms.co.uk (someday)        ...!mcvax!ukc!nott-cs!ucms!dave
dave@ucms.uucp (today)           <--- This way to point of view --->
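
P.S. A minimal sketch of the EAGAIN behaviour described above, in case it
helps to see it in isolation. It fills a private queue using msgsnd with
IPC_NOWAIT until the kernel answers EAGAIN, then receives one message and
checks that a send goes through again. The names try_send, fill_then_drain
and demo_msg, and the 512-byte payload, are made up for illustration; they
are not from the original application. On a single queue like this you will
most likely hit the per-queue byte limit rather than the system-wide one;
which limit bites first depends on how the kernel is configured.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define PAYLOAD 512             /* hypothetical message size, illustration only */

struct demo_msg {               /* hypothetical message layout */
    long mtype;                 /* message type, must be > 0 */
    char mtext[PAYLOAD];
};

/* Try to send one message without blocking.
 * Returns 0 on success, 1 if msgsnd failed with EAGAIN (no room),
 * -1 on any other error. */
int try_send(int qid, long type)
{
    struct demo_msg m;

    m.mtype = type;
    memset(m.mtext, 'x', sizeof m.mtext);
    if (msgsnd(qid, &m, sizeof m.mtext, IPC_NOWAIT) == 0)
        return 0;
    return errno == EAGAIN ? 1 : -1;
}

/* Fill a private queue until EAGAIN, then show that removing one
 * message makes room for another send.
 * Returns the number of messages that fit, -1 if the queue could not
 * be created, or -2 if anything else went wrong. */
int fill_then_drain(void)
{
    struct demo_msg m;
    int sent = 0, rc, result;
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (qid < 0)
        return -1;

    while ((rc = try_send(qid, 1)) == 0)
        sent++;

    result = -2;
    if (rc == 1 &&                                   /* stopped on EAGAIN */
        msgrcv(qid, &m, sizeof m.mtext, 0, IPC_NOWAIT) >= 0 &&
        try_send(qid, 1) == 0)                       /* room again */
        result = sent;

    msgctl(qid, IPC_RMID, NULL);                     /* remove the queue */
    return result;
}
```

Calling fill_then_drain() from a small main and printing the result shows
how many messages of that size one queue accepts before EAGAIN appears. A
diagnostic tool that writes with IPC_NOWAIT sees the same errno when there
is no room for the message.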