Path: utzoo!censor!avcocan!can503!story
From: story@can503.UUCP (Robert Story)
Newsgroups: comp.unix.xenix
Subject: Stuck messages in queues with btrieve
Keywords: kernel message bug
Message-ID: <298@can503.UUCP>
Date: 20 Sep 89 11:24:31 GMT
Reply-To: story@avcocan.UUCP (Robert Story)
Organization: Avco Financial Services, London, Ontario, CANADA
Lines: 33

In article <295@can503.UUCP> I wrote of the following :
>The problem seems to arise under heavy load, with 3 to 6 users all running
>our financial application and printing documents. A process will msg to
> btrieve and then set an alarm for 60 seconds and sit on the msgrcv call.
>With a large load one or two of the processes will get the alarm signal.
>Examination of the message queues with ipcs shows messages from/to Btrieve in
>the queues but attempts to read these messages with msgrcv() and message type
>set to zero show an empty queue. A call to msgctl() with IPC_STAT reports
>messages in the queue but the pointers to the first and last messages are 0.
>Subsequent messaging to btrieve carries on as normal.

We had a person from SCO on site for a week and last Saturday found the
problem. IT IS a kernel bug.

If the kernel is copying to/from the user's data area and suffers a page
fault then the kernel will put this process to sleep. In the meantime
another process also using the message queues can steam through and do its
thing. When the original process wakes up it will have had its pointers
realigned and, of course, weird things begin to happen. Sometimes the free
list turned up on queue 1 or queue 0 turned up on the free list. Which
explains why ipcs thought that there were messages when there weren't.

This problem has been fixed in the ATT 3.1 code and the SCO 3.2 code by
using semaphores in the critical areas.

I hope this helps others. It cost our company a lot of money to discover
this one. This bug only surfaced before a major release so things were
pretty tense here. I had a good time, though. It's not every day I get to
assist in debugging kernel code.

If anyone wants more details, please e-mail me.
-- 
[ Robert Story      ..{!utzoo!censor,!uunet!zardoz!avcoint}!avcocan!story ]
[ SnailMail : AFS    201 Queens Avenue  London  Ontario  Canada  N6A 1J1  ]
[        or : AFS    3349 Michelson Drive Irvine California USA 92715-1606 ]
[ Voice     : +1 519 672-4220 xtn 233                                      ]