Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!decwrl!pa.dec.com!decprl!decprl!boyd
From: boyd@prl.dec.com (Boyd Roberts)
Newsgroups: comp.unix.wizards
Subject: Re: Can a process stop with a locked inode?
Message-ID: <1991Jun26.093356.12661@prl.dec.com>
Date: 26 Jun 91 09:33:56 GMT
References: <4200@island.COM> <1991Jun25.232436.1215039@locus.com>
Sender: news@prl.dec.com (USENET News System)
Reply-To: boyd@prl.dec.com (Boyd Roberts)
Organization: Digital Equipment Corporation - Paris Research Laboratory
Lines: 45
Nntp-Posting-Host: prl313.prl.dec.com

In article <1991Jun25.232436.1215039@locus.com>, richard@locus.com (Richard M. Mathews) writes:
>
> It sounds like that is what is happening.  This is possible if you ever
> sleep at pri>PZERO while the inode is locked.

This is nonsense.  When the inode is locked no process, apart from
the one holding the lock, can operate on it.  Sleeping with the inode
locked may make things worse, but the priority is irrelevant.

For completeness, I'll just add that sleeps at priorities > PZERO are
interruptible.

> First check the wchan
> of the "ls" process.  If it points at the incore inode, then you know
> SOMEONE has the inode locked, the "cp" is a good candidate, and you know
> that since it did get stopped it must have been at pri>PZERO; thus this
> is almost definitely the problem.

Well I'd hardly call it a problem.  The `cp' will have the inode locked
while it is doing I/O on it, or stat(2)ing it, or the directory it's in
will be locked during open/creat/unlink.

You should really be asking:

    1. Is this caused by the inode being locked?
    2. If so, does `cp' have the inode locked?
    3. If so, for how long?
    4. Why is it locked for so long?

In practice this isn't a problem.

> If a quick glance at the 2nd arguments to your sleep calls doesn't find
> the bad sleep call, you could use "crash" or equivalent to look at the
> kernel stack of the "cp" process.
> (If a program to find the kernel
> stack is not available for you, you might have to check out page table
> entries or page pointers in the proc structure to find that process's
> kernel stack.)  That should help you find exactly where the process is
> sleeping.

Unless you're really sure about what you're doing, any kernel data
you observe will just confuse you.  Even in a static system (a crash dump)
it is often far from obvious what is going on.

Maybe the `problem' isn't with `cp'.


Boyd Roberts			boyd@prl.dec.com

``When the going gets weird, the weird turn pro...''