Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!wuarchive!rice!rice!sun-spots-request
From: eirik@theory.tn.cornell.edu (Eirik Fuller)
Newsgroups: comp.sys.sun
Subject: tmpfs
Keywords: Miscellaneous
Message-ID: <1990Oct7.231415.4181@rice.edu>
Date: 7 Oct 90 21:30:00 GMT
Sender: sun-spots-request@rice.edu
Organization: Sun-Spots
Lines: 17
Approved: Sun-Spots@rice.edu
Originator: spots@walhalla.rice.edu
X-Sun-Spots-Digest: Volume 9, Issue 339, message 5

Our Sun 4/280 running SunOS 4.1 frequently gets into an unusable state, in
which numerous processes are stuck in disk wait, all of them in tmpfs.
This artificially increases the load average (by about one per wedged
process), and eventually it causes telnetd to close connections on new
logins.
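
For anyone who wants to check whether their own machine is in the same
state, here is a minimal sketch that counts disk-wait processes in saved ps
output.  It assumes a BSD-style listing with a STAT column whose value
begins with D for disk wait, and that the columns before STAT contain no
embedded blanks.  The function name and the sample listing are made up for
illustration; the sample was not captured from our machine.

```python
def count_disk_wait(ps_text):
    """Count processes whose STAT field begins with 'D' (disk wait)."""
    lines = ps_text.strip().splitlines()
    # Locate the STAT column from the header line rather than hard-coding it.
    stat_col = lines[0].split().index("STAT")
    count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > stat_col and fields[stat_col].startswith("D"):
            count += 1
    return count


# Hypothetical sample listing, for illustration only.
SAMPLE = """\
  PID TT STAT  TIME COMMAND
  101 p0 D     0:00 cat
  102 p1 S     0:01 csh
  103 p2 D     0:00 ls
  104 co R     0:00 ps
"""

if __name__ == "__main__":
    print(count_disk_wait(SAMPLE))
```

If the count stays roughly equal to the excess load average over several
samples, that matches the "about one per wedged process" behavior
described above.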

Kernel stack traces on the wedged processes show that all of them are
sleeping at priority 10 in _tmpnode_lock, called by _tmpnode_get, called
by _tdirlookup, usually (but not always) called by _tmp_lookup.  We don't
have SunOS 4.1 source code, so I can't easily explore the problem much
further.

Has anyone else seen this problem?  Any suggestions for workarounds until
we can get a fix would be welcome.  The processes always hang in the same
directory, which usually has mode 700.  The owner of that directory
apparently runs background jobs that use it, so this could be a
concurrency problem of some sort.