Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!boulder!ccncsu!longs.LANCE.ColoState.Edu!steved
From: steved@longs.LANCE.ColoState.Edu (Steve Dempsey)
Newsgroups: comp.unix.ultrix
Subject: Re: nfs daemon blocks system.
Keywords: NFS, blocked/hung system
Message-ID: <1738@ccncsu.ColoState.EDU>
Date: 24 Apr 89 06:51:22 GMT
References: <11582@s.ms.uky.edu> <278@kubix.UUCP> <1659@eric.mpr.ca>
Sender: news@ccncsu.ColoState.EDU
Organization: Colorado State University, Fort Collins, CO  80523
Lines: 44


> In article <1659@eric.mpr.ca> parker@waters.UUCP (Ross Parker) writes:
> >In article <278@kubix.UUCP> mvw@kubix.UUCP (Maarten van Wijk) writes:
> >Yes! We've been having a problem with NFS that appears to be caused by
> >PCs on our network (using Sun's PC-NFS). Every once in a while the
> >nfs daemons on one of our microvaxes (Ultrix 2.2 or 2.3) will just go
> >bananas and eat up most of the CPU. We run 8 nfs daemons, and for the
> >space of about 5 minutes (sometimes less), each will chew up about
> >10 percent of the CPU. This drives the load up to a level where everyone
> >has to sit and wait for this to die down before they can work again.
> ...
> >If anyone has any ideas, I'd certainly like to hear them!! DEC support
> >is clueless so far.
> 
> We have the same sort of problem .. about 5 or 6 times a week one
> of our uVaxIIen will lock up as you describe.  We do not run PC-NFS
> but we do have some Sun's (v4 of SunOS) and a Sequent (v3.?? of Dynix)
> and all these guys share NFS back and forth.  We are at v3 of Ultrix.
> 

All this talk of stuck NFS servers sounds very familiar.  We have
quite a variety of hardware and software: Vax780's, '730, uVaxII,
'3600's, 3200's, SUN3/50's, and many VS2000's; most run Ultrix 2.2,
and the '780's run 4.3BSD+XINU.  Ethernet is DELQA on the newer
machines, along with Proteon P1100's (proNET-10 ring).  Every machine
mounts at least one remote file system, and some make 2 or 3 gateway
hops to get there.  Machines on the same physical net do just fine.
Different gateways seem to cause different problems: some give lots
of timeouts that recover reasonably quickly (a few seconds), some
give LONG timeouts, and some just hang forever.  Usually the client
hangs, but sometimes the server all but locks up as described above.

So what causes all this?  Beats me, but our solution is to fix the
NFS read and write sizes at something smaller than a packet.  That's
the options rsize=xxxx,wsize=xxxx in /etc/fstab.  We chose 1024
because both proNET and Ethernet TCP/IP packets are a few hundred
bytes larger than 1K.  All our NFS problems seem to have disappeared.
Of course, this solution was discovered completely by trial and error
(and error and .... :-)
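
For example, an NFS entry in /etc/fstab with these options might look
something like the line below.  The server name and paths are made up,
and the colon-separated field layout is from memory, so check fstab(5)
on your release before copying it:

    /usr/users@server:/usr/users:rw:0:0:nfs:rsize=1024,wsize=1024:

With 1024 bytes of data per request, each read or write plus its
RPC/UDP/IP headers should fit in a single Ethernet frame (1500-byte
MTU), so nothing has to be fragmented and reassembled on the way
through a gateway.  That is my guess at why it helps, not something
we have verified.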

        Steve Dempsey,  Center for Computer Assisted Engineering
  Colorado State University, Fort Collins, CO  80523    +1 303 491 0630
INET: steved@longs.LANCE.ColoState.Edu, dempsey@handel.CS.ColoState.Edu
UUCP: boulder!ccncsu!longs.LANCE.ColoState.Edu!steved, ...!ncar!handel!dempsey