Xref: utzoo comp.unix.ultrix:7400 comp.protocols.nfs:2372
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!wuarchive!uunet!mcsun!ukc!strath-cs!baird!jim
From: jim@cs.strath.ac.uk (Jim Reid)
Newsgroups: comp.unix.ultrix,comp.protocols.nfs
Subject: Re: nfsd 4, why, and how to tune...
Message-ID: <JIM.91May28110939@baird.cs.strath.ac.uk>
Date: 28 May 91 10:09:39 GMT
References: <119@janis.UUCP> <RUSTY.91May24113908@groan.Berkeley.EDU>
	<21936@cbmvax.commodore.com>
Sender: jim@cs.strath.ac.uk
Organization: Computer Science Dept., Strathclyde Univ., Glasgow, Scotland.
Lines: 37
In-reply-to: grr@cbmvax.commodore.com's message of 26 May 91 23:20:38 GMT

In article <21936@cbmvax.commodore.com> grr@cbmvax.commodore.com (George Robbins) writes:

   I'm really curious whether the Ultrix behavior is a result of bugs or
   simply the way that all NFS servers act.  The worst case seems to be "find"
   which reads "directories" rather than "files", which I believe are different
   classes of operation under NFS.  It may be that "stateless" behavior that
   NFS implements turns sequentially "reading" a directory into some highly
   cpu intensive search and search again algorithm.

   [ for c.p.nfs types: a client doing a "find" against an Ultrix NFS exported
     filesystem brings the server to its knees, with the NFS daemons sharing
     ~100% of the CPU time amongst themselves...  Ouch.  This happens often
     enough to be a recognizable syndrome and prompts a witch hunt to find
     which client is up to mischief ]

Any recursive directory traverse via NFS can be painful (du is just as
bad as find). This is because the client makes LOTS of NFS requests -
several read directory entries to get the file names and the file
handles, followed by a get file attributes request for each file. If the
client is faster at sending these out than the server is at replying,
this is bad news. The server will be bombarded with NFS requests which
it can't service quickly enough. The requests time out, so the client
sends them all over again, saturating the server once more and closing
the loop. Another nasty is that the client and server file attribute
caches will get flushed and filled with entries from the traverse.
This can mean that heavily used cache entries have been removed to
make way for those at the tail of the directory traverse.
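
To get a rough feel for how many requests a traverse generates, here is
a minimal sketch in Python. It follows the description above: a few
read directory requests per directory, plus one get attributes request
per file. The function name and the figure of 50 entries per read
directory reply are illustrative assumptions, not measured values from
any particular NFS implementation.

```python
import math

def estimate_requests(dir_sizes, entries_per_readdir=50):
    """Estimate NFS requests for a recursive traverse.

    dir_sizes: number of entries in each directory visited.
    entries_per_readdir: ASSUMED entries returned per read directory
    reply; the real figure depends on the transfer size and name lengths.
    Returns (readdir_requests, getattr_requests).
    """
    # Even an empty directory costs at least one read directory request.
    readdirs = sum(max(1, math.ceil(n / entries_per_readdir))
                   for n in dir_sizes)
    # One get attributes request per entry found.
    getattrs = sum(dir_sizes)
    return readdirs, getattrs

if __name__ == "__main__":
    # Hypothetical tree: 200 directories of 100 entries each.
    rd, ga = estimate_requests([100] * 200)
    print("read directory requests:", rd)
    print("get attributes requests:", ga)
```

The per-file attribute requests dominate, which is why the traffic
grows with the number of files rather than the number of directories.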

Increasing the number of nfsds on the server may help in this
situation, but I doubt it. [It's already working the disk as hard as
it can, so another nfsd process to enqueue requests to the server's
disk driver isn't going to help much.]  A better solution will be to
experiment with increased values for the timeout and retransmission
NFS mount parameters (timeo and retrans) ON THE CLIENTS. This will
make them behave less aggressively when the server is having a hard
time.
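
The effect of a longer client timeout can be sketched as below. This is
a toy model, not the behaviour of any specific client: it assumes the
timeout is given in tenths of a second (as the timeo mount option
usually is), that the client doubles its wait after each
retransmission, and that every copy of a request reaches the server.
The function name and the sample numbers are made up for illustration.

```python
def copies_sent(server_delay_s, timeo_tenths, retrans, backoff=2.0):
    """Count how many copies of one request the client sends before
    the server's reply arrives.

    server_delay_s: how long the overloaded server takes to reply.
    timeo_tenths: initial timeout in tenths of a second.
    retrans: maximum number of retransmissions.
    backoff: ASSUMED multiplier applied to the wait after each timeout.
    """
    sent = 1                      # the original request
    elapsed = 0.0
    wait = timeo_tenths / 10.0
    # Retransmit each time the wait expires before the reply is due.
    while sent <= retrans and elapsed + wait < server_delay_s:
        elapsed += wait
        sent += 1
        wait *= backoff
    return sent

if __name__ == "__main__":
    # Hypothetical server taking 3 seconds to answer each request.
    for timeo in (7, 35):
        n = copies_sent(3.0, timeo, retrans=4)
        print("timeo=%d: %d copies per request" % (timeo, n))
```

In this model a short timeout makes the client send each request
several times while the server is still working on the first copy,
which multiplies the load. A timeout longer than the server's reply
time sends it once.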

		Jim