Path: utzoo!utgpu!watserv1!watmath!att!rutgers!mit-eddie!bbn.com!usc!wuarchive!texbell!texsun!newstop!sun!terra.Eng.Sun.COM!brent
From: brent@terra.Eng.Sun.COM (Brent Callaghan)
Newsgroups: comp.protocols.nfs
Subject: Re: Is RPC in SVR4 implemented in kernel or user level?
Keywords: SVR4
Message-ID: <138870@sun.Eng.Sun.COM>
Date: 13 Jul 90 17:05:24 GMT
References: <1990Jul10.032917.13692@cbnews.att.com> <103795@convex.convex.com> <2503@sequent.cs.qmw.ac.uk>
Sender: news@sun.Eng.Sun.COM
Lines: 57

In article <2503@sequent.cs.qmw.ac.uk>, liam@cs.qmw.ac.uk (William Roberts) writes:
> In <103795@convex.convex.com> thurlow@convex.com (Robert Thurlow) writes:
> 
> >The kernel has to do RPC over UDP/IP for NFS accesses, so some files
> >are needed there.  More files are needed to do RPC for general user
> >level applications, and they live in the libraries.  The common files
> >in the Sun NFSSRC reference port are identical; our revision control
> >system has links in some underlying directories to ensure that the
> >changes made are made to both kernel and user level code.
> 
> Interesting. Back in NFS 3.0 days there was a significant
> difference in that kernel RPC is assumed to be for NFS purposes
> and so assumes idempotence, whereas that user-level RPC stuff
> doesn't make this assumption.
> 
> The actual difference this makes is in the handling of the xid
> when a request times out. In the kernel case (idempotent) the
> xid for the retransmission is the same as the original request
> and so a delayed reply to the original message would be
> acceptable. In the user level case (non idempotent) the xid is
> incremented so a delayed reply to the original request will
> actually be rejected.
> 
> The non-idempotence assumption is why the dreadful slowness of
> the mountd daemon actually causes severe problems (see annual
> discussions on this group). I have been saying for years that
> SunRPC should allow you to specify which behaviour you want,
> but nobody ever seems to take any notice.

This isn't right.  User-level Sun RPC has always had two levels
of timeout.  In clnt_create() you specify a retry timeout
to be used within the RPC code.  Retries based on this timeout
keep the same xid.  Prior to SunOS 4.1 the retry interval was
constant, i.e. if you specified 2 sec then the client would retry
every 2 sec until a response was received.  In SunOS 4.1 this was
changed to an exponential backoff, i.e. if you specify 2 sec
then the retry intervals will be 2, 4, 8, ... with a limit of 30 sec.
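
The two retry schedules can be sketched roughly as follows.  This
is only an illustration, not the library source: the function names
are made up, and it assumes the 30 sec limit simply clamps each
interval.

```c
#include <assert.h>

/* Hypothetical helper, not part of the Sun RPC library.
 * Pre-4.1 behaviour as described above: every retry waits `base` seconds. */
static long constant_interval(long base, int n)
{
    (void)n;            /* the retry number does not matter */
    return base;
}

/* Hypothetical helper, not part of the Sun RPC library.
 * 4.1 behaviour as described above: the n-th wait (n = 0 is the first)
 * doubles from `base`, assumed here to be clamped at `cap` seconds. */
static long backoff_interval(long base, long cap, int n)
{
    long t = base;
    int i;

    assert(base > 0 && cap >= base);
    for (i = 0; i < n && t < cap; i++)
        t *= 2;
    return t > cap ? cap : t;
}
```

With base = 2 and cap = 30, backoff_interval() gives 2, 4, 8, 16
and then stays at 30.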

For each clnt_call() you can specify a total timeout.  Within this
total timeout the lower level RPC code will retry at intervals
based on the retry timeout.  Since a new xid is allocated for
each clnt_call(), the server will treat each clnt_call() as a
new request, i.e. you cannot make use of an xid-based duplicate
request cache on the server if you're retrying with repeated
clnt_call()s.
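
A toy model of the xid behaviour, for illustration only.  The
struct and sim_call() are invented for this sketch and are not the
real client handle or clnt_call(); it assumes no reply ever arrives
and uses the constant (pre-4.1) retry interval.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a client handle. */
struct sim_client {
    unsigned long next_xid;  /* xid handed to the next call */
    long retry_sec;          /* retry timeout fixed when the handle is created */
};

/* Simulates one call that never gets a reply.  Records the xid of each
 * transmission in xids[] (at most max entries) and returns how many
 * transmissions were made before total_sec ran out. */
static size_t sim_call(struct sim_client *c, long total_sec,
                       unsigned long *xids, size_t max)
{
    unsigned long xid = c->next_xid++;  /* one new xid per call */
    size_t sent = 0;
    long elapsed;

    assert(c->retry_sec > 0);
    for (elapsed = 0; elapsed < total_sec && sent < max;
         elapsed += c->retry_sec)
        xids[sent++] = xid;             /* every retransmission reuses it */
    return sent;
}
```

With a 2 sec retry timeout and a 6 sec total timeout, one call
transmits at 0, 2 and 4 sec, all with the same xid.  Calling
sim_call() again gives the next xid, which is why the server sees
it as a new request.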

In SunOS 4.1 a duplicate request cache was implemented on the
server to detect duplicate NFS retransmissions.  Some changes
were required in the NFS client-side code for 4.1 to keep the
xid constant across retransmissions so that the duplicate request
cache would be effective.  This same change was made in Ultrix
V3.0 for the same reason (see Chet Juszczak's winter '89
USENIX paper).
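
To show why a constant xid matters, here is a very small sketch of
an xid-keyed cache.  It is not the SunOS or Ultrix code: the names
and the slot count are invented, and it keys on the xid alone,
whereas a real server would presumably also have to distinguish
clients.

```c
#include <assert.h>
#include <stddef.h>

#define DRC_SLOTS 8          /* arbitrary size chosen for this sketch */

/* Hypothetical duplicate request cache keyed on xid only. */
struct drc {
    unsigned long xid[DRC_SLOTS];
    int used[DRC_SLOTS];
    size_t next;             /* next slot to overwrite, round-robin */
};

/* Returns 1 if xid is already in the cache (a retransmission);
 * otherwise records it, evicting the oldest entry, and returns 0. */
static int drc_seen(struct drc *d, unsigned long xid)
{
    size_t i;

    assert(d != NULL);
    for (i = 0; i < DRC_SLOTS; i++)
        if (d->used[i] && d->xid[i] == xid)
            return 1;
    d->xid[d->next] = xid;
    d->used[d->next] = 1;
    d->next = (d->next + 1) % DRC_SLOTS;
    return 0;
}
```

A retransmission that keeps its xid is caught by drc_seen(); one
that arrives with a fresh xid is not, and gets executed again.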
--

Made in New Zealand -->  Brent Callaghan  @ Sun Microsystems
			 uucp: sun!bcallaghan
			 phone: (415) 336 1051