Xref: utzoo comp.mail.sendmail:984 comp.protocols.nfs:398
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!pt.cs.cmu.edu!andrew.cmu.edu!cfe+
From: cfe+@andrew.cmu.edu (Craig F. Everhart)
Newsgroups: comp.mail.sendmail,comp.protocols.nfs
Subject: Re: How do I recover from NFS hangups from within sendmail?
Message-ID: <IZ3FBHS00VsLI9vkhH@andrew.cmu.edu>
Date: 12 Sep 89 14:31:47 GMT
References: <1072@utkcs2.cs.utk.edu>
Organization: Information Technology Center, Carnegie Mellon, Pittsburgh, PA
Lines: 24
In-Reply-To: <1072@utkcs2.cs.utk.edu>

We in the Andrew project at CMU gave up on using sendmail to touch
anything but files that were on the machine's local disk, for reasons
much like those you outlined.  We wound up rewriting the whole local
transport mechanism for AFS (Andrew File System--yes, not NFS) so that
it would be sensitive to the existence of transient failures.  Not only
that, but the AFS developers were working in the next-door offices, so
we had an ``opportunity'' to make sure that transient errors were
distinguishable from persistent ones by returning different values in
errno.  (Thus, an open()-for-reading that fails with an errno of ENOENT
is an authoritative statement of the absence of some file or directory,
while other errno values, such as ETIMEDOUT, are returned to indicate
some transient problem such as a server or network outage.)
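
As a rough illustration of that convention (a minimal sketch only, not
code from AMDS), a delivery agent could map the errno from a failed
open() onto a bounce-or-requeue decision.  The names delivery_status,
classify_open_errno and probe_mailbox are hypothetical, and treating
every unrecognized errno as transient is an assumption made here on the
grounds that requeueing mail is safer than bouncing it:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result type for this sketch; not taken from AMDS. */
enum delivery_status { DELIVER_OK, FAIL_PERSISTENT, FAIL_TRANSIENT };

/* Map the errno left by a failed open() to a delivery decision. */
static enum delivery_status classify_open_errno(int err)
{
    switch (err) {
    case 0:
        return DELIVER_OK;
    case ENOENT:
        /* Authoritative: the file or directory does not exist. */
        return FAIL_PERSISTENT;
    case ETIMEDOUT:
        /* Server or network outage: worth retrying later. */
        return FAIL_TRANSIENT;
    default:
        /* Assumption: when in doubt, requeue rather than bounce. */
        return FAIL_TRANSIENT;
    }
}

/* Try to open a mailbox for reading and report how the attempt went. */
static enum delivery_status probe_mailbox(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return classify_open_errno(errno);
    close(fd);
    return DELIVER_OK;
}
```

A caller would bounce the message on FAIL_PERSISTENT and leave it in
the queue for a later attempt on FAIL_TRANSIENT.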

Two things:
(1) We expect that all of this local mail delivery system (AMDS, Andrew
Mail Delivery System) will be available on the X11R4 tape under
contrib/andrew; and
(2) Does NFS have some collection of rules for indicating transient vs.
persistent failures?  What are they?  Whatever they are, I'm real
interested in finding out, and they could be the way out for Keith
Moore's problems, too.

		Thanks,
		Craig Everhart