Path: utzoo!attcan!uunet!lll-winken!gauss.llnl.gov!casey
From: casey@gauss.llnl.gov (Casey Leedom)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Reconciling /etc/hosts, yp, and named?
Keywords: /etc/hosts yp named
Message-ID: <16587@lll-winken.LLNL.GOV>
Date: 13 Jan 89 18:35:04 GMT
References: <1737@ardent.UUCP>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: casey@lll-crg.llnl.gov.UUCP (Casey Leedom)
Distribution: na
Organization: Lawrence Livermore National Laboratory
Lines: 115

| From: mec@ardent.com (Michael Chastain)
| 
| How do you reconcile /etc/hosts, yellow pages, and named?

  The way that Sun has things set up works pretty well.  Additionally, if
you went that route you'd be operationally compatible with Sun OS on this
issue.  This second point is nearly as important as the original
problem.  People will spend no end of time bitching at your company if
they have to do something weird and different on your machines for no
functional gain.  That works its way toward a dissatisfied customer base
Real Quick.

  In any case, their gethostby*() routines try to resolve names via YP
and, if YP isn't available, fall back to /etc/hosts.  Started normally,
ypserv answers from a dbm database built from /etc/hosts.  But if ypserv
is started with the "-i" flag, then when a resolution request comes in
that ypserv can't answer itself, it hands the request off to the name
server and returns the name server's results as if it had figured out the
answer on its own.

  These layers of operation mean that you have to run YP and BIND if you
want BIND functionality, but normally that's not a problem.  Two
particular problems are worth mentioning, however (below I outline a
scheme that doesn't suffer from them):

    1. The YP protocol doesn't have the ability to return the answer ``I
  can't tell if the host exists or not - I timed out trying to get the
  answer for you'', i.e. a name server request timed out.  This becomes a
  pathological problem when a gateway to a large chunk of the network goes
  out.  A client application will attempt to resolve a name, the request
  will end up in ypserv's hands, ypserv gets a time out from named, but
  because ypserv can't send that answer back to the client, ypserv simply
  exits.  Meanwhile, the client, not getting any response to its query,
  figures it got dropped by the network and retransmits its request ...
  Forever.  So you get a ypserv getting forked off once per second or so
  forever.

    Normally this isn't a problem because the client involved is someone
  typing ``telnet foo'' and they get tired after a while and hit ^C.  But
  when the client is an automated program like sendmail which doesn't get
  tired you have problems.  We've had situations where multiple sendmails
  will be running on multiple client machines and the combined YP traffic
  has dozens of ypservs being forked off per second.  The load average
  slowly climbs up past 16 as more and more sendmails hang waiting on name
  resolution and pretty soon the server machine crashes.

    It turns out that sendmail is basically the sole problem point along
  these lines in a standard configuration.  Bill Nowicki at Sun solved the
  problem by putting timeouts around all gethostby*() calls and we haven't
  had a problem since.  He set the timeout to 90 seconds which I feel is
  high, but that just means you have transient loads on your server for 90
  seconds for each sendmail that times out.  Obviously a better solution
  would be to extend the YP protocol, but that would require a lot of work.

    2. The second problem is also associated with sendmail: MX records.  If
  you use sendmail in the configuration above, you get better service than
  if you just had a host table because you're getting name/address
  resolution for every host in the domain system that's on the internet,
  but you don't get to mail to hosts which have MX records set up.

    Bill Nowicki's solution here is to run a different version of sendmail
  that interfaces to the name server directly.  In fact, he simply uses the
  sendmail from Berkeley for this.  Note that this also solves the problem
  above since sendmail no longer goes through ypserv.  This has the
  disadvantage of requiring that you provide two sendmail binaries, but
  since you can compile them both from the same sources, that's not too
  great a problem.
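
  The timeout wrapper mentioned under problem 1 might look roughly like
the following.  This is not Bill Nowicki's actual change, which isn't
reproduced here; it is only a sketch of the usual alarm()/longjmp
technique.  The names lookup_with_timeout, stuck_lookup and
instant_lookup are hypothetical, and the two stubs exist only so the
wrapper can be exercised without a network.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Any lookup with gethostbyname()'s shape can be wrapped. */
typedef struct hostent *(*lookup_fn)(const char *);

static sigjmp_buf lookup_jmp;

static void on_alarm(int sig)
{
    (void)sig;
    siglongjmp(lookup_jmp, 1);      /* abandon the blocked lookup */
}

/*
 * Hypothetical helper: call lookup(name), but give up after `seconds`.
 * Returns NULL and sets *timed_out to 1 if the alarm fired first.
 * Caveat: jumping out of a library call from a signal handler can leave
 * that library's internal state inconsistent; this is a sketch only.
 */
struct hostent *lookup_with_timeout(lookup_fn lookup, const char *name,
                                    unsigned seconds, int *timed_out)
{
    struct sigaction sa, old_sa;
    struct hostent *volatile hp = NULL;

    *timed_out = 0;
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, &old_sa);

    if (sigsetjmp(lookup_jmp, 1) == 0) {
        alarm(seconds);
        hp = lookup(name);          /* may block indefinitely */
    } else {
        *timed_out = 1;             /* we got here via on_alarm() */
        hp = NULL;
    }
    alarm(0);                       /* cancel any pending alarm */
    sigaction(SIGALRM, &old_sa, NULL);
    return hp;
}

/* Stub (assumption): a lookup that never gets an answer. */
struct hostent *stuck_lookup(const char *name)
{
    (void)name;
    for (;;)
        pause();
}

/* Stub (assumption): a lookup that answers immediately. */
struct hostent *instant_lookup(const char *name)
{
    static struct hostent h;
    (void)name;
    return &h;
}
```

  A caller such as sendmail would then use something like
lookup_with_timeout(gethostbyname, name, 90, &timed_out) and treat a
timeout as a temporary failure rather than as "no such host".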

  In the final analysis, I think that YP's best use is as a distributed
user/group/id name service.  I don't think that it makes a great
distributed host name/address service now that the DOMAIN system is
available.  It does have the significant advantage that you can simply
edit an old style host table and generate a YP host database from it,
which is simpler than dealing with name server databases.  But this is
only an advantage for an isolated network, since any network connected to
the internet will eventually be forced to set up a name server
somewhere.  Once you have a name server somewhere, it's trivial to set
up most of your hosts to either contact that name server or run a
secondary name server, neither of which requires complicated name server
databases.

  Given this and the fact that the DOMAIN system is here to stay, I'd say
that I'd be tempted to set up the library routines to try to contact a
name server first, use YP second, and finally look in /etc/hosts.  This
means that there'll be a slight delay any time anyone tries to resolve
something, but people will put up with lower performance a lot better
than lower functionality or ease of use.  Besides, the delay shouldn't be
that bad.

  You could do something like keeping track of what you're using in the
library so that an application would only suffer a delay on the first
resolution, but that would mean that if the name server or YP were
temporarily unavailable when an application first started it wouldn't use
either when service came back.  This would be a severe problem if the
application in question were a daemon or server of some sort.  Moreover,
it would lead to unpredictable behavior from a user's perspective: two
people sitting next to each other might start the same application a
couple of seconds apart and one would get host table lookups throughout
the execution of the application while the other got name service.  People
would not be happy.

  I think that the best place to put the automatic switch is in the
res_*() routines.  In that configuration you could use stock BSD code for
everything except the res_*() routines, which would have your additions to
consult YP.  The res_*() routines already back off to host table lookups if
a name server can't be contacted - all you'd have to do is insert a YP
stage in between.
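
  A minimal sketch of such a switch follows.  It assumes nothing about
the real res_*() internals: lookup_status, backend, resolve_host and the
stub backends are all made-up names, and addresses are passed around as
strings purely to keep the example short.  The sketch falls through to
the next stage only when a stage can't be contacted; treating a definite
"no such host" as final is an assumption of the sketch.  Note that
LOOKUP_UNAVAILABLE is exactly the answer the YP protocol has no way to
return.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum lookup_status {
    LOOKUP_FOUND,        /* backend answered: host exists */
    LOOKUP_NOT_FOUND,    /* backend answered: no such host */
    LOOKUP_UNAVAILABLE   /* backend couldn't be contacted or timed out */
};

typedef enum lookup_status (*backend_fn)(const char *name,
                                         char *addr, size_t addrlen);

struct backend {
    const char *label;   /* e.g. "bind", "yp", "hosts" */
    backend_fn lookup;
};

/*
 * Try each backend in order on every call (no remembering which one
 * worked last time).  Move on only if a backend is unavailable.
 * *used, if non-NULL, is set to the label of the backend that answered.
 */
enum lookup_status resolve_host(const struct backend *chain, size_t n,
                                const char *name, char *addr,
                                size_t addrlen, const char **used)
{
    size_t i;

    if (used)
        *used = NULL;
    for (i = 0; i < n; i++) {
        enum lookup_status st = chain[i].lookup(name, addr, addrlen);
        if (st != LOOKUP_UNAVAILABLE) {
            if (used)
                *used = chain[i].label;
            return st;
        }
    }
    return LOOKUP_UNAVAILABLE;   /* every stage was down */
}

/* Stub (assumption): a name server that can't be reached. */
enum lookup_status bind_down(const char *name, char *addr, size_t addrlen)
{
    (void)name; (void)addr; (void)addrlen;
    return LOOKUP_UNAVAILABLE;
}

/* Stub (assumption): a YP map that knows only the host "foo". */
enum lookup_status yp_stub(const char *name, char *addr, size_t addrlen)
{
    if (strcmp(name, "foo") != 0)
        return LOOKUP_NOT_FOUND;
    snprintf(addr, addrlen, "%s", "10.0.0.1");
    return LOOKUP_FOUND;
}
```

  Because the chain is walked from the top on every call, a daemon
started while the name server was down picks it up again as soon as it
comes back, at the cost of the small per-lookup delay already mentioned.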

  Finally, I'd just use the standard MX sendmail in the above
configuration.  Its MX queries would time out if there wasn't a name
server available and its subsequent res_*() call for the address would go
through the above automatic switch between BIND, YP, and host table lookup.

  Hope this helps.

Casey