Path: utzoo!attcan!uunet!lll-winken!gauss.llnl.gov!casey
From: casey@gauss.llnl.gov (Casey Leedom)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Reconciling /etc/hosts, yp, and named?
Keywords: /etc/hosts yp named
Message-ID: <16587@lll-winken.LLNL.GOV>
Date: 13 Jan 89 18:35:04 GMT
References: <1737@ardent.UUCP>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: casey@lll-crg.llnl.gov.UUCP (Casey Leedom)
Distribution: na
Organization: Lawrence Livermore National Laboratory
Lines: 115

| From: mec@ardent.com (Michael Chastain)
|
| How do you reconcile /etc/hosts, yellow pages, and named?

The way that Sun has things set up works pretty well. Additionally, if
you went that route you'd be operationally compatible with Sun OS on
this issue. This second point is nearly as important as the original
problem. People will spend no end of time bitching at your company if
they have to do something weird and different on your machines for no
functional gain. That works its way toward a dissatisfied customer base
Real Quick.

In any case, their gethostby*() routines try to resolve names via YP
and, if YP isn't available, use /etc/hosts. Ypserv, if started
normally, serves a dbm version of the host table built from /etc/hosts.
But if ypserv is started with the "-i" flag, then when a resolution
request comes down the line that ypserv can't handle, it hands the
request off to the name server and returns the name server's results as
if it had figured out the answer itself.

These layers of operation mean that you have to run both YP and BIND if
you want BIND functionality, but normally that's not a problem. Two
particular problems are worth mentioning, however (below I outline a
scheme that doesn't suffer from them):

  1. The YP protocol doesn't have the ability to return the answer ``I
     can't tell if the host exists or not - I timed out trying to get
     the answer for you'', i.e. a name server request timed out. This
     becomes a pathological problem when a gateway to a large chunk of
     the network goes out.
     A client application will attempt to resolve a name, the request
     will end up in ypserv's hands, and ypserv gets a timeout from
     named. But because ypserv can't send that answer back to the
     client, ypserv simply exits. Meanwhile the client, not getting any
     response to its query, figures the query got dropped by the
     network and retransmits its request ... Forever. So you get a
     ypserv being forked off once per second or so, forever.

     Normally this isn't a problem, because the client involved is
     someone typing ``telnet foo'' and they get tired after a while and
     hit ^C. But when the client is an automated program like sendmail,
     which doesn't get tired, you have problems. We've had situations
     where multiple sendmails were running on multiple client machines
     and the combined YP traffic had dozens of ypservs being forked off
     per second. The load average slowly climbs up past 16 as more and
     more sendmails hang waiting on name resolution, and pretty soon
     the server machine crashes.

     It turns out that sendmail is basically the sole problem point
     along these lines in a standard configuration. Bill Nowicki at Sun
     solved the problem by putting timeouts around all gethostby*()
     calls, and we haven't had a problem since. He set the timeout to
     90 seconds, which I feel is high, but that just means you have a
     transient load on your server for 90 seconds for each sendmail
     that times out. Obviously a better solution would be to extend the
     YP protocol, but that would require a lot of work.

  2. The second problem is also associated with sendmail: MX records.
     If you use sendmail in the configuration above, you get better
     service than if you just had a host table, because you're getting
     name/address resolution for every host in the domain system that's
     on the internet. But you don't get to mail to hosts which are
     reachable only through MX records. Bill Nowicki's solution here is
     to run a different version of sendmail that interfaces to the name
     server directly. In fact, he simply uses the sendmail from
     Berkeley for this.
     Note that this also solves the problem above, since sendmail no
     longer goes through ypserv. It has the disadvantage of requiring
     that you provide two sendmail binaries, but since you can compile
     them both from the same sources, that's not too great a problem.

In the final analysis, I think that YP's best use is as a distributed
user/group/id name service. I don't think that it makes a great
distributed host name/address service now that the DOMAIN system is
available. It does have the significant advantage that you can simply
edit an old style host table and generate a YP host database from it,
which is simpler than dealing with name server databases. But this is
only an advantage for an isolated network: any network connected to the
internet will eventually be forced to set up a name server somewhere,
and once you have a name server somewhere, it's trivial to set up most
of your hosts to either contact that name server or run a secondary
name server, neither of which requires complicated name server
databases.

Given this and the fact that the DOMAIN system is here to stay, I'd be
tempted to set up the library routines to try to contact a name server
first, use YP second, and finally look in /etc/hosts. This means that
there'll be a slight delay any time anyone tries to resolve something,
but people will put up with lower performance a lot better than lower
functionality or ease of use. Besides, the delay shouldn't be that bad.

You could do something like keeping track of what you're using in the
library, so that an application would only suffer a delay on the first
resolution. But that would mean that if the name server or YP were
temporarily unavailable when an application first started, it wouldn't
use either when service came back. This would be a severe problem if
the application in question were a daemon or server of some sort.
Moreover, it would lead to unpredictable behavior from a user's
perspective (two people sitting next to each other might start the same
application a couple of seconds apart, and one would get host table
lookups throughout the execution of the application while the other got
name service). People would not be happy.

I think that the best place to put the automatic switch is in the
res_*() routines. In that configuration you could use stock BSD code
for everything except the res_*() routines, which would have your
additions to look up YP. The res_*() routines already back off to host
table lookups if a name server can't be contacted - all you'd have to
do is insert a YP stage in between.

Finally, I'd just use the standard MX sendmail in the above
configuration. Its MX queries would time out if there wasn't a name
server available, and its subsequent res_*() call for the address would
go through the above automatic switch between BIND, YP, and host table
lookup.

Hope this helps.

Casey
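
A minimal sketch of the per-lookup switch described above (name server
first, then YP, then the host table). It is written in Python purely
for brevity; the real change would sit in the C res_*() routines. Every
name in it (SourceUnavailable, table_lookup, unreachable, resolve) is
made up for illustration, and it assumes that a stage is skipped only
when it can't be contacted, not when it answers "no such host".

```python
# Sketch only: the real switch would sit in the C res_*() routines.
# Every name here (SourceUnavailable, resolve, table_lookup, unreachable)
# is hypothetical and exists just for this illustration.

class SourceUnavailable(Exception):
    """Raised by a lookup stage that cannot be contacted or timed out."""


def table_lookup(table):
    """Build a stage backed by a dict; stands in for YP or /etc/hosts."""
    def lookup(name):
        return table.get(name)  # None means a definite "no such host"
    return lookup


def unreachable(name):
    """Stand-in for a name server that can't be contacted."""
    raise SourceUnavailable(name)


def resolve(name, stages):
    """Walk the stages in order on every call, with no memory between calls.

    Returns (label, address) from the first stage that gives a definite
    answer; address is None if that stage says the host doesn't exist.
    Raises SourceUnavailable if no stage could be reached at all.
    """
    for label, lookup in stages:
        try:
            return label, lookup(name)
        except SourceUnavailable:
            continue  # this source is down right now; try the next one
    raise SourceUnavailable(name)


# Made-up data (192.0.2.x is a documentation address range).
yp = table_lookup({"foo": "192.0.2.1"})
hosts = table_lookup({"foo": "192.0.2.1", "bar": "192.0.2.2"})
dns = table_lookup({"foo": "192.0.2.1", "baz": "192.0.2.3"})

# Name server down: the lookup backs off to YP.
print(resolve("foo", [("bind", unreachable), ("yp", yp), ("hosts", hosts)]))
# Name server back: the very next lookup uses it again.
print(resolve("baz", [("bind", dns), ("yp", yp), ("hosts", hosts)]))
```

Because resolve() starts from the top on every call, an application
that starts while the name server is down goes back to using it as soon
as service returns, at the cost of retrying the dead source on each
lookup in the meantime.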