Path: utzoo!attcan!uunet!cs.utexas.edu!rutgers!soleil!mlb.semi.harris.com!thrush.mlb.semi.harris.com!del
From: del@thrush.mlb.semi.harris.com (Don Lewis)
Newsgroups: comp.protocols.tcp-ip.domains
Subject: Re: BIND bug list
Message-ID: <1990May30.094653.8584@mlb.semi.harris.com>
Date: 30 May 90 09:46:53 GMT
References: <1990May17.083447.6880@mlb.semi.harris.com> <25358@netnews.upenn.edu>
Sender: news@mlb.semi.harris.com
Organization: Harris Semiconductor, Melbourne FL
Lines: 56

In article <25358@netnews.upenn.edu> hagan@DCCS.UPENN.EDU (John Dotts Hagan) writes:
>
>Anyways, I think it would be real neat if the resolver did some kind of
>performance/reliability remembering when going at its list of possible name
>servers to use.
>
>As it is now, we have three name servers for our campus (one is primary, and
>two secondaries).  One of the secondaries is listed first in everyone's
>resolv.conf (or equivalent list of servers), and then the primary, and then
>the second secondary.
>
>When the first listed secondary dies (either named dumps core and leaves, or
>the system is toasted), everyone's resolver gets slow since it patiently tries
>to query the first listed name server, then after a timeout moves on to the
>next one, and so forth.  However, it does not remember that it just had some
>trouble with the first server, and tries it again for the next request.

You might want to list each of the three servers first on one third of
the hosts, to better distribute the load.  That way, only a third of the
hosts will slow down when one of the servers dies (but this will happen
three times as often).

>
>It would be great if the first user who tries a telnet (or whatever) suffered
>the hit once for that host, then other tries would quickly just go at a working
>name server.
>Perhaps dead name servers could be routinely retried and some
>stats kept on them (I think bind already does this sort of thing when dealing
>with the list of root servers, so at least there is some precedent for this
>kind of behavior).
>

Well, there is sort of a problem here.  You probably have no such thing
as *the* resolver.  Everything that you run that wants to do
host<->address translation uses the resolver library routines and is a
separate instance of a resolver.  Each time you fire up telnet, it
starts from scratch and has no history available concerning the status
of the various servers.  If a particular process does a lot of
host<->address translations, then it probably could figure out what was
going on and make use of this information, but if it only does one
translation, then by the time it figures out which server is the hot one
to use, it has no further need to use it.  I suppose that you could read
this information from a file and update it, but then you have to be able
to handle multiple simultaneous accesses and updates to this file 8-(

If you are running a somewhat modern BIND (>4.8?), then you can run it
on each host and configure it to forward all its queries to the campus
servers.  BIND appears not to keep track of the performance of its
forwarders, so I suppose it would be better if it did something like
what it does for the root servers.  Running BIND on each host also has
the advantage that the answers to frequently asked questions are cached
locally on the host, which will reduce the load on the campus servers.

Be forewarned that the forwarding stuff doesn't quite work right even in
4.8.1.  Hopefully there will be a cleaner release soon.
-- 
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901
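
The point above about a long-lived process being able to "figure out
what was going on" can be illustrated with a rough sketch.  This is not
how the BIND resolver library works; it is only a minimal illustration,
in Python, of keeping per-process failure history.  The class name
ServerHistory, the retry_after parameter, and the send() callback are
all hypothetical names made up for the example, and the actual network
query is left to the caller.

```python
import time


class ServerHistory:
    """Per-process memory of which name servers recently timed out."""

    def __init__(self, servers, retry_after=60.0, clock=time.monotonic):
        self.servers = list(servers)    # configured order, as in resolv.conf
        self.retry_after = retry_after  # seconds a failed server stays demoted
        self.clock = clock              # injectable so the sketch can be tested
        self.failed_at = {}             # server -> time of its last timeout

    def order(self):
        """Return servers with recently failed ones moved to the end."""
        now = self.clock()

        def is_suspect(server):
            t = self.failed_at.get(server)
            return t is not None and now - t < self.retry_after

        # sorted() is stable, so the configured order is kept within
        # the healthy group and within the suspect group.
        return sorted(self.servers, key=is_suspect)

    def query(self, name, send):
        """Try servers in preferred order.

        send(server, name) is a caller-supplied function (an assumption
        of this sketch) that returns an answer or raises TimeoutError.
        """
        for server in self.order():
            try:
                answer = send(server, name)
            except TimeoutError:
                self.failed_at[server] = self.clock()
                continue
            self.failed_at.pop(server, None)  # server answered, clear history
            return answer
        raise TimeoutError("no name server answered")
```

As noted above, this only pays off in a process that makes many
lookups.  A program that does a single translation and exits would
throw the history away before ever using it, unless the history were
kept in a shared file or in a local caching server.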