Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!cbatt!ucbvax!cs.brown.edu!jb
From: jb@cs.brown.edu.UUCP
Newsgroups: mod.protocols.tcp-ip
Subject: Re: Domain host TTL fields
Message-ID: <8702262115.AA02105@ucbvax.Berkeley.EDU>
Date: Thu, 26-Feb-87 14:04:02 EST
Article-I.D.: ucbvax.8702262115.AA02105
Posted: Thu Feb 26 14:04:02 1987
Date-Received: Sat, 28-Feb-87 02:54:55 EST
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: world
Organization: The ARPA Internet
Lines: 54
Approved: tcp-ip@sri-nic.arpa

Over time, my idea of what the optimum TTL should be has been increasing. In general, I feel that 24 hours is about the correct value. One major issue is how long various other software will wait for a change. Sendmail will attempt to deliver a message for 3 days (as distributed). One would like to have any changes seen in less than 3 days.

There are a couple of reasons for data to change. The first is a planned change to the network configuration. This can be prepared for in advance by reducing the TTL. Don't forget that the reduction must be made more than one (old) TTL ahead of the change, so that cached copies carrying the old TTL have expired by then. Consider how long in advance you would be planning a move.

The other reason for a change is an unanticipated failure. If one of your primary machines (such as a mail forwarder) goes down for a few days, any attempt to bypass the failure takes the full length of the TTL to be fully realized.

Coming from Berkeley and being involved with some of the early distributions of BIND, I'll admit we made a mistake in what we had in the sample files. Many people just copied our samples and did not analyze the situation. Our samples should have had TTLs longer than 1 hour. We did not realize this originally ourselves and were guilty of using too short a TTL for a long time. These problems take time to work out.

As for the question of what should be used as the timeout when waiting for a reply, I'm not sure what the correct answer is.
There are three timeouts to consider in this case: first, the total time to wait for any response before indicating a failure; second, the time between trying different servers for the domain; and third, the time between tries to the same server.

The first of these is a user interface question on one hand and a performance issue on the other. How long should a user who tries to telnet to some host have to wait before being told that the host is unknown (possibly only temporarily)? I don't like to wait a long time, but on the other hand, the longer the wait, the more likely the lookup is to succeed. BIND is currently using about one minute for this.

The other two are intertwined and are also part of the first one. UDP, which is used primarily for queries, is not reliable. If one knows that the original packet was lost, then a retry to one of the servers is in order. If the delay is due to network round trip time (RTT), then the time between the retries should be lengthened.

To decide what these times should be, several questions need to be answered. How long should the user wait for a response? How many queries in total should be sent out in trying to resolve the name? How many queries should be made to each server for the domain? What should the retry algorithm be (linear, exponential, something else)? If recursion is being done by another process, how does that affect these values?

I'm not sure what is being used in BIND at the moment. It actually uses two different algorithms: one for talking to the local server, and another for dealing with recursion. Some work on the algorithms has been done for the most recent release, and I haven't had a chance to look at the code.

Jim Bloom
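
To make the interaction of the three timeouts above concrete, here is a minimal sketch of one possible retry schedule. It is not BIND's algorithm (which, as noted, I have not checked). The function name and every default except the roughly one-minute total are assumptions chosen for illustration: a 4-second first timeout, doubling after each full pass over the servers, and at most 3 tries per server.

```python
def retry_schedule(servers, total_budget=60.0, first_timeout=4.0,
                   backoff=2.0, tries_per_server=3):
    """Return a list of (start_time, server, timeout) query attempts.

    Each pass sends one query to every server in turn. After a full
    pass the per-query timeout is multiplied by `backoff` (use 1.0 for
    a linear schedule). Planning stops as soon as the next attempt
    would run past `total_budget` seconds.
    """
    schedule = []
    elapsed = 0.0
    timeout = first_timeout
    for _ in range(tries_per_server):
        for server in servers:
            # The total wait is what the user actually experiences,
            # so it caps everything else.
            if elapsed + timeout > total_budget:
                return schedule
            schedule.append((elapsed, server, timeout))
            elapsed += timeout
        # Lengthen the wait in case the delay is RTT rather than loss.
        timeout *= backoff
    return schedule
```

Calling `retry_schedule(["ns1", "ns2"])` (hypothetical server names) plans six attempts within the one-minute budget. Cutting `total_budget` to 30 seconds drops the third pass entirely, which shows how the user-facing limit constrains the other two timeouts.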