Path: utzoo!mnetor!uunet!husc6!rutgers!rochester!bbn!uwmcsd1!ig!agate!ucbvax!PANDA.PANDA.COM!MRC
From: MRC@PANDA.PANDA.COM (Mark Crispin)
Newsgroups: comp.protocols.tcp-ip
Subject: trying multiple addresses
Message-ID: <12374794814.7.MRC@PANDA.PANDA.COM>
Date: 15 Feb 88 02:18:09 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 120

I am sure the advocates of trying multiple mail addresses would feel quite differently if they had to pay per-packet charges for network access.

Historically, only a small percentage of network connection failures -- typically less than 1% -- have been due to a dysfunctional IP address. The remaining (= overwhelming majority of) failures have been due to dysfunctional networks, dysfunctional hosts, or dysfunctional servers.

It is possible that trying a different IP address may help in the dysfunctional network case, although typically the "non-best" IP addresses all involve the dysfunctional network in some way (look at some network topology maps some time). This is a relatively rare case anyway.

Many times, the "non-best" IP address is substantially inferior to the point where it should not be used under ANY circumstance. No site outside of Stanford should *ever* use SAIL's, Score's, or SUMEX-AIM's net 36 IP address; the gateway between net 10 and net 36 (as well as the net 36 subnet from that gateway) is seriously overloaded. If I understand JLarson.pa correctly, he's saying that Xerox.COM will use SUMEX-AIM's net 36 address just because they couldn't connect to the net 10 address the last time. If this is common behavior it's no wonder those of us who must use the net 10/36 gateway find it so unusable. Will I have to instruct the servers on multi-homed net 10/36 hosts to refuse connections on net 36 from non-net 36 hosts to get them to stop?

What about those guys multi-homed on a "free" and a pay-per-packet X.25 net? Do they appreciate this behavior?

The *correct* solution to this problem is NOT kludgy algorithms in the mailer. The correct solution is multi-part, and involves:

 1) complete the migration from the host table to the domain system. The NIC simply cannot keep up with the changes in network topology (as the Xerox experience showed), and, frankly, it's unreasonable for us to expect them to.

 2) domain database managers need to keep their name servers updated with changes to network topology. TTL's should not be allowed to be so long that topology changes go unnoticed by resolvers for excessive periods of time.

 3) better support needs to exist in the domain infrastructure for "best" IP address selection.

This last point is important. Presently, it is up to the local host to decide upon a "best" IP address, based on quite incomplete information. Many hosts (all Unix hosts?) simply pick the first IP address listed in the NIC host table (or returned as A RR's from the domain system). TOPS-20 selects in priority order:

 (1) first IP address from a directly connected net that is "preferred" (e.g. a fast LAN),
 (2) first IP address from a directly connected net that is "default" (e.g. a core net such as ARPANET),
 (3) first IP address from any other directly connected net,
 (4) first IP address.

"First IP address" means first from the address list from the host table (or a set of A RR's from the domain system). Note that there is nothing whatsoever to do with "net 10".

Almost 100% of the time, this makes the best possible choice of an IP address. It's only in those very few cases (which come up perhaps 2 or 3 times a YEAR!!!) where an otherwise highly desirable path breaks for a long period of time that a problem comes up.

I consider it highly objectionable to cycle through every other IP address (waiting a minute or more for an IP retransmission timeout if the network is courteous enough to tell me the other guy ain't there) every time I attempt to connect to a dead host.

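For concreteness, the TOPS-20 priority order described above can be sketched roughly as follows. This is only an illustration in Python, not the actual TOPS-20 code: the function name pick_address, the three lists of directly connected nets, and the sample addresses are all made up for the example, and nets 10 and 36 are written as /8 prefixes.

```python
from ipaddress import IPv4Address, IPv4Network


def pick_address(candidates, preferred, default, other):
    """Pick one address from `candidates` (kept in host-table / A RR order).

    `preferred`, `default`, and `other` are hypothetical lists of the nets
    this host is directly connected to, one list per priority tier.
    """
    # Tiers (1)-(3): the first listed address on a directly connected net,
    # trying "preferred" nets, then "default" nets, then any other.
    for nets in (preferred, default, other):
        for addr in candidates:
            if any(addr in net for net in nets):
                return addr
    # Tier (4): no directly connected net matched, so take the first address.
    return candidates[0] if candidates else None


NET10 = IPv4Network("10.0.0.0/8")   # a core net, e.g. ARPANET
NET36 = IPv4Network("36.0.0.0/8")   # a local net

# Made-up address list for a host multi-homed on nets 10 and 36.
remote = [IPv4Address("10.1.0.11"), IPv4Address("36.8.0.46")]

# A host with net 36 as its fast LAN and net 10 as its default net:
print(pick_address(remote, [NET36], [NET10], []))   # 36.8.0.46
# A host directly connected only to net 10:
print(pick_address(remote, [], [NET10], []))        # 10.1.0.11
# A host on neither net falls through to the first address listed:
print(pick_address(remote, [], [], []))             # 10.1.0.11
```

Note that nothing in the sketch retries a second address after a failure; it makes one choice from local knowledge, which is the behavior argued for here.
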
JLarson's suggestion is less objectionable, but it involves one piece of software (the mailer) telling a completely different piece of software (host table or domain resolver) that the IP address given it was sick. Nobody wants to do the work to the host table software to add such a feature. It might be doable with the domain resolver (SRA can comment on this); it certainly wouldn't be hard for the mailer to pass on the word to the domain resolver.

The problem is, what does "this IP address was sick" really mean? How does "retransmission timeout" differ from "host dead" (a type 7 1822 message) differ from "host sent a reset" (refused the connection) differ from any of the other ways a connection failed? In which one(s) of these do you say try another IP address, and in which one(s) do you assume the host is really down, or really doesn't want to talk now?

Again, what do you do about those cases when we really shouldn't be using a particular IP address because of charging, or other administrative issues?

The domain system may be able to help; it was always my belief (I remember suggesting this at the meeting when the domain system concept was first invented) that nameservers should be allowed to tailor their responses based on who was asking the question. A domain query should be something like:

 "I am on net 128.43 seeking an SMTP server for FOO.BAR.COM, which is the best address for me to use?"

and later on

 "I am on net 128.43 seeking an SMTP server for FOO.BAR.COM and I already tried 69.105.8.3, is there any other I should try?"

The point is that a perfectly valid answer may be "if 69.105.8.3 ain't answering, he ain't up; try again later." This also gives the remote organization (which presumably knows the status of their hosts) control over the IP address selection criteria, based upon their knowledge instead of the local host's educated guesswork.

Please, no flames.
If you're going to babble on and on about how I should break my mailer to conform to your fantasy of how the world should work, send it to *NUL: or /dev/null or whatever you call it.

Furthermore, I'm not interested in any comments about a host table based means of IP address selection. The systems I support do not use host tables (and, for the record, are currently the only TOPS-20's supporting MX mailing). I can't help but feel that if the problem of a sick "best" IP address happens to a domain-based mailer, that the fault is that of the management of the nameserver for that organization and not that of the mailer.

If you have constructive observations, then let's talk. Remember that this is not about porting arguably "better" (or "worse") ideas from a 16-year-old operating system to a 19-year-old operating system. This is about what's going to be done in the next generation, that maybe will be ported to the 16 and 19 year-olds. I think we can do better than any of the guesswork, and we should, if the threats of pay-per-packet come to pass.

-- Mark --
-------