Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!sun-barr!decwrl!mogul From: mogul@decwrl.dec.com (Jeffrey Mogul) Newsgroups: comp.protocols.tcp-ip Subject: Re: ether collision backoff Keywords: non-standard Message-ID: <208@jove.dec.com> Date: 17 Oct 89 22:37:58 GMT References: <42686@sgi.sgi.com> Organization: DEC Western Research Lines: 66 In article <42686@sgi.sgi.com> vjs@rhyolite.wpd.sgi.com (Vernon Schryver) writes: At last week's Interop, I was told that a large workstation company's products can be configured to use non-standard retransmission-after- collision delays. To my knowledge this is false rumor, which has been circulating for years. Is this true? Does it cause big problems, as my informant implied? Exactly which parameter(s) is(are) changed--the exponent, the multiplier, the algorithm? Does it really help NFS that much? Is there a relevant de-facto or de-jurie standard? Should Silicon Graphics do the same thing? Are the Protocol Police about to squash the idea? Others have already pointed out that this is an extremely bad idea. It might help to prevent misguided experimentation if I try to explain why it is a bad idea, instead of simply stating "it's against the law" (which, in fact, it is). The rationale behind the "binary exponential backoff" algorithm required by the standard is explained in Robert Metcalfe's PhD thesis, although the original CACM paper on Ethernet might be more available. At any rate, the point is that if there are Q hosts simultaneously wanting to send a packet on an Ethernet, then in order to avoid congestive instability each should attempt to transmit with probability 1/Q. The problem is that each individual host has no global knowledge about the number of other hosts wanting to transmit. It must therefore make an estimate of this number based on the only property of the network it can observe, namely the number of times it has tried to send the current packet and had a collision. Clearly, it is important that this estimate be nearly right, and that any remaining error be biased in the direction of inefficiency rather than congestive collapse. It turns out (if you believe the math in Metcalfe's thesis; I don't intend to check it) that the "binary exponential backoff" algorithm gives the right behaviour; the "estimate" of Q embodied in the backoff count (which is really the base-2 logarithm of Q) is close enough to work well even under heavy load. (See the paper I co-authored with Dave Boggs and Chris Kent in Proc. SIGCOMM '88; we measured a real Ethernet under moderate overload and showed that it does work.) One aspect of the Standard Ethernet is that, although a host is supposed to attempt transmission up to 16 times, it is supposed to stop doubling the delay after the 10th time; otherwise, the delays would get unreasonably large. Of these two constants, "16" is sort of an arbitrary choice, "10" is not! This is because the maximum number of hosts on an Ethernet is set to be 1024 = 2^10; truncating the delay at less than 2^10 slot times could (if someone was stupid enough to build a net that large) conceivably cause congestive collapse. Truncating the delay at more than 2^10 slot times, on the other hand, doesn't buy you anything. The reason why one is supposed to give up after 16 attempts is that after a while it is pointless to continue. Moreover, in order to maintain both fairness and reasonable performance, it is important to limit the time over which "history" of previous network behaviour is retained. That is why one starts the next transmission attempt with a delay-count of 0, rather than trying to be clever and use something based on the delay you used for the previous packet. Note that there are some pathological situations where, with small numbers of very busy hosts, the Ethernet can be measurably unfair over rather long time scales. In general, though, the Ethernet works right. So: obey the law, it's for your own good. This law, at any rate. -Jeff