Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!hao!ames!ucbcad!ucbvax!UDEL.EDU!Mills
From: Mills@UDEL.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: routing changes
Message-ID: <8711052232.aa14448@Huey.UDEL.EDU>
Date: Thu, 5-Nov-87 22:32:05 EST
Article-I.D.: Huey.8711052232.aa14448
Posted: Thu Nov 5 22:32:05 1987
Date-Received: Fri, 13-Nov-87 00:10:39 EST
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The ARPA Internet
Lines: 42

Sergio,

Ah yes, the infamous 192.31.x nets. These dudes have been bouncing all over the map for some time now. The distance values for these nets are not provided by the fuzzballs, but by gated at some site or other. When they count to infinity they have in fact become unreachable. This is a classic example of what unstable metrics can do to a distributed Bellman-Ford algorithm. I have been working feverishly to harden the algorithm so that even these wild swings won't destabilize it, but when distances change from one sample to the next by over fifty percent, what can any algorithm do? I repeat my statement made at least a dozen times: where is the source of those violent delay excursions and what gated is generating them?

Having said that, note that even these severe transients should not adversely affect the system throughput, at least for the nets not rocking to and fro, since the hello messages are rate-limited. On the other hand, traffic for nets counting to infinity can clearly gobble up dangerous levels of capacity. That's why I have been spending so much time trying to avoid the counting problem. The only way to do that is to latch sudden increases in delay and prevent further decreases until the hold-down timer expires, which is what the present system does. I have had to experiment somewhat in order to gauge the sensitivity of the latch, which is presently set at a factor of two.
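
As a rough illustration of the latch described above, here is a minimal Python sketch. It is not the fuzzball code: the class name `DelayLatch`, the `hold_down` count (measured here in update intervals), and the choice to keep accepting increases while latched are all assumptions made for the example. Only the factor-of-two threshold and the rule that decreases are refused until the hold-down timer expires come from the text.

```python
class DelayLatch:
    """Hypothetical per-network delay record with a hold-down latch."""

    def __init__(self, delay, factor=2.0, hold_down=3):
        self.delay = delay          # delay currently advertised for this net
        self.factor = factor        # latch sensitivity ("a factor of two" in the post)
        self.hold_down = hold_down  # assumed: hold-down length in update intervals
        self.timer = 0              # remaining hold-down intervals; 0 means not latched

    def update(self, sample):
        """Apply one new delay sample and return the delay now in use."""
        if self.timer > 0:
            # Latched: count down the timer and refuse any decrease.
            self.timer -= 1
            if sample > self.delay:
                self.delay = sample
            return self.delay
        if self.delay > 0 and sample >= self.factor * self.delay:
            # Sudden surge: start the hold-down timer before taking the new value.
            self.timer = self.hold_down
        self.delay = sample
        return self.delay
```

With `DelayLatch(100, hold_down=2)`, a sample of 250 trips the latch, the next two lower samples are ignored, and only after the timer runs out does the delay follow the samples back down. A sample of 150 against a delay of 110 is below the factor-of-two threshold and passes through as a benign wobble.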
The latch regularly snares at least some of the surges, but not all, as you can see from your data. I can't make the latch more sensitive without snaring a lot of benign wobbles, such as occasional retransmissions on UIUC - NCAR lines. Nevertheless, I have tuned the algorithm a lot in the past month and, at least in the testing swamps, it seems to be working well.

It has been suggested that JVNC has more trouble than most because that is the only spot running gated on two machines on the same Ether. I thought Maryland was doing that as well. While they seem to be having trouble of their own, destabilized routes do not seem to be a serious problem there.

There are two things I would recommend (again): first, identify all those gated configurations where only a single path is available to the networks being squawked and set the squawked delay to zero, just plain zero. Second, where multiple paths to a net exist, pray to the metric-translation god and really, truly and verily conform to the rules I suggested in my earlier memo. In any case, the clock-offset fields associated with each net in the hello message should be set to zero and the date in the header should be marked invalid. This seems like a pretty simple thing to check.

Dave