Path: utzoo!utgpu!attcan!uunet!husc6!mailrus!ames!pasteur!ucbvax!VIOLET.BERKELEY.EDU!dlw
From: dlw@VIOLET.BERKELEY.EDU (David Wasley)
Newsgroups: comp.sys.proteon
Subject: Re: routing problem
Message-ID: <8808050159.AA28863@violet.berkeley.edu>
Date: 5 Aug 88 01:59:10 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 76

Re:
	From swb@dainichi.tn.cornell.edu Thu Aug  4 14:22:57 1988
	To: dlw@violet.berkeley.edu (David Wasley)
	Cc: p4200@devvax.TN.CORNELL.EDU, cliff@cmsa.berkeley.edu,
	        vaf@score.stanford.edu, swb@dainichi.tn.cornell.edu
	Subject: Re: routing problem 
	Date: Thu, 04 Aug 88 17:21:26 -0400
	From: Scott Brim <swb@dainichi.tn.cornell.edu>
	
	Dave, is there anything else involved in the routing?  Backdoor
	connections not shown on your map?  Something translating between
	protocols?  EGP peers which are not really a homogeneous group?
	
	Do the metrics instantly pop to infinity or do they count to it?
	What filtering do you have on your interfaces (any?) to avoid
	routing "echoes"?

There are no back doors between the remote nets. The only routing protocol
used is RIP. The metrics (seem to) go instantly to infinity, as shown in
the log file output I sent. (The numbers are the actual metrics seen.)
I think this is reasonable assuming poisoned reverse.
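
For illustration only, here is a minimal sketch of split horizon with poisoned
reverse as RIP describes it: a route is advertised with metric 16 (infinity)
on the interface it was learned through, and with its real metric elsewhere.
The function name, the table layout, and the metric values are hypothetical;
this is not the p4200's actual implementation.

```python
RIP_INFINITY = 16  # RIP's "unreachable" metric


def advertised_metrics(routes, out_iface):
    """Return {destination: metric} as it would be sent on out_iface.

    routes maps destination -> (metric, interface the route was learned on).
    With poisoned reverse, a route is sent back out the interface it was
    learned on with metric 16 instead of being omitted.
    """
    result = {}
    for dest, (metric, learned_iface) in routes.items():
        if learned_iface == out_iface:
            result[dest] = RIP_INFINITY  # poison the reverse path
        else:
            result[dest] = min(metric, RIP_INFINITY)
    return result


# Hypothetical table: one route learned via net-E, one via net-F.
routes = {"net-C": (3, "net-E"), "net-D": (3, "net-F")}
print(advertised_metrics(routes, "net-F"))
```

In this sketch a route goes straight from its real metric to 16 the moment its
learned interface matches the queried interface; nothing counts up.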

There is one thing I didn't mention because I didn't think it relevant :-)
We're using 2 IP addresses on the same ethernet network controller on several
p4200's.  This is to implement different controls on routing information
interchange while avoiding an extra hop. Below is the more complete picture:

   (To NSFNET)
      net-A         net-B            net-C         net-D      ethernets
        |             |                |             |
    +-------+     +-------+        +-------+     +-------+
    | p4200 |     | p4200 |        | p4200 |     | p4200 |
    |  GW1  |     |  GW2  |        |  GW3  |     |  GW4  |
    +-------+     +-------+        +-------+     +-------+
        \             /                \             /
         \           /                  \           /
     pp-EA\         /pp-EB          pp-EC\         /pp-ED     pt-to-pt circuits
           \       /                      \       /
            \     /                        \     /
           +-------+                      +-------+
           | p4200 |                      | p4200 |
           |  GW5  |                      |  GW6  |
           +-------+                      +-------+
               |\                             |\
   net-E ------o-\----------------------------o-\---------------------
                  \                              \    one ethernet, 2 IP nets
   net-F ----------o--------o---------------o-----o-------------------
                            |               |
                        +-------+       +-------+
                        | p4200 |       | p4200 |
                        |  GW7  |       |  GW8  |
                        +-------+       +-------+
                          |   |           |   |
                       (To other subnets of net-F)

I was monitoring the net-F interface on box GW5, and seeing the 4 nets
beyond GW6 disappear, and then come back. There are no other nets beyond GW6.

One theory is that the routes were flopping between the net-E interface
and the net-F interface. This shouldn't happen because the metrics would
be identical, but it may be happening anyway. (We ran an ethernet-level
trace this morning but haven't analyzed it yet.) A non-16 metric seen by
a query on the net-F interface would indicate that GW5 had a route via its
net-E interface; a metric of 16 would indicate that the route was (then)
via the net-F interface.

If it is flopping, I would have expected to see a more even distribution
of 16 & non-16 metrics.  On the other hand, maybe this is yet another example
of Van Jacobson's lock-step phenomenon.  But in that case, I would expect
them *always* to change together, which they don't.
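
The two checks described above (how evenly the 16 and non-16 metrics are
distributed, and whether the nets always change together) could be run over a
log of query results along these lines. This is a sketch under assumptions:
the sample format and the numbers are made up for illustration and are not
taken from the actual log.

```python
RIP_INFINITY = 16  # RIP's "unreachable" metric


def infinity_fraction(samples):
    """For each net, the fraction of samples in which its metric was 16."""
    nets = samples[0].keys()
    return {
        net: sum(1 for s in samples if s[net] >= RIP_INFINITY) / len(samples)
        for net in nets
    }


def change_together(samples):
    """True if, in every sample, the nets are all at 16 or all below 16."""
    return all(
        len({metric >= RIP_INFINITY for metric in s.values()}) == 1
        for s in samples
    )


# Hypothetical samples: each dict is one query, net -> metric seen.
samples = [
    {"net-C": 3, "net-D": 3},
    {"net-C": 16, "net-D": 3},
    {"net-C": 16, "net-D": 16},
    {"net-C": 3, "net-D": 3},
]
print(infinity_fraction(samples))
print(change_together(samples))
```

A fraction near 0.5 for every net would fit even flopping; change_together
returning True on a long log would fit the lock-step explanation.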

To test this by inference, I changed the configurations to "send nets" on
only the net-E interface (and allowed GW7 & 8 to listen only on that net).
It has been much more stable since then. I'll believe it after a day or so.

Maybe I'm just tilting at windmills...
	David