Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!agate!ucbvax!VIOLET.BERKELEY.EDU!dlw
From: dlw@VIOLET.BERKELEY.EDU (David Wasley)
Newsgroups: comp.sys.proteon
Subject: routing problem
Message-ID: <8808022210.AA22473@violet.berkeley.edu>
Date: 2 Aug 88 22:10:23 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 107

Having puzzled about a routing problem here for some time, without
resolution, I've just heard that another site is experiencing the same
problem. So let me pose it for this group in case anyone else is seeing
it too, or can shed light.

Below is a schematic similar to our situation:

    net-A       net-B              net-C       net-D      ethernets
      |           |                  |           |
  +-------+   +-------+          +-------+   +-------+
  | p4200 |   | p4200 |          | p4200 |   | p4200 |
  |  GW1  |   |  GW2  |          |  GW3  |   |  GW4  |
  +-------+   +-------+          +-------+   +-------+
       \         /                    \         /
        \       /                      \       /
    pp-EA\     /pp-EB              pp-EC\     /pp-ED      pt-to-pt circuits
          \   /                          \   /
           \ /                            \ /
        +-------+                      +-------+
        | p4200 |                      | p4200 |
        |  GW5  |                      |  GW6  |
        +-------+                      +-------+
            |                              |              ethernet net-E
      ------o------------------------------o-----------------------

All GW's are running release 7.4b and all use RIP.

I have a process running on a machine within net-E that sends a RIP
"query" to GW5 every 30 seconds, and notes any change in advertised
metrics. (No, we don't have the ability to do it with SNMP (yet) nor is
the monitor machine on the same ethernet. We're working on that.)
A similar process monitors GW6.

The problem: as many as 15 times a day, the metrics for pp-EC, net-C,
net-D, and pp-ED **as seen by GW5** go to infinity, and then come back
anywhere from 30 seconds (the next sample) to 30 minutes later.

However, during those same days, GW6 **never** loses routes to those
nets, but it does lose routes to the things beyond GW5. In other words,
the problem shows symmetry. (The exact times and frequencies vary
between the 2 GW's, but the symptoms are symmetrical.)

Below are extracts from the actual log file for GW5.
(Only the net names have been changed, to correspond to the picture.)

Has anyone else observed this behavior? Can anyone think of a plausible
explanation?

Thanks,
David Wasley
U C Berkeley

---- Scenario 1: lose all routes synchronously, very common ---

Jul 28 13:31:32   pp-EC    2 -> 16
Jul 28 13:31:32   pp-ED    2 -> 16
Jul 28 13:31:32   net-D    3 -> 16
Jul 28 13:31:32   net-C    3 -> 16
Jul 28 13:32:04   pp-EC   16 -> 2
Jul 28 13:32:04   pp-ED   16 -> 2
Jul 28 13:32:04   net-D   16 -> 3
Jul 28 13:32:04   net-C   16 -> 3

Jul 28 15:58:35   pp-EC    2 -> 16
Jul 28 15:58:35   pp-ED    2 -> 16
Jul 28 15:58:35   net-D    3 -> 16
Jul 28 15:58:35   net-C    3 -> 16
Jul 28 15:59:08   pp-EC   16 -> 2
Jul 28 15:59:08   pp-ED   16 -> 2
Jul 28 15:59:08   net-D   16 -> 3
Jul 28 15:59:08   net-C   16 -> 3

---- Scenario 2: lose nets, then p-p links, then regain in reverse order ----

Jul 29 13:48:22   net-D    3 -> 16
Jul 29 13:48:22   net-C    3 -> 16
Jul 29 13:49:28   pp-EC    2 -> 16
Jul 29 13:49:28   pp-ED    2 -> 16
Jul 29 13:50:33   pp-EC   16 -> 2
Jul 29 13:50:33   pp-ED   16 -> 2
Jul 29 13:54:53   net-D   16 -> 3
Jul 29 13:54:53   net-C   16 -> 3

---- Scenario 3: lose nets and p-p links asynchronously!?! ----

Jul 30 04:05:59   net-D    3 -> 16
Jul 30 04:05:59   net-C    3 -> 16
Jul 30 04:07:04   net-D   16 -> 3
Jul 30 04:07:04   net-C   16 -> 3

Jul 30 07:50:03   pp-ED    2 -> 16
Jul 30 07:50:03   pp-EC    2 -> 16
Jul 30 07:57:37   pp-EC   16 -> 2
Jul 30 07:57:37   pp-ED   16 -> 2

Jul 30 08:00:52   net-D    3 -> 16
Jul 30 08:00:52   net-C    3 -> 16
Jul 30 08:12:08   net-D   16 -> 3
Jul 30 08:12:08   net-C   16 -> 3
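
For anyone wanting to compare outage lengths across the scenarios, here
is a rough sketch (not part of the original monitor) that pairs each
"metric went to 16" entry with the next recovery of the same route and
reports how long the route stayed unreachable. The function name
outages(), the regular expression, and the year 1988 (taken from the
posting date, since the log has no year) are assumptions made for
illustration. Because the monitor samples every 30 seconds, each
duration is only accurate to about one sample interval.

```python
import re
from datetime import datetime

INFINITY = 16  # RIP metric meaning "unreachable"

# Matches entries such as "Jul 29 13:48:22 net-D 3 -> 16".
ENTRY = re.compile(r"(\w{3} +\d+ \d\d:\d\d:\d\d)\s+(\S+)\s+(\d+) -> (\d+)")


def outages(log_text, year=1988):
    """Return (route, went_down, came_back, seconds) per outage.

    The year is an assumption; the log itself carries none.
    Month names are parsed with %b, so an English/C locale is assumed.
    """
    down = {}     # route name -> time its metric went to infinity
    result = []
    for match in ENTRY.finditer(log_text):
        stamp, route, old, new = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if int(new) == INFINITY:
            down[route] = when
        elif int(old) == INFINITY and route in down:
            start = down.pop(route)
            seconds = int((when - start).total_seconds())
            result.append((route, start, when, seconds))
    return result


# Scenario 2 from the log extract above.
SAMPLE = """\
Jul 29 13:48:22 net-D 3 -> 16
Jul 29 13:48:22 net-C 3 -> 16
Jul 29 13:49:28 pp-EC 2 -> 16
Jul 29 13:49:28 pp-ED 2 -> 16
Jul 29 13:50:33 pp-EC 16 -> 2
Jul 29 13:50:33 pp-ED 16 -> 2
Jul 29 13:54:53 net-D 16 -> 3
Jul 29 13:54:53 net-C 16 -> 3
"""

if __name__ == "__main__":
    for route, start, end, seconds in outages(SAMPLE):
        print(f"{route:6} down at {start:%b %d %H:%M:%S} for {seconds} s")
```

Run on Scenario 2, this reports the point-to-point links as down for
65 seconds and the two ethernets for 391 seconds, listed in the order
the routes came back.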