Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site ucbvax.ARPA
Path: utzoo!watmath!clyde!burl!ulysses!ucbvax!HEDRICK@RED.RUTGERS.EDU
From: HEDRICK@RED.RUTGERS.EDU (Charles Hedrick)
Newsgroups: fa.tcp-ip
Subject: interesting loop
Message-ID: <8508300028.AA05209@UCB-VAX.ARPA>
Date: Thu, 29-Aug-85 18:52:08 EDT
Article-I.D.: UCB-VAX.8508300028.AA05209
Posted: Thu Aug 29 18:52:08 1985
Date-Received: Sat, 31-Aug-85 05:48:50 EDT
Sender: daemon@ucbvax.ARPA
Reply-To: tcp-ip@ucb-vax.berkeley.edu
Organization: The ARPA Internet
Lines: 99

We just got our IP network into a loop. It's clearly my fault, but the problem is subtle enough that I thought it might be useful for me to point it out to others.

It is important to understand our network configuration. We have two Ethernets, 128.6.3 and 128.6.4. For readability, I am going to drop the 128.6 and refer to them as networks 3 and 4. They are connected by two gateways.

One of them is a Pyramid 90x (4.2 Unix). It is configured to act as a normal IP gateway. Because not all of our systems know about subnetting, it also does the "ARP hack". Suppose host 128.6.4.2 (which I will refer to as 4.2, since I am dropping 128.6) wants to talk to host 3.1. It will send out an ARP:

        from 4.2, broadcast, who is 3.1?

The gateway will see this ARP request. It will check its tables, realize that the sender needs gatewaying to talk to the target, and that it is prepared to act as that gateway. So the gateway will respond

        from gateway, to 4.2, 3.1 is [gateway's Ethernet address]

This is a lie, since it is claiming that its own Ethernet address is 3.1's. But the lie is useful, since it will cause 4.2 to send its packets to the gateway, which will deliver them.

The second gateway is an Applitek Ethernet bridge. This is a broadband cable system onto which you can put Ethernet bridges. They are not IP gateways. Instead, they are "transparent" Ethernet-level gateways.
Each bridge runs in promiscuous mode, looking at every packet on its Ethernet. When it sees any packet addressed to someone on a different Ethernet, it sends the packet over the broadband to the bridge on the appropriate Ethernet. It makes no changes to the packet. This is totally invisible to the hosts. The hosts think we just have one big Ethernet.

Now, consider what happens when 4.2 wants to talk to 3.1. It will send out an ARP

        from 4.2, broadcast, who is 3.1?

The bridge on network 4 will pass this to the bridge on 3, which will then broadcast it. 3.1 will see it, and respond

        from 3.1 to 4.2, 3.1 is [3.1's Ethernet address]

The bridge will pass this packet back to network 4, whose bridge will send it to 4.2. The connection is now set up, and all of the packets will follow this same path.

Now for the fun. Consider what will happen when a host wants to talk to another host on the same network:

        from 4.2, broadcast, who is 4.3?

First, 4.3 will see this, and respond

        from 4.3 to 4.2, 4.3 is [4.3's Ethernet address]

However, the Applitek bridge will pick up the original broadcast and repeat it on all of the other subnets. (Since the bridges are protocol-independent, they know nothing about Internet addresses. They have no way to know that the ARP will be satisfied locally. Thus they forward all broadcasts to all subnets. This is moderately reasonable.) In particular, it will repeat it on network 3. Our Pyramid gateway will see this request. Since it is an ARP on network 3 looking for a host on network 4, the Pyramid will offer to gateway:

        from gateway to 4.2, 4.3 is [gateway's Ethernet address on network 3]

The Applitek bridge will pick this up on network 3 and pass it back to 4, where it gets sent back to 4.2. If 4.2 is a Unix system, it will believe the last ARP reply that it sees. So we now have an ARP entry:

        4.3     gateway's address on network 3

The funny thing is, this may work. Packets destined for 4.3 will be sent to the gateway's other side. The Applitek bridge will see an Ethernet address on the other subnet, and forward it.
The gateway will get the packet, and forward it back to network 4, where it will presumably be delivered to the correct place.

However, the gateway itself is not immune from this problem (at least not with the code I have in it now). When the gateway attempts to talk to 4.3, the ARP packet will again be forwarded to the other subnet, and the gateway itself will respond. Thus the gateway will end up with an ARP table entry containing

        4.3     gateway's own address on network 3

At this point we have a loop.

Obviously the simplest answer is that as long as the Applitek system is working, we turn off the Pyramid gateway code. However, we would like that code to be available as a backup should the Applitek system go down. It begins to appear that we are going to have to check specifically for this sort of situation. However, in a complex network topology, it may not be entirely clear who one should and should not be willing to gateway for. It is possible that the real moral is that "transparent" gateways and IP gateways may have trouble coexisting.

Let me hasten to point out that both the subnetting implementation and the "ARP hacking" code on the Pyramid are mine. They should not be blamed on either Berkeley or Pyramid.
-------
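
The same-network case described in the post can be sketched as a small simulation. This is only an illustrative model under stated assumptions, not the code running on the Pyramid: the function names, the address labels such as gw-mac-net3, and the rule that the gateway's reply always arrives after the real host's reply are all hypothetical. The second function shows one possible check (refusing to proxy when sender and target are on the same network); as the post itself warns, such a check may not be enough in a more complex topology.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical labels for the gateway's Ethernet address on each attached network.
GATEWAY_MACS = {"3": "gw-mac-net3", "4": "gw-mac-net4"}


def subnet(ip: str) -> str:
    """Return the network number of a shortened address such as '4.2'."""
    return ip.split(".")[0]


@dataclass(frozen=True)
class ArpRequest:
    sender_ip: str
    target_ip: str
    seen_on: str  # network on which the gateway hears the broadcast


Proxy = Callable[[ArpRequest], Optional[str]]


def naive_proxy_reply(req: ArpRequest) -> Optional[str]:
    """Assumed form of the ARP hack: answer any request heard on one attached
    network for a target on a different attached network."""
    target_net = subnet(req.target_ip)
    if target_net in GATEWAY_MACS and target_net != req.seen_on:
        return GATEWAY_MACS[req.seen_on]
    return None


def guarded_proxy_reply(req: ArpRequest) -> Optional[str]:
    """One possible extra check: never proxy when sender and target share a network."""
    if subnet(req.sender_ip) == subnet(req.target_ip):
        return None
    return naive_proxy_reply(req)


def resolve(sender_ip: str, target_ip: str, proxy: Proxy) -> str:
    """Return the address the sender caches after one ARP exchange.

    Assumptions: the bridge repeats the broadcast on every network, the real
    target answers first, and the sender keeps the last reply it sees.
    """
    cached = f"mac-of-{target_ip}"      # reply from the real target
    for net in GATEWAY_MACS:            # gateway hears the broadcast on each network
        reply = proxy(ArpRequest(sender_ip, target_ip, seen_on=net))
        if reply is not None:
            cached = reply              # a later reply overwrites the earlier one
    return cached


if __name__ == "__main__":
    print(resolve("4.2", "4.3", naive_proxy_reply))
    print(resolve("4.2", "4.3", guarded_proxy_reply))
```

With the naive rule, 4.2 ends up caching the gateway's network 3 address for 4.3, which is the bad ARP entry described in the post. With the extra check, the gateway stays silent and the real reply from 4.3 survives.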