Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site ucbvax.ARPA
Path: utzoo!watmath!clyde!burl!ulysses!ucbvax!HEDRICK@RED.RUTGERS.EDU
From: HEDRICK@RED.RUTGERS.EDU (Charles Hedrick)
Newsgroups: fa.tcp-ip
Subject: interesting loop
Message-ID: <8508300028.AA05209@UCB-VAX.ARPA>
Date: Thu, 29-Aug-85 18:52:08 EDT
Article-I.D.: UCB-VAX.8508300028.AA05209
Posted: Thu Aug 29 18:52:08 1985
Date-Received: Sat, 31-Aug-85 05:48:50 EDT
Sender: daemon@ucbvax.ARPA
Reply-To: tcp-ip@ucb-vax.berkeley.edu
Organization: The ARPA Internet
Lines: 99

We just got our IP network into a loop.  It's clearly my fault, but the
problem is subtle enough that I thought it might be useful for me to
point it out to others.  It is important to understand our network
configuration.  We have two Ethernets, 128.6.3 and 128.6.4.  For
readability, I am going to drop the 128.6 and refer to them as
networks 3 and 4.  They are connected by two gateways.  One of them is a
Pyramid 90x (4.2 Unix).  It is configured to act as a normal IP gateway.
Because not all of our systems know about subnetting, it also does the
"ARP hack".  Suppose host 128.6.4.2 (which I will refer to as
4.2, since I am dropping 128.6) wants to talk to host 3.1. It will send
out an ARP:

  from 4.2, broadcast, who is 3.1?

The gateway will see this ARP request.  It will check its tables,
realize that the sender needs gatewaying to talk to the target, and that
it is prepared to act as that gateway.  So the gateway will respond

  from gateway, to 4.2, 3.1 is <gateway's Ethernet address>

This is a lie, since it is claiming that its own Ethernet address is
3.1's.  But the lie is useful, since it will cause 4.2 to send its
packets to the gateway, which will deliver them.
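
For concreteness, the gateway's decision can be sketched roughly as
below.  This is only an illustration, not the Pyramid code: the names
(GATEWAY_MAC, subnet_of, proxy_arp_reply) and the placeholder Ethernet
addresses are made up, and it assumes the gateway compares the target's
subnet with the interface the request arrived on.

```python
# Hypothetical sketch of the "ARP hack" decision; not the actual gateway code.
# Addresses are written with the 128.6 prefix dropped, as in the text.

# Placeholder Ethernet addresses for the gateway's two interfaces.
GATEWAY_MAC = {3: "gw-mac-on-net-3", 4: "gw-mac-on-net-4"}


def subnet_of(addr):
    """Return the subnet number of a shortened address such as '4.2'."""
    return int(addr.split(".")[0])


def proxy_arp_reply(arrival_net, sender, target):
    """Return the Ethernet address the gateway answers with, or None.

    Assumption: the gateway answers when the target is on a different
    subnet from the interface the request arrived on, and the gateway
    has an interface on the target's subnet.
    """
    target_net = subnet_of(target)
    if target_net != arrival_net and target_net in GATEWAY_MAC:
        # The "lie": claim our own address on the requester's Ethernet.
        return GATEWAY_MAC[arrival_net]
    return None


print(proxy_arp_reply(4, "4.2", "3.1"))  # request for a host on the other net
print(proxy_arp_reply(4, "4.2", "4.3"))  # request for a local host
```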

The second gateway is an Applitek Ethernet bridge.  This is a broadband
cable system onto which you can put Ethernet bridges.  They are not
IP gateways.  Instead, they are "transparent" Ethernet-level gateways.
Each bridge runs in promiscuous mode, looking at every packet on its
Ethernet.  When it sees any packet addressed to someone on a different
Ethernet, it sends the packet over the broadband to the bridge on
the appropriate Ethernet.  It makes no changes to the packet.  This
is totally invisible to the hosts.  The hosts think we just have one
big Ethernet.  Now, consider what happens when 4.2 wants to talk to
3.1.  It will send out an ARP:

  from 4.2, broadcast, who is 3.1?

The bridge on network 4 will pass this to the bridge on 3, which will
then broadcast it.  3.1 will see it, and respond

  from 3.1 to 4.2, 3.1 is <its Ethernet address>

The bridge will pass this packet back to network 4, whose bridge will
send it to 4.2.  The connection is now set up, and all of the packets
will follow this same path.
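
The bridges' forwarding rule can be sketched as follows.  This is a
guess at the behavior described here, not Applitek's implementation:
the station table, the placeholder Ethernet addresses and the function
name forward_to are all assumptions, as is the choice to leave frames
for unknown stations alone.

```python
# Hypothetical sketch of a protocol-independent bridge's forwarding rule.
BROADCAST = "ff:ff:ff:ff:ff:ff"

# Assumed table: which Ethernet each known station address sits on.
STATION_NET = {"mac-of-4.2": 4, "mac-of-4.3": 4, "mac-of-3.1": 3}
ALL_NETS = {3, 4}


def forward_to(arrival_net, dest_mac):
    """Return the set of other Ethernets a frame is repeated on."""
    if dest_mac == BROADCAST:
        # The bridge cannot tell who will answer, so it repeats everywhere else.
        return ALL_NETS - {arrival_net}
    dest_net = STATION_NET.get(dest_mac)
    if dest_net is None or dest_net == arrival_net:
        # Local (or, by assumption, unknown) destination: nothing to do.
        return set()
    return {dest_net}


print(forward_to(4, BROADCAST))      # the ARP request from 4.2
print(forward_to(3, "mac-of-4.2"))   # 3.1's reply to 4.2
```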

Now for the fun.  Consider what will happen when a host wants to
talk to another host on the same network:

  from 4.2, broadcast, who is 4.3?

First, 4.3 will see this, and respond

  from 4.3 to 4.2, 4.3 is <its Ethernet address>

However the Applitek bridge will pick up the original broadcast and
repeat it on all of the other subnets.  (Since the bridges are
protocol-independent, they know nothing about Internet addresses. They
have no way to know that the ARP will be satisfied locally. Thus they
forward all broadcasts to all subnets.  This is moderately reasonable.)
In particular, it will repeat it on network 3.  Our Pyramid gateway will
see this request.  Since it is an ARP on network 3 looking for a host on
network 4, the Pyramid will offer to gateway:

  from gateway to 4.2, 4.3 is <gateway's Ethernet address on network 3>

The Applitek bridge will pick this up on network 3 and pass it back to
4, where it gets sent back to 4.2.  If 4.2 is a Unix system, it will
believe the last ARP reply that it sees.  So we now have an ARP entry:

   4.3  gateway's address on network 3
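
The "last reply wins" behavior is what lets the bad entry replace the
good one.  A minimal sketch, assuming a cache keyed by IP address (the
names arp_cache and on_arp_reply and the placeholder Ethernet addresses
are illustrative only):

```python
# Hypothetical sketch of an ARP cache that believes the last reply it sees.
arp_cache = {}


def on_arp_reply(ip, mac):
    """Record a reply, overwriting any earlier entry for the same IP."""
    arp_cache[ip] = mac


# 4.3 answers directly on network 4 first...
on_arp_reply("4.3", "mac-of-4.3")
# ...then the gateway's reply arrives after its trip across the bridge.
on_arp_reply("4.3", "gw-mac-on-net-3")

print(arp_cache["4.3"])  # the later reply has replaced the correct one
```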

The funny thing is, this may work.  Packets destined for 4.3 will be
sent to the gateway's other side.  The Applitek bridge will see an
Ethernet address on the other subnet, and forward it. The gateway will
get the packet, and forward it back to network 4, where it will
presumably be delivered to the correct place. However the gateway itself
is not immune from this problem (at least not with the code I have in it
now).  When the gateway attempts to talk to 4.3, the ARP packet will
again be forwarded to the other subnet, and the gateway itself will
respond.  Thus the gateway will end up with an ARP table entry:

   4.3  gateway's own address on network 3

At this point we have a loop.
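
To see the loop, one can trace a packet for 4.3 through the gateway
once that entry is in place.  The sketch below is a toy model under
stated assumptions: the placeholder Ethernet addresses, the
MAC_LOCATION table standing in for the bridges' knowledge, and the
trace function are all invented for illustration, and the TTL is there
only so that the trace terminates.

```python
# Hypothetical sketch: follow a packet for 4.3 through the gateway.
GW_MAC_NET3 = "gw-mac-on-net-3"   # placeholder Ethernet addresses
HOST_43_MAC = "mac-of-4.3"

# Which Ethernet each address lives on, as far as the bridges know.
MAC_LOCATION = {GW_MAC_NET3: 3, HOST_43_MAC: 4}


def trace(gateway_arp, ttl=5):
    """Trace a packet the gateway sends on network 4 toward IP 4.3."""
    hops = []
    while ttl > 0:
        dest_mac = gateway_arp["4.3"]   # gateway resolves 4.3
        hops.append(f"gateway sends on net 4 to {dest_mac}")
        if MAC_LOCATION[dest_mac] == 4:
            hops.append("delivered to 4.3")
            return hops
        # The bridge sees an address on network 3 and carries the frame
        # over, where the gateway's own interface picks it up again.
        hops.append("bridge forwards net 4 -> net 3")
        ttl -= 1                         # gateway forwards it once more
    hops.append("TTL expired: packet dropped")
    return hops


for hop in trace({"4.3": GW_MAC_NET3}):
    print(hop)
```

With a correct entry (4.3 mapped to 4.3's own Ethernet address) the
same trace ends after one hop.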

Obviously the simplest answer is that as long as the Applitek system
is working, we turn off the Pyramid gateway code.  However we would
like that code to be available as a backup should the Applitek system
go down.  It begins to appear that we are going to have to check
specifically for this sort of situation.  However in a complex network
topology, it may not be entirely clear who one should and should not
be willing to gateway for.  It is possible that the real moral is
that "transparent" gateways and IP gateways may have trouble
coexisting.
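
One possible form of such a check, for this two-subnet case only, is to
refuse to answer when the sender and the target are already on the same
subnet.  This is a sketch under that assumption, not the fix actually
installed; subnet_of, should_proxy and gateway_nets are hypothetical
names, and the rule may well be too simple for a more complex topology.

```python
# Hypothetical sketch of a guard for the two-subnet case described above.

def subnet_of(addr):
    """Return the subnet number of a shortened address such as '4.2'."""
    return int(addr.split(".")[0])


def should_proxy(arrival_net, sender, target, gateway_nets=(3, 4)):
    """Decide whether the gateway should answer an ARP request on a
    host's behalf."""
    target_net = subnet_of(target)
    if target_net == arrival_net or target_net not in gateway_nets:
        return False    # target is local, or not reachable through us
    if subnet_of(sender) == target_net:
        return False    # sender and target share a subnet: the request
                        # only reached us because a bridge repeated it
    return True


print(should_proxy(4, "4.2", "3.1"))  # genuine cross-subnet request
print(should_proxy(3, "4.2", "4.3"))  # bridged copy of a local request
```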

Let me hasten to point out that both the subnetting implementation
and the "ARP hacking" code on the Pyramid are mine.  They should not
be blamed on either Berkeley or Pyramid.
-------