Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!brl-smoke!smoke!PKARP@SRI-IU.ARPA
From: PKARP@SRI-IU.ARPA (Peter Karp)
Newsgroups: net.mail.headers
Subject: Mail looping
Message-ID: <1231@brl-smoke.ARPA>
Date: Sat, 22-Feb-86 14:34:08 EST
Article-I.D.: brl-smok.1231
Posted: Sat Feb 22 14:34:08 1986
Date-Received: Mon, 24-Feb-86 08:14:07 EST
Sender: news@brl-smoke.ARPA
Lines: 61

I believe I have at least a good theoretical understanding of how
to prevent mail loops.  In the previous messages on this topic it
hasn't been clear to me how one could prevent mail loops even in
theory, or whether it is possible at all.  If you all buy the
argument below, then we can talk about the implementation later.

I apologize if this is obvious to everyone; it wasn't obvious to me and on
re-consideration of the messages I've seen it still doesn't appear obvious.

Consider the following example.  A message originates on Host-A, and is
sent to a mailing list called "LIST@Host-B".  One element of LIST on Host-B
is the address SUB-LIST-1@Host-C.  An element of SUB-LIST-1 on Host-C is
SUB-LIST-2@Host-B, from which it gets distributed to various individuals.
Let us postulate that the original message was also sent to "USER@Host-B",
and that "USER@Host-B" is also a member of SUB-LIST-1.  Thus,
"USER@Host-B" should receive two copies of the message: one direct from
Host-A, and one with return path: @Host-C,@Host-B:Originator@Host-A.

Notice that the "same message" gets routed through Host-B several times,
and that it would be incorrect for Host-B to think it has detected a loop
simply based on the Message-ID created by the message originator (this has
been pointed out before).  Also note that the "same message" gets sent to
the same recipient (USER@Host-B) twice, and it would also be
incorrect for Host-B to suppress the duplicate simply because it sees two
messages with the same Message-ID going to the same recipient.  Both of
these conditions look like loops but are not.


So, what's the solution?  Consider an abstraction.  Imagine that a mail
message is simply a packet getting switched between the nodes of a
network. Mail packets are special in that any one packet can be duplicated
into several other packets at other nodes as mailing lists are expanded.
These child packets then go their own way in the network.  There are two
structures of interest here.  One is the path a given packet follows through
the network. The other is the mailing-list-based packet-synthesis tree
which shows how one initial message packet gets duplicated into a whole
swarm of child packets which eventually get either dropped on the floor or
land in someone's mailbox.
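
To make the two structures concrete, here is a minimal sketch in Python.
Everything in it is an assumption made for illustration: the class name
Packet, the method names, and the simplification that each packet carries
exactly one recipient.  The "path" list is the first structure (where one
packet has been); the parent/children links are the second (the
packet-synthesis tree).

```python
class Packet:
    """One copy of a message, bound for a single recipient (an assumption)."""

    def __init__(self, recipient, parent=None):
        self.recipient = recipient   # e.g. "LIST@Host-B"
        self.parent = parent         # edge in the packet-synthesis tree
        self.children = []           # packets created when a list expands
        self.path = []               # nodes this packet itself has visited

    def arrive(self, node):
        """Record that this packet has reached `node`."""
        self.path.append(node)

    def expand(self, members):
        """Duplicate this packet into one child per list member."""
        kids = [Packet(m, parent=self) for m in members]
        self.children.extend(kids)
        return kids

    def ancestors(self):
        """Yield the parent, grandparent, ... up to the initial packet."""
        p = self.parent
        while p is not None:
            yield p
            p = p.parent


# The mailing-list copy from the Host-A/Host-B/Host-C example.
to_list = Packet("LIST@Host-B")
to_list.arrive("Host-B")
(to_sub1,) = to_list.expand(["SUB-LIST-1@Host-C"])
to_sub1.arrive("Host-C")
to_sub2, to_user = to_sub1.expand(["SUB-LIST-2@Host-B", "USER@Host-B"])
to_user.arrive("Host-B")
```

Note that a child starts with an empty path of its own; where its
ancestors have been is found by walking up the tree.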

A loop condition occurs when a packet P with the following properties has
arrived at a network node:
	a) Either that same packet or one of its ancestors in the
	packet-synthesis tree, call it P', has been to that node before.

	b) Packet P and packet P' were both addressed to the same
	recipient.
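
Conditions (a) and (b) can be written down as a single test.  This is
only a sketch with omniscient knowledge of the tree, not something a real
host could run as is; the names Packet, lineage and is_loop are made up
for the illustration, and each packet is assumed to carry one recipient.

```python
class Packet:
    """Hypothetical packet record: one recipient, a parent, visited nodes."""

    def __init__(self, recipient, parent=None):
        self.recipient = recipient
        self.parent = parent
        self.visited = []            # nodes this packet has already reached


def lineage(packet):
    """Yield the packet itself, then each ancestor up to the root."""
    while packet is not None:
        yield packet
        packet = packet.parent


def is_loop(packet, node):
    """True if some P' in the lineage was at `node` (condition a)
    and was addressed to the same recipient as `packet` (condition b).
    Call this before recording the new arrival in packet.visited."""
    return any(node in q.visited and q.recipient == packet.recipient
               for q in lineage(packet))


# The non-loop from the example: USER@Host-B reached by way of the lists.
to_list = Packet("LIST@Host-B")
to_list.visited.append("Host-B")
to_sub1 = Packet("SUB-LIST-1@Host-C", parent=to_list)
to_sub1.visited.append("Host-C")
to_user = Packet("USER@Host-B", parent=to_sub1)
print(is_loop(to_user, "Host-B"))    # False: ancestor had another recipient

# A real loop, supposing SUB-LIST-1 also contained LIST@Host-B.
back = Packet("LIST@Host-B", parent=to_sub1)
print(is_loop(back, "Host-B"))       # True: ancestor to_list matches
```

The direct copy to USER@Host-B is a sibling of the list copy under this
model, not an ancestor, so it never enters the test for the second copy.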

How can a host detect a loop condition?  Simple: when it relays a packet
it puts a mark on that packet which it will recognize if that packet or
any of its descendants ever arrives at that host again.  It must also record
which recipients the packet was destined for, and check all incoming
packets to determine if it has seen them or an ancestor of theirs before,
addressed to the same recipient(s).
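
A rough sketch of that bookkeeping, under stated assumptions: the Host
class, its accept method, and the "name#number" mark format are all
invented here, and how the marks would actually travel with a message is
left open.  The only property relied on is that a child packet carries a
copy of its parent's marks, so a host can recognize its own mark on any
descendant.

```python
import itertools


class Host:
    """Hypothetical relay that marks packets and remembers recipients."""

    def __init__(self, name):
        self.name = name
        self._counter = itertools.count(1)
        self._seen = {}              # mark issued here -> recipient at the time

    def accept(self, marks, recipient):
        """Process one arriving packet.

        `marks` is the tuple of marks the packet carries (inherited from
        its ancestors).  Returns the marks to send onward, or None when
        one of this host's own marks was issued for the same recipient,
        which is the loop condition."""
        for mark in marks:
            if self._seen.get(mark) == recipient:
                return None
        mark = f"{self.name}#{next(self._counter)}"
        self._seen[mark] = recipient
        return marks + (mark,)


host_b, host_c = Host("Host-B"), Host("Host-C")
marks = host_b.accept((), "LIST@Host-B")          # first pass through Host-B
marks = host_c.accept(marks, "SUB-LIST-1@Host-C")

# Second copy for USER@Host-B arrives by way of the lists: allowed.
print(host_b.accept(marks, "USER@Host-B") is not None)   # True
# Same lineage comes back addressed to LIST@Host-B again: a loop.
print(host_b.accept(marks, "LIST@Host-B") is None)       # True
```

Marks issued by other hosts are simply not found in the table and are
ignored, so each host only needs to remember its own.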


So again, if this looks right we can start worrying about implementation.


Peter
-------