Path: utzoo!utgpu!water!watmath!clyde!rutgers!ames!pasteur!ucbvax!PURDUE.EDU!narten
From: narten@PURDUE.EDU (Thomas Narten)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: PSN 7 End-to-End question.
Message-ID: <8801312221.AA01892@percival.cs.purdue.edu>
Date: 31 Jan 88 22:21:14 GMT
References: <8801291809.AA01914@Pescadero>
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 133

> Presumably, the EE and Imp-to-imp protocols also consume the INTERNAL
> resources of the network while they are doing this managing.  Is there
> any evidence to assure us that these protocols are a net performance
> win over a simple, lean and mean best-efforts datagram service, which is
> all that IP/TCP wants and can use?

Total "best effort" systems work well only as long as the switches and
communications lines run below maximum capacity. Once maximum capacity
is reached or exceeded, problems arise. Many of them are solvable, but
they must be addressed, and the resulting system may no longer be "lean
and mean".

1) Best effort systems rely totally on hosts for congestion
management. That is, transport protocols are responsible for
congestion control and congestion avoidance.

In practice, existing protocols don't play by those rules. For
instance, only recently (Van Jacobson's work) have TCP implementations
started reacting (in a positive way) to congestion.  UDP-based
protocols implement no congestion control at all. I cringe at the
thought of running NFS across the Internet.

2) Without TOS priority, all datagrams are considered equal. Routing
protocols suffer just as much from congestion as other protocols, but
the consequences are much more severe.  Nowhere in the Internet (that I
know of) are the datagrams that are used for the exchange of routing
information given precedence over others.
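
To make "given precedence" concrete, here is a minimal sketch of a
gateway output queue that serves routing datagrams ahead of everything
else. The class name, the two precedence levels, and the packet labels
are all hypothetical; this illustrates the idea only and does not
describe any existing gateway.

```python
import heapq
from itertools import count

# Assumed two-level precedence: the lower number is served first.
ROUTING, DATA = 0, 1


class PrecedenceQueue:
    """Output queue that always sends routing datagrams before data."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: keeps FIFO order within a class

    def enqueue(self, precedence, packet):
        heapq.heappush(self._heap, (precedence, next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to transmit, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, packet = heapq.heappop(self._heap)
        return packet


q = PrecedenceQueue()
q.enqueue(DATA, "data-1")
q.enqueue(DATA, "data-2")
q.enqueue(ROUTING, "routing-update")
print([q.dequeue() for _ in range(3)])
```

The routing update leaves first even though it arrived last; datagrams
within the same class keep their arrival order.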

3) As John Nagle describes, congestion collapse is inevitable in
situations where transport protocols send more packets into the
network in response to congestion. Furthermore, I know of no
point-to-point networks that are designed to run steady-state at or
above maximum capacity. They all assume that the network will be
lightly loaded. Some networks (e.g. the ARPANET) take steps that
guarantee this.

4) In order for best effort systems to work well, transport protocols
must practice congestion avoidance. Congestion control deals with
congestion once it exists; congestion avoidance is aimed at keeping
congestion from ever reaching the point where the congestion control
mechanism kicks in. Congestion avoidance aims to run the network at
maximum throughput *AND* minimum delay.

Consider TCP: its window tends to open as far as it can (8 packets at
1/2K each). The network is forced to buffer the entire window of
packets.  If the two endpoints are separated by a slow-speed link,
most of the packets will be buffered there. Congestion and delay
increase.  Congestion could be reduced without reducing throughput by
decreasing the size of the window. In reality, TCP won't do anything
unless a packet is dropped, or a source quench is received. A new
mechanism is needed to distinguish high delays due to congestion
from those due to the transmission media (e.g. satellite
vs. terrestrial links).

Raj Jain's work deserves close study in this regard.
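
To put rough numbers on the window argument above, here is a minimal
sketch. The 9600 bit/s line and the 0.5 second round trip are
assumptions picked for illustration, not measurements, and the function
name is hypothetical. The model is deliberately crude: one sender, one
slow link, no loss, no header overhead.

```python
def window_effects(window_bytes, link_bytes_per_sec, base_rtt_sec):
    """Steady-state effect of a fixed window over one bottleneck link.

    Returns (throughput in bytes/sec, bytes queued at the slow link,
    round-trip time in seconds).
    """
    # Bytes the path can hold in flight with no queueing at all.
    pipe_bytes = link_bytes_per_sec * base_rtt_sec
    # Anything in the window beyond that waits in buffers at the slow link.
    queued_bytes = max(0.0, window_bytes - pipe_bytes)
    rtt_sec = base_rtt_sec + queued_bytes / link_bytes_per_sec
    throughput = window_bytes / rtt_sec
    return throughput, queued_bytes, rtt_sec


PACKET_BYTES = 512           # "1/2K" packets, as in the text
LINK_BYTES_PER_SEC = 1200.0  # assumed 9600 bit/s line
BASE_RTT_SEC = 0.5           # assumed round trip with empty queues

for packets in (8, 4, 2, 1):
    tput, queued, rtt = window_effects(
        packets * PACKET_BYTES, LINK_BYTES_PER_SEC, BASE_RTT_SEC)
    print(f"window={packets} pkts  throughput={tput:7.1f} B/s  "
          f"queued={queued:6.0f} B  rtt={rtt:5.2f} s")
```

Under these assumptions, windows of 8, 4, and 2 packets all keep the
slow link full; they differ only in how much sits queued and how long
the round trip takes. Only at a window of one packet does throughput
fall below the link rate.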

5) In the present Internet, congestion avoidance is a dream. That
pushes congestion control/avoidance into the gateways and physical
networks. In many systems, congestion control simply resorts to
dropping the packet that just arrived that can't be put anywhere. Some
of the issues:

a. Fairness: He who transmits the most packets gets the most
resources. This discourages well-tuned protocols, and encourages
antisocial behavior.

b. "Fair queuing" schemes limit the resources that a particular class
of packets can allocate. For instance, Dave Mills' selective
preemption scheme limits buffer space according to source IP
addresses.

There are fairness issues here too. All connections and protocols from
the same source are lumped into the same class. Does the "right" thing
happen when a TCP connection competes with a NETBLT connection?

c. Queuing strategies increase rather than decrease the per-packet
overhead. Furthermore, the information used to group datagrams into
classes must be readily available. In the worst case, you have to be
able to parse higher layer packet headers.

d. These queuing strategies rely entirely on local rather than global
information. It may be that 90% rather than 20% of the packets should
be discarded; a link two hops away might be even more congested. 
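
The contrast between (a) and (b) can be sketched as follows. This is a
minimal illustration of limiting buffer space by source address, not a
rendering of Dave Mills' actual scheme; the class name, the limits, and
the addresses are all hypothetical.

```python
from collections import Counter, deque


class PerSourceLimitedQueue:
    """FIFO queue that caps how many buffers any one source may hold."""

    def __init__(self, capacity, per_source_limit):
        self.capacity = capacity
        self.per_source_limit = per_source_limit
        self.queue = deque()
        self.held = Counter()  # buffers currently held, keyed by source

    def enqueue(self, src, packet):
        """Queue the packet and return True, or drop it and return False."""
        if len(self.queue) >= self.capacity:
            return False  # no buffers left at all
        if self.held[src] >= self.per_source_limit:
            return False  # this source already holds its share
        self.queue.append((src, packet))
        self.held[src] += 1
        return True

    def dequeue(self):
        """Return the oldest (src, packet) pair, or None if empty."""
        if not self.queue:
            return None
        src, packet = self.queue.popleft()
        self.held[src] -= 1
        return src, packet


q = PerSourceLimitedQueue(capacity=8, per_source_limit=4)
heavy = sum(q.enqueue("10.0.0.1", n) for n in range(10))
light = sum(q.enqueue("10.0.0.2", n) for n in range(2))
print("heavy sender accepted:", heavy, " light sender accepted:", light)
```

With a plain drop-on-arrival queue of the same size, the heavy sender's
burst would have taken all eight buffers and both of the light sender's
packets would have been dropped. Note that the sketch keys on source
address only, so it has exactly the lumping problem described in (b),
and it decides from local state only, as in (d).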

6) Because of (1) above, physical network designers should give
considerable thought to congestion control/avoidance.

The ARPANET practices congestion avoidance. That is one of the biggest
reasons that one of the oldest networks, based on "old" and "obsolete"
technology, still works extremely well in today's environment.  People
should be much more careful in distinguishing between the ARPANET and
the Internet.  For instance, "the ARPANET is congested" usually really
means that the gateways connected to it are congested, or that the
Internet routing mechanism has broken down.

My understanding of ARPANET internals is as follows: Before packets
can be sent, an end-to-end virtual circuit (VC) is opened to the
destination IMP. I call this type of VC "weak", because it provides
VC services, but is actually implemented as a sliding window
protocol (roughly) similar to
TCP.  "Strong" VCs refer to those in which buffers and routes are
preallocated, and packet switches contain state information about the
circuits that pass through them.  Inside the ARPANET, IP packets are
fragmented into small 200+ byte datagrams that are sent through the
network using best effort delivery. The destination IMP reassembles
them and sends back an acknowledgement that advances the window.

The number of packets in the network at any given time for any given
src/dest IMP pair is limited. This essentially limits the total number
of packets in the network at any one time, resulting in one form of
congestion avoidance. Presumably the window size (8 IP packets) has
been chosen based on extensive engineering considerations.
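
As a rough sketch of why a per-pair window bounds the load on the whole
network, consider the following. It follows my description above rather
than the real IMP code, which is surely more involved; the class and
method names are hypothetical, and the window of 8 is the figure quoted
above.

```python
class PairWindow:
    """Limit outstanding packets per (source IMP, destination IMP) pair."""

    def __init__(self, window=8):
        self.window = window
        self.outstanding = {}  # (src, dst) -> packets sent but not yet acked

    def try_send(self, src, dst):
        """Admit one packet if the pair's window is open; else refuse."""
        key = (src, dst)
        in_flight = self.outstanding.get(key, 0)
        if in_flight >= self.window:
            return False  # the source must wait for an acknowledgement
        self.outstanding[key] = in_flight + 1
        return True

    def ack(self, src, dst):
        """The destination IMP acknowledged one packet; advance the window."""
        key = (src, dst)
        if self.outstanding.get(key, 0) > 0:
            self.outstanding[key] -= 1

    def in_network(self):
        """Total packets currently admitted across all pairs."""
        return sum(self.outstanding.values())


w = PairWindow(window=8)
admitted = sum(w.try_send("imp-a", "imp-b") for _ in range(12))
print("admitted:", admitted, " in network:", w.in_network())
```

However hard a host pushes, the pair never has more than `window`
packets inside the network, so the total is bounded by the number of
active pairs times the window.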

This scheme also raises the same fairness issues described above. For
instance, should (or shouldn't) the NSFnet/ARPANET gateways be able to
get more resources than site X?

Of course, total best effort systems have advantages over other
schemes. One is their relative simplicity, and the loose coupling
among gateways and packet switches. Another is the ability of one user
to grab a large percentage of all available network resources.
Although this is considered a disadvantage when the user is a broken
TCP implementation, it is necessary if a user is to expect good
performance running a well-tuned bulk transfer protocol (e.g. NETBLT).

>   What is the best reference to understand how these protocols manage the
> network resources, particularly in dealing with network congestion?
> Thanks,
> David Cheriton

I too am interested in further references, especially those relating
to best effort systems.

Thomas Narten