Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!usc!elroy.jpl.nasa.gov!decwrl!sgi!cjohnson@somni.wpd.sgi.com
From: cjohnson@somni.wpd.sgi.com (Chris Johnson)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: IP Bandwidth limits (Closing? salvo)
Summary: Facts is Facts - IP is a bottleneck
Message-ID: <81548@sgi.sgi.com>
Date: 17 Jan 91 04:12:36 GMT
References: <9101151425.AA03304@pecan29.cray.com>
Sender: guest@sgi.sgi.com
Reply-To: cjohnson@pei.com (Chris Johnson)
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 115

It is always interesting to see the level of response two sentences
can generate.  I got a mailbox full from many Internet mavens and
maven wannabes.

I said (paraphrased slightly): 
	The data rate limit for TCP/IP isn't window size dependent.

The responses to this generally said "there is a window size bandwidth
limit", which in turn caused a flurry of "no there isn't a limit" notes.

I should have said:

	There are performance bottlenecks in TCP as specified in
	rfc793, but TCP options have been developed that address
	these problems.  As rfc1072 and rfc1185 point out, data
	rates over certain thresholds *require* extensions to
	rfc793.  To this extent, interoperability among vendors at
	data rates beyond the FDDI range is only possible if these
	TCP "options" become requirements, or if other mechanisms
	are developed and standardized that address the problems
	discussed in rfc1072 and rfc1185.

	Note that these options are not mentioned in the host
	requirement doc, so as of today TCP has bandwidth limits
	in the FDDI range.  Certain implementations exceed these
	limits using the optional extensions to the base protocol.

Of course this will bother the mavens, too, but it is accurate.

I also said:
	The sixteen bit IP id field and the 16 bit max packet
	length limit a particular connection to 4GB/255 seconds
	or about 16MB/sec.

The responses to this were more entertaining (all paraphrased):

	Some suggested data to the contrary:
		> What about Famous Person at Acme Data Co who got
		> umpteen gigaunits per picoblip?

The point here is that IP ids in SVR4 and BSD come from a
free-running counter:
	ip->ip_id = htons(ip_id++);
Nothing checks whether an id is still in flight before it is reused,
which is incorrect according to the IP spec.  So the high data rates
probably come from non-conforming IP implementations.  Perhaps the
designers decided that a subset of IP was all that people needed,
but that design decision never seems to get mentioned.  As data rates
go toward terabytes/sec, the above bug will get more severe.
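
To put a number on how soon a free-running counter comes back around,
here is a rough sketch.  The function name and parameters are mine,
not from any real stack, and it assumes all traffic goes to a single
destination and every datagram is the same size.

```c
#include <assert.h>

/* Number of distinct values in the 16-bit IP identification field. */
#define IP_ID_VALUES 65536.0

/*
 * Seconds until a free-running 16-bit id counter wraps and starts
 * reissuing ids, for a sender pushing bytes_per_sec in datagrams of
 * packet_bytes each.  Hypothetical helper, for illustration only.
 */
double id_wrap_seconds(double bytes_per_sec, double packet_bytes)
{
    assert(bytes_per_sec > 0.0 && packet_bytes > 0.0);
    return IP_ID_VALUES * packet_bytes / bytes_per_sec;
}
```

At the FDDI line rate of 100 Mbit/s (12.5e6 bytes/sec) with 4096-byte
datagrams, id_wrap_seconds(12.5e6, 4096.0) comes out near 21.5
seconds, so ids repeat well inside a 30 second lifetime.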

	Some argued my choice of parameters:
		> No one uses a ttl of 255

Of course, using a smaller ttl will move the bottleneck, but that
has its limit.  Also, in reality very few media support 64k packets,
so the "real world" cases modify both the numerator and denominator
of the ratio.  Do your own calculations for your favorite numbers.
FDDI (4KB packets)  with a ttl of 30 yields a maximum of 8.9MB/sec.
Even less if you can't fill every packet to the brim.
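
For anyone who wants to plug in their own numbers, the arithmetic can
be sketched as below.  The function name is made up for this post.
It assumes the ttl is read as a lifetime in seconds, that every
datagram carries packet_bytes of data, and it ignores header overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values in the 16-bit IP identification field. */
#define IP_ID_SPACE 65536ULL

/*
 * Upper bound, in bytes per second, on what one
 * (src host, dest host, protocol) triple can send if no id may be
 * reused while an older datagram carrying it could still be alive.
 *
 * packet_bytes: bytes per datagram (the 16-bit length caps it at 65535)
 * lifetime_sec: assumed datagram lifetime, here the ttl in seconds
 */
uint64_t ip_id_rate_limit(uint32_t packet_bytes, uint32_t lifetime_sec)
{
    assert(packet_bytes > 0 && packet_bytes <= 65535);
    assert(lifetime_sec > 0);
    return IP_ID_SPACE * packet_bytes / lifetime_sec;
}
```

ip_id_rate_limit(65535, 255) gives 16842752 bytes/sec, the 16MB/sec
figure, and ip_id_rate_limit(4096, 30) gives 8947848 bytes/sec, the
FDDI figure.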

	Higher levels can detect it:
		> So what?  Mis-reassembled IP packets will be
		> detected by checksum failure, window range tests
		> or rfc1185 sequence wrap check.

The suggestion that TCP will detect the problem relaxes the IP
specification from general purpose routing and fragmentation, to
routing and fragmentation in the presence of higher layer fragment
assembly validation checks.  Of course IP implementations are
still broken, just not too broken for TCP.

	TCP should do routing:
		> TCP should use path mtu discovery and then
		> fragging is irrelevant.

This says that IP isn't broken because TCP should know about
routing/fragmentation tasks.  This is an appealing argument because
it addresses the actual flaw (that IP fragging is brain dead) but it
violates layering in a particularly violent manner.  And again, it
ignores the fact that other layers may be using IP's services.

	And the last category, the INET Jihad:
		> How dare you complain about items designed by
		> your betters.
		>
		> Don't you realize that tcp-ip is fighting the
		> forces of darkness and these petty complaints
		> just help OSI.
		>
		> These issues shouldn't be discussed in public
		> forums because naive users get confused.

The religious arguments were the most entertaining, but had the
least content.  

Here's what I should have said in my first message:
	There is another bottleneck at the IP layer that is
	unresolved as yet.  The spec in rfc791 *requires* that
	the (IPid, protocol, src host, dest host) quadruple is
	unique for an MPL.  At this time, most reference (SVR4 and
	BSD-reno) IP implementations DO NOT enforce this
	restriction, which may result in data corruption at the IP
	layer in rare cases at sufficiently high data rates.
	Some of these errors may be detected at the transport
	layer, but scenarios can be defined in which application
	layers will receive stale or mangled data.

Finally, I think this discussion is overkill on an issue that
falls out of simple math from the protocol definition.  But
for some reason a single sentence was inadequate, so here is
a longer analysis.  As the summary says, facts is facts, so IP
has a data limit even if your implementation doesn't.  In
particular, if your IP implementation doesn't have a rate
limit, then it isn't 100% compliant with rfc791.
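
To make that last point concrete, here is a minimal sketch of an id
allocator that does enforce uniqueness over an MPL.  It is not taken
from SVR4, BSD or any other implementation.  The struct, the function
names and the one-second clock are all assumptions of mine, and it
tracks a single (src host, dest host, protocol) triple.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ID_SPACE 65536u

/* Hypothetical allocator state for one (src, dest, protocol) triple. */
struct id_alloc {
    uint16_t next;                 /* next candidate id */
    uint32_t last_used[ID_SPACE];  /* time each id was last issued, in seconds */
    uint8_t  used[ID_SPACE];       /* nonzero once an id has been issued */
    uint32_t mpl_sec;              /* assumed maximum packet lifetime */
};

void id_alloc_init(struct id_alloc *a, uint32_t mpl_sec)
{
    assert(a != NULL);
    memset(a, 0, sizeof *a);
    a->mpl_sec = mpl_sec;
}

/*
 * Issue the next id, or return -1 if that id was issued less than
 * mpl_sec seconds ago and so might still be in the network.
 * A caller that gets -1 has to wait, and that wait is the rate limit.
 */
int32_t id_alloc_next(struct id_alloc *a, uint32_t now_sec)
{
    assert(a != NULL);
    uint16_t id = a->next;
    if (a->used[id] && now_sec - a->last_used[id] < a->mpl_sec)
        return -1;
    a->used[id] = 1;
    a->last_used[id] = now_sec;
    a->next = (uint16_t)(id + 1);
    return id;
}
```

Once all 65536 ids have gone out inside one MPL, id_alloc_next()
refuses to hand out another until the oldest one has aged out.  The
free-running counter never refuses, which is why it has no limit.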

Keep those cards and letters coming in,
					cj*