Path: utzoo!utgpu!watserv1!watmath!att!tut.cis.ohio-state.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!ames!sgi!rpw3@rigden.wpd.sgi.com
From: rpw3@rigden.wpd.sgi.com (Rob Warnock)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: More on TCP Performance Limits
Message-ID: <81033@sgi.sgi.com>
Date: 14 Jan 91 03:45:26 GMT
References: <9101111221.AA08912@garuda.sics.se>
Sender: guest@sgi.sgi.com
Reply-To: rpw3@sgi.com (Rob Warnock)
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 74

In article <9101111221.AA08912@garuda.sics.se> craig@SICS.SE
(Craig Partridge) writes:
+---------------
| There seems to be a lot of misinformation running around.
| The end-to-end performance of a TCP connection is limited by two different
| factors:
|     (1) The window size...
|     (2) The sequence space size...
+---------------

And (at least) one more:

    (3) The underlying IP ID space size.

Item #1 depends on the round-trip time; #2 and #3 do not.

As Chris Johnson noted, you can only send <#_of_distinct_IP_IDs> (64K) times
(64K) bytes per TTL, where the TTL has to be at least large enough to cover
your number_of_hops, and in any case shouldn't be smaller than 15 seconds,
since that's the suggested default reassembly timeout. With TTL=255, that's
16.8 MB/s; for TTL=15, that's 286 MB/s.

Why be concerned about reassembly timeouts? Because to get the data rates
noted above, you have to send max-sized IP packets (64 Kbytes), which implies
fragmentation on almost all current media (except HiPPI). (It also means
TCP MSS = 64K, but that's the least of the worries.) And you don't want later
fragments being confused with earlier ones. If your IP holds onto frags for
a minimum of 15 seconds (see RFC 791, page 27), that puts an effective minimum
on TTL of 15 seconds, at least for the purposes of the rate-limit calculation.

But RFC 1122 says [page 35]:

    A fixed value [of TTL] must be at least big enough for the Internet
    "diameter," i.e., the longest possible path. A reasonable value is
    about twice the diameter, to allow for continued Internet growth.

And further [page 57]:

    There MUST be a reassembly timeout. The reassembly timeout value
    SHOULD be a fixed value, not set from the remaining TTL. It is
    recommended that the value lie between 60 seconds and 120 seconds...

    DISCUSSION:
        The IP specification says that the reassembly timeout should
        be the remaining TTL from the IP header, but this does not
        work well because gateways generally treat TTL as a simple
        hop count rather than an elapsed time. If the reassembly
        timeout is too small, datagrams will be discarded
        unnecessarily, and communication may fail. The timeout needs
        to be at least as large as the typical maximum delay across
        the Internet. A realistic minimum reassembly timeout would
        be 60 seconds.

Using the suggested 60 seconds produces an IP ID re-use rate-limit of
71.6 MB/s; 120 seconds gives 35.8 MB/s.

So the IP ID rate-limit (item #3) is also a serious issue in gigabit/sec
TCP networking.

Some of the ideas in RFC 1185 may be helpful here, but in the presence of
fragmentation, TCP options cannot be recognized in any fragment except the
first. The solution may be to use some form of MTU discovery, then send *all*
TCP segments with the "Don't Frag" bit on in the IP packets (avoiding
reassembly aliasing), *let* the IP IDs wrap as they will, and use the
timestamp mechanisms of RFC 1185 to sort out potential duplicates.


-Rob

-----
Rob Warnock, MS-9U/515        rpw3@sgi.com        rpw3@pei.com
Silicon Graphics, Inc.        (415)335-1673       Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311
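
For anyone who wants to check the arithmetic, the IP ID re-use rate-limit figures quoted in the post (16.8, 286, 71.6, and 35.8 MB/s) can be reproduced with a few lines of Python. This is only a sketch: the function and constant names are made up for illustration, "64K" is taken as the round figure 2^16 for both the ID space and the datagram size (the real maximum datagram is 65,535 bytes), and "MB" is taken as 10^6 bytes, which is what matches the numbers in the post.

```python
# Sketch of the IP ID re-use rate limit discussed in the post.
# All names below are illustrative; none come from an RFC or a library.

IP_ID_SPACE = 2 ** 16         # distinct values of the 16-bit IP ID field
MAX_DATAGRAM_BYTES = 2 ** 16  # assumption: the post's round "64K" per datagram


def ip_id_rate_limit(lifetime_seconds):
    """Bytes per second you can send before an IP ID has to be re-used
    while an earlier datagram with the same ID may still be alive."""
    return IP_ID_SPACE * MAX_DATAGRAM_BYTES / lifetime_seconds


def megabytes_per_second(bytes_per_second):
    """Convert to MB/s, with MB = 10**6 bytes (assumed from the post's figures)."""
    return bytes_per_second / 1e6


if __name__ == "__main__":
    # 255 and 15 are the TTL cases; 60 and 120 are the RFC 1122
    # reassembly timeout bounds quoted in the post.
    for seconds in (255, 15, 60, 120):
        rate = megabytes_per_second(ip_id_rate_limit(seconds))
        print(f"lifetime {seconds:3d} s -> {rate:.1f} MB/s")
```

The only input that matters is the datagram lifetime (TTL or reassembly timeout, whichever governs); the round-trip time does not appear anywhere, which is the point made about item #3.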