Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site petrus.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!gamma!epsilon!zeta!sabre!petrus!karn
From: karn@petrus.UUCP (Phil R. Karn)
Newsgroups: net.dcom
Subject: Re: Re: Re: Standards for commercial pac
Message-ID: <513@petrus.UUCP>
Date: Thu, 29-Aug-85 20:06:42 EDT
Article-I.D.: petrus.513
Posted: Thu Aug 29 20:06:42 1985
Date-Received: Sat, 31-Aug-85 06:24:44 EDT
References: <678@wdl1.UUCP>
Organization: Bell Communications Research, Inc
Lines: 145

> Datagram systems have some serious problems.  Here are a few of them.
>
> 1. In a pure datagram system, with no link-level retransmission, the
>    probability of successfully forwarding a packet through N nodes
>    declines exponentially with the number of nodes.  Ham users of
>    digipeaters and UNIX users of async links for IP datagrams are
>    painfully aware of this phenomenon.  You really do need link-level
>    retransmission in any sizable datagram system, unless the medium
>    has very low error rates.

Portions of a datagram network are free to use link level
retransmission whenever they consider it necessary.  However, one of
the beauties of datagram networks is that they don't HAVE to use link
level retransmission where it doesn't make any sense (Ethernets, DDS
lines with 1 in 10^9 error rates, etc).  Packet radio (amateur or
otherwise) is one of the few places where link level acknowledgments
really do make sense.

> 2. Congestion is a serious problem in datagram systems.  No really good
>    general solutions are known.  I've solved some problems associated with
>    some of the simpler cases (IP/TCP via Ethernet to slow link gateways)
>    but a general solution is still elusive.  There are tough theoretical
>    problems here; there may be a way to organize an arbitrarily large
>    datagram network, but it hasn't been discovered yet.  Telephony
>    has been around long enough that we know how to build very large
>    virtual circuit networks.

Congestion is a serious problem in any network that depends on
well-behaved user statistics, be it virtual circuit, datagram or simple
circuit switched.  I could make a virtual circuit network go into
"congestion collapse" just like an IP network, assuming that I have a
transport protocol in each case.  I merely set the
reset-and-re-establish VC timer in my transport protocol to be short
enough that it frequently clears and re-establishes the underlying VC,
preferably ten or twenty times for each successful packet delivery.
Considering that most VC networks assume virtual circuit setups to be
rare events (some even have central nodes performing all circuit setup
and teardown operations) I think this could cause a lot of havoc.

Telephony has been around a long time, but we still don't know how to
build a large circuit switched network (virtual or otherwise) that
isn't susceptible to congestion collapse.  Just see the notes on
net.ham-radio about what happened to phone service in Tucson AZ during
the recent TAPR TNC sale.

The only guaranteed way out is to have enough network resources for the
absolute worst possible case.  In most long haul networks this is
clearly out of the question, so you just try to deal with it as best as
you can.

> 3. Datagram networks tend to break down when fully loaded; this is a
>    consequence of (2) above.  There are ways around this, but they
>    involve running the system in a derated mode, where keeping all
>    links as busy as possible is not attempted.  The ARPANET technology
>    really works only because the ARPANET has substantially more link
>    bandwidth than it needs for its traffic volume; this is
>    a well known problem.  TELENET started out with ARPANET technology
>    but has since gone to virtual circuits internally to get better link
>    utilization.  In any case, the IMP system of the ARPANET is not
>    a true datagram system internally, although it exports a datagram
>    interface.

I don't understand this comment.
It is at least possible to re-route excess traffic around a congested
area when datagrams are used.  If you have N virtual circuits
established through a given fixed route, there's not much you can do if
all N users decide to send simultaneously, overloading the links along
the route.  Of course, you could statically allocate link bandwidth and
buffer space for each virtual circuit, something that is difficult to
do in a datagram network.  However, this defeats the whole point of
packet switching, namely the statistical sharing of resources.  If you
really want to guarantee throughput once a connection is established,
build a pure circuit-switched network; if you want the guaranteed
ability to establish a connection at any time, put in a leased line.

TELENET went to VCs internally for two reasons: a) they only had to
provide a virtual circuit service, X.25; b) the bulk of their traffic
consists of single character packets from people typing on dumb
terminals.  In this case the larger datagram headers were the deciding
factor.

> 4. Datagram systems have some serious vulnerabilities.  One bad guy can
>    hog the network and clog up the links.  Datagram systems tend to
>    rely on hosts being well-behaved.  With virtual circuits, the network
>    has a positive throttle over host traffic generation, and can keep
>    bad hosts from interfering with other traffic.  In networks with
>    no central administrative authority over hosts, this is a serious
>    problem in practice.  The ARPANET/MILNET gateways are already
>    under serious strain because of this exact problem.  Tight standards
>    and anti-bad-guy queuing algorithms in nodes can solve this
>    problem; unfortunately the Internet lacks both.

Not unlike virtual circuit networks.  The IP "source quench" is a
protocol; unfortunately many hosts refuse to adhere to it.  I could
also refuse to adhere to X.25 and send traffic outside of my
agreed-upon window, for example, or I could (and do!)
establish additional virtual circuits to my destination to circumvent
the much-touted per-VC network flow control ability.  The only answer
in either case is to cut off hosts that don't play by the rules, but
this is an implementation problem, not a problem with the protocols.

> 5. Accounting is difficult in datagram systems.  What should a phone
>    bill for a datagram net look like?  Histograms of traffic by
>    time and destination?  Just a total amount?  The network may need
>    to recognize clumps of packets for similar destinations and treat
>    them as a ``call'' for billing purposes.
>

PDNs already charge for both connect time and for packets sent.  (I've
sometimes suggested, only half in jest, that the real reason they don't
like datagram services is that they'd no longer be able to charge for
connect time.)  Since most datagram traffic would continue to be
"clustered" to a small set of destinations, I don't see any problem
with billing by per-destination packet counts in the local switch.
TELENET punts the issue anyway, since their charges are
distance-independent.

I guess we don't really disagree on what needs to be done to make
datagram networks like the Internet behave well under loads as they
grow.  My suggestions are as follows:

1. Implement mechanisms to "punish" hosts that misbehave by ignoring
   ICMP source quench messages.

2. Make sure that each packet switch has more than enough buffer memory
   to handle all but extremely unusual peak traffic bursts.  The older
   IMPs and IP gateways are probably the major offenders in this
   regard.  I suspect that memory-starved IP gateways account for the
   vast majority of dropped datagrams (ignoring causes such as
   unreachable destinations, of course.)  Regardless of the protocol,
   the laws of queuing theory still apply.  If you use an internal flow
   control mechanism to avoid dropping packets in a memory-starved
   packet switch, you won't be able to utilize your outgoing link as
   efficiently.
   The larger the queue on your outgoing link, the closer you'll be
   able to approach 100% utilization.

3. Use link level acknowledgements only on those paths (radio, dialup
   modems) that are unreliable enough to justify them.  Better yet, do
   something to improve the raw error rate on the links.  Get rid of
   link acknowledgments on all other paths to improve link efficiency.

4. Once the above steps are taken, the dropped packet rate should fall
   to a very low value.  Once this happens, it should be possible to
   convince TCP implementers to lengthen their retransmission timers
   significantly to avoid congestion collapse when round trip delays
   jump because of sudden load.  If you can send a datagram with a very
   high degree of confidence that it'll get there (eventually), people
   won't be tempted to use such trigger-happy retransmission timers.

Phil Karn
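
As a rough illustration of the queuing argument in suggestion 2 above
(more buffer on the outgoing link lets utilization get closer to 100%),
the sketch below treats one outgoing link as an M/M/1/K queue.  That
model is an assumption made here for illustration and does not come
from the article: Poisson arrivals, exponentially distributed
transmission times, and room for K packets in total, counting the one
being sent.  The function names are hypothetical.

```python
def _check(rho: float, k: int) -> None:
    """Validate the offered load and the buffer capacity."""
    if rho <= 0:
        raise ValueError("rho (arrival rate / link rate) must be positive")
    if k < 1:
        raise ValueError("k must hold at least the packet being sent")


def mm1k_drop_probability(rho: float, k: int) -> float:
    """Probability that an arriving packet finds all k slots full."""
    _check(rho, k)
    if abs(rho - 1.0) < 1e-12:
        # At rho == 1 all k+1 occupancy states are equally likely.
        return 1.0 / (k + 1)
    return (1.0 - rho) * rho**k / (1.0 - rho ** (k + 1))


def mm1k_utilization(rho: float, k: int) -> float:
    """Fraction of time the outgoing link is busy (1 - P[queue empty])."""
    _check(rho, k)
    if abs(rho - 1.0) < 1e-12:
        p_empty = 1.0 / (k + 1)
    else:
        p_empty = (1.0 - rho) / (1.0 - rho ** (k + 1))
    return 1.0 - p_empty


if __name__ == "__main__":
    # Offered load equal to link capacity: the assumed worst "busy" case.
    for k in (1, 2, 5, 10, 50, 100):
        print(f"K={k:4d}  utilization={mm1k_utilization(1.0, k):.3f}  "
              f"drop={mm1k_drop_probability(1.0, k):.3f}")
```

With the offered load equal to the link rate (rho = 1), utilization in
this model works out to K/(K+1), so it climbs toward 100% as the buffer
grows but never reaches it, while the drop probability falls as
1/(K+1).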