Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!ucsd!ucbvax!BBN.COM!jzinky
From: jzinky@BBN.COM ("John A. Zinky")
Newsgroups: comp.protocols.tcp-ip
Subject: Traffic Sensitive SPF Routing is NOT too hard!
Message-ID: <8911302038.AA19457@ucbvax.Berkeley.EDU>
Date: 30 Nov 89 17:33:13 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 95

I would like to give my opinion on traffic sensitive routing from the
experience of running large over-subscribed networks, such as the 1987-88
ARPANET. (More details can be found in SIGCOMM '89 and MILCOM '89 articles by
Khanna and Zinky.)

Summary: 
GIVEN THE GLACIAL PROCUREMENT TIME FOR NEW EQUIPMENT, 
TRAFFIC SENSITIVE ROUTING IS MANDATORY FOR LARGE PACKET-SWITCHED NETWORKS.
Overall, traffic sensitive SPF routing is NOT HARD to implement once you
have the flooding mechanisms to handle failed equipment. Tuning the
algorithm is tricky, but it is less effort than fighting the fires
associated with over-subscribed lines.


In the good old days the ARPANET (pre-1985) had a peak-hour average
line utilization of less than 15% and everyone was happy. (Much like
the current NSFNET.) But due to budgeting decisions beyond the
control of techies, the 1987 ARPANET had an average line utilization
greater than 30% with some lines reaching 80%.  Also, traffic was
growing at a rate of 30% per year and no new bandwidth was scheduled.
It's nuts to run a packet network above 15% utilization unless you have heavy
controls in terms of dynamic routing and congestion control. We spent
a lot of effort coming up with reasonable schemes which are now
deployed in several large networks.

If you look at Capacity Management in terms of time-scale and
function, you can see that network design, routing, and congestion
control complement each other. Network design plans ahead and matches
expected traffic to available resources. It works on the time scale of
equipment procurement (months to years). Routing is an allocation
policy that maps traffic flows onto available resources. It works on
the time scale of network operations (minutes to days).  Congestion
control regulates user traffic so that resources are not
oversubscribed. It works on the time scale of a few round trip times.

Routing's goal is to make the best allocation of bandwidth given the
current user traffic. When the network is under-subscribed, minhop
routing is fine and all you have to worry about is equipment failure.
For over-subscribed networks, it is tempting to use routing to do some
load balancing.

The interpretation of the old delay-based SPF routing given in the above
papers shows that:

1) Traffic sensitive routing is possible in a network of 220+ nodes with
peak-hour average line utilizations in the 15%-40% range. 

2) Due to shifting traffic patterns and long lead-time for procuring
bandwidth, there WILL BE lines that are > 80% utilized during the entire
peak hour.  Traffic sensitive routing can shed some of this load to other
trunks. (The traffic flows that have only slightly longer alternate paths
will be shed first from the line, while the traffic with much longer alternate
paths will remain on the line.) The specific algorithm used in the
ARPANET/MILNET can handle individual over-subscribed lines with 100%-300%
offered minhop load.  (The extra traffic is moved to under-subscribed
lines).

3) Routing decisions on a time scale of 10 seconds are good enough
for load balancing aggregate traffic loads, but congestion control is still
needed to handle individual fluctuations.
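
The shedding order in point 2 can be illustrated with a toy example. This
is only a sketch: the topology, node names, and flows are made up for
illustration and are not the ARPANET, and plain Dijkstra stands in for the
SPF computation. The link A-B is the over-subscribed line. The flow P->Q
has an alternate path one hop longer than its path over A-B, while the flow
A->B has an alternate path three hops longer.

```python
import heapq

def shortest_dist(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): weight}."""
    adj = {}
    for (a, b), w in links.items():
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def flows_on_link(links, link, weight, flows):
    """Return the flows that still prefer `link` when it reports `weight`.

    A flow stays on the link only if its best path using the link is
    strictly shorter than its best path avoiding the link.
    """
    weighted = dict(links)
    weighted[link] = weight
    without = {k: v for k, v in links.items() if k != link}
    return [f for f in flows
            if shortest_dist(weighted, *f) < shortest_dist(without, *f)]

# Hypothetical topology: every link costs 1 hop unless overridden.
LINKS = {pair: 1.0 for pair in [
    ("A", "B"),                                   # the over-subscribed line
    ("A", "C"), ("C", "D"), ("D", "E"), ("E", "B"),  # 4-hop detour around A-B
    ("P", "A"), ("B", "Q"),                       # P->Q over A-B is 3 hops
    ("P", "X"), ("X", "Y"), ("Y", "Z"), ("Z", "Q"),  # P->Q alternate is 4 hops
]}
FLOWS = [("P", "Q"), ("A", "B")]

for w in (1.0, 2.5, 4.5):
    print(w, flows_on_link(LINKS, ("A", "B"), w, FLOWS))
```

As the reported weight of A-B rises, P->Q leaves the line first, A->B
hangs on because its detour is much longer, and once the weight passes the
length of the longest detour nothing is left on the line.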


RECOMMENDATIONS

1) If you can afford to run your network with peak resource
utilizations of less than 15%, then forget about traffic sensitive
routing; minhop is good enough.

2) If you plan to use any form of SPF routing on an over-subscribed network:

A) Allow for a "link" weight that has a granularity of at least 1/10
of a hop. In the ARPANET, 25% of the traffic on a link has an
alternate path with the same number of hops. For example, suppose that
all the links report a weight of 1 and one link reports 1+epsilon;
then 25% of that link's traffic will be shed. It is important to spread
out this abrupt change in traffic allocation by having a finer
granularity in link metric and making sure that links are not all
reporting exactly the same value.

B) The range of the link weight should not go above 4 hops. For the ARPANET
topology and traffic pattern, if a line reports a weight of more than 4
hops, most/all of the traffic will have an alternate path that will avoid
the line, i.e. no traffic will flow over the link!

C) If you cannot change the link weight automatically, at least let it be a
configurable parameter. But be prepared: it will be a full-time job tuning
these link weights.

D) With some effort, you can use the link weights to handle heterogeneous
link characteristics.
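
Recommendations A and B together amount to a bounded, fine-grained link
weight. A minimal sketch of such a weight follows. The linear ramp from
utilization to weight is an assumption made purely for illustration; it is
NOT the metric used in the ARPANET/MILNET, and the function and constant
names are hypothetical. Only the two constraints come from the text above:
steps of 1/10 of a hop, and a ceiling of 4 hops.

```python
MIN_TENTHS = 10   # 1.0 hop: an idle link costs one hop
MAX_TENTHS = 40   # 4.0 hops: ceiling from recommendation B

def link_weight(utilization):
    """Map utilization (0.0 = idle, 1.0 = full) to a link weight in hops.

    Hypothetical linear ramp, computed in integer tenths of a hop so the
    reported value always has 1/10-hop granularity (recommendation A),
    then clamped to the range 1.0 to 4.0 hops (recommendation B).
    """
    tenths = round(MIN_TENTHS + (MAX_TENTHS - MIN_TENTHS) * utilization)
    tenths = max(MIN_TENTHS, min(MAX_TENTHS, tenths))
    return tenths / 10

for u in (0.0, 0.5, 1.0, 3.0):
    print(u, link_weight(u))
```

Clamping at 4 hops keeps a badly over-subscribed line from pricing itself
out of the network entirely, and the 1/10-hop steps let neighboring links
report slightly different values so that traffic with equal-hop alternates
does not all move at once.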



zinky
"Be careful, it's a real world out there"

