Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site decwrl.DEC.COM
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!dec-rhea!dec-koning!koning
From: koning@koning.DEC (Paul Koning -- LAS Engineering)
Newsgroups: net.ham-radio.packet
Subject: Re: Link layer thoughts
Message-ID: <1655@decwrl.DEC.COM>
Date: Thu, 13-Mar-86 12:33:06 EST
Article-I.D.: decwrl.1655
Posted: Thu Mar 13 12:33:06 1986
Date-Received: Sat, 15-Mar-86 01:46:51 EST
Sender: daemon@decwrl.DEC.COM
Organization: Digital Equipment Corporation
Lines: 111

On 23-Dec-1985, Phil Karn (KA9Q) posted a paper about ACK acknowledgement
as a way to improve performance on noisy channels.
 
After thinking about the details for a while, I feel that the analysis
is inaccurate and that there is in fact little benefit from the proposal;
it seems that the ALOHAnet people didn't think things through all the way.
Here's why:
 
The basic notion is that one should not retransmit Data messages if it
is not necessary to do so.  The main reason is that data messages are
large and it is better to transmit small control messages instead.
There are two cases where the sender of a data message fails to receive
an ACK:
 
1. The data message was lost.  In this case, the receiver never sent
   the ACK, and the optimal recovery is to resend the data.
 
2. The data message was received but the ACK was lost.  In this case,
   the optimal recovery avoids resending the data.
 
It's pretty obvious (and Phil shows the details) that detecting the
second case as a special case is worthwhile IF you assume that the
two cases have equal probability.  Phil also points out that they in
fact do not have equal probability but doesn't follow this through.
 
Actually, the probability of a packet being in error is approximately
proportional to the packet size, since the BIT error rate is constant.
Ignoring digipeaters, an ACK is 19 bytes long and a full-size data
packet is 276 bytes long.  This means the probability of case 1 is
more than ten times that of case 2 (276/19 is about 14.5).  So in more
than 9 out of 10 cases where the sender fails to receive an ACK, the
recovery as specified in AX.25 is in fact the optimal one.
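
To put numbers on this, here is a small sketch in Python.  The bit
error rate (1e-4) is an assumed value chosen only for illustration, and
the function names are made up for this example; the 19 and 276 byte
sizes are the ones given above.  It assumes independent bit errors.

```python
def frame_loss_prob(n_bytes, ber):
    """Probability that at least one bit of an n_bytes frame is hit,
    assuming independent bit errors at a constant rate ber."""
    return 1.0 - (1.0 - ber) ** (8 * n_bytes)

def missing_ack_split(data_bytes, ack_bytes, ber):
    """Return (P(case 1), P(case 2)) for one data/ACK exchange."""
    p_data = frame_loss_prob(data_bytes, ber)
    p_ack = frame_loss_prob(ack_bytes, ber)
    case1 = p_data                    # data lost, so no ACK was ever sent
    case2 = (1.0 - p_data) * p_ack    # data arrived, ACK lost on the way back
    return case1, case2

BER = 1e-4   # assumed for illustration, not a measured figure
case1, case2 = missing_ack_split(276, 19, BER)
ratio = case1 / case2
share_case1 = case1 / (case1 + case2)
print("case 1 / case 2 = %.1f" % ratio)
print("missing ACKs caused by lost data = %.1f%%" % (100 * share_case1))
```

For small bit error rates the ratio approaches 276/19, which is the
proportionality argument made above.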
 
This is the case least favorable to ACK acking, of course.  If you have
digipeaters (which are a declining feature) or shorter data packets,
ACKs are not as small compared to the data packets, and the lost-ACK
case makes up a larger fraction, but rarely would it be even close to
50% of the total.
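
The same estimate can be tabulated for shorter packets and for
digipeater paths.  This Python sketch takes loss probability as simply
proportional to packet length.  It also makes two assumptions that are
not stated above: that a data packet carries 20 bytes of framing around
its data field (so 256 data bytes give the 276 quoted earlier), and that
each digipeater adds a 7-byte address to both packet types.  All names
are made up for the example.

```python
ACK_BYTES = 19        # from the text, no digipeaters
DATA_OVERHEAD = 20    # assumed: 276 total minus a 256-byte data field
DIGI_ADDR = 7         # assumed: address bytes added per digipeater

def lost_ack_share(payload, digis=0):
    """Approximate fraction of missing ACKs that are case 2 (ACK lost),
    taking loss probability as proportional to packet length."""
    ack = ACK_BYTES + DIGI_ADDR * digis
    data = DATA_OVERHEAD + payload + DIGI_ADDR * digis
    return ack / (ack + data)

for payload in (256, 128, 32, 1):
    for digis in (0, 2, 8):
        share = lost_ack_share(payload, digis)
        print("payload %3d, %d digipeaters: %4.1f%% lost-ACK"
              % (payload, digis, 100 * share))
```

Under these assumptions the lost-ACK share stays below 50% in every
row, because a data packet is never shorter than an ACK.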
 
So in short, I'm not convinced this is a case worth worrying about.
But if it is, there is a better alternative than ACK acking.
Here's why:
 
ACK acking requires both the sender and receiver to do timeouts.
The receiver has to keep a timeout >= the round trip delay -- this
is used to generate retransmitted ACKs.  The sender has to use a
timeout which is at least two or three times the receiver's timeout,
since otherwise the sender would retransmit the data before the receiver
gets a chance to retransmit the ACK.
 
The consequence of having the sender use this extra long timer is that
in the lost data packet case (which is the majority case, as discussed
above) the sender waits far longer than optimal before retransmitting.
 
Another problem is that there are two timers, one at each station,
whose values are tied together; best performance requires that one
be somehow kept at 2 or 3 times the other for as long as the
connection is up.
 
An alternative approach is the one used in DDCMP.  Here, the receiver
does not have to keep a timer.  When the sender fails to receive
an ACK for its data packet, rather than resending the data packet it
sends a special control packet (REP, for "Reply requested").  When
the receiver gets the REP, it responds by resending the latest ACK.
The sender then can decide which data packets (if any) have to be
retransmitted.
 
This alternative is better because it does not require the sender
to keep an artificially long timeout.  The sender can use the optimal
timeout (slightly greater than the round trip delay).  Also, it is
simpler since the receiver does not need to maintain a timer of
its own.
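
The REP exchange can be sketched as follows.  This is a toy model in
Python, not DDCMP itself: the class and method names are hypothetical,
packet loss is injected by hand rather than with real timers, and
sequence number wraparound is ignored.

```python
class Receiver:
    """Accepts in-order data; keeps no timer of its own."""
    def __init__(self):
        self.last_ok = 0          # number of the last packet accepted in order
        self.delivered = []

    def on_data(self, seq, payload):
        if seq == self.last_ok + 1:
            self.last_ok = seq
            self.delivered.append(payload)
        return self.last_ok       # the ACK names the last packet accepted

    def on_rep(self):
        return self.last_ok       # REP: just repeat the latest ACK


class Sender:
    def __init__(self):
        self.next_seq = 1
        self.unacked = {}         # seq -> payload still awaiting an ACK

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return seq, payload

    def on_ack(self, acked):
        """Drop everything acknowledged; return what must be resent."""
        for seq in [s for s in self.unacked if s <= acked]:
            del self.unacked[seq]
        return sorted(self.unacked.items())


rx, tx = Receiver(), Sender()

# Data arrives but its ACK is lost.  The sender times out and sends REP.
seq, data = tx.send("first")
rx.on_data(seq, data)               # the ACK returned here is "lost"
resend_a = tx.on_ack(rx.on_rep())   # short REP goes out instead of the data

# The data packet itself is lost.  The REP reply reveals the gap.
seq, data = tx.send("second")       # never reaches the receiver
resend_b = tx.on_ack(rx.on_rep())

print(resend_a)   # what to resend after the lost ACK
print(resend_b)   # what to resend after the lost data packet
```

In the lost-ACK case the only extra traffic is the short REP and the
repeated ACK; the data packet goes out again only when the reply shows
that it never arrived.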
 
On a slightly different subject...
 
Near the end of his message, Phil describes an alternative protocol
that does not use sequence numbers.  I agree that the sliding window
protocol as in AX.25 is excessively complex (indeed AX.25 is far
too complex in lots of places and yet has some serious omissions, but
that's for another note).  But you can't build a correct protocol
without sequence numbers if you use retransmission for error
recovery.  As a minimum, you need a one-bit sequence number,
as is done in BISYNC.  This is necessary because otherwise you can't
distinguish these two cases:
 
1. A sends data packet 1, B acks it, A receives ack, sends data packet 2
 
2. A sends data packet 1, B acks it, ack is lost, A times out and
   retransmits data packet 1.
 
Note that ACK acking doesn't eliminate the problem since case 2 still
applies if 2 or 3 acks in a row are lost and A eventually times out
anyway.
 
Phil appears to suggest that you can tell the two apart by having the
receiver keep a copy of the last received data packet and comparing
this with the next packet.  Apart from the fact that this is costly
(having to do a string comparison on each packet), it doesn't work,
because it is perfectly legal to send two consecutive packets with
the same data fields.
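
A minimal sketch of the receiver side shows the difference.  Both
functions and the packet representation are hypothetical, written in
Python for illustration; the one-bit scheme is the usual alternating
bit idea, not a transcription of BISYNC.

```python
def receive_by_compare(packets):
    """Receiver that treats a packet equal to the previous one as a
    retransmission and discards it."""
    delivered, last = [], None
    for payload in packets:
        if payload != last:
            delivered.append(payload)
        last = payload
    return delivered

def receive_by_seq_bit(packets):
    """Receiver that accepts a packet only if its 1-bit sequence number
    is the one expected, then flips the expectation."""
    delivered, expected = [], 0
    for bit, payload in packets:
        if bit == expected:
            delivered.append(payload)
            expected ^= 1
    return delivered

# Two distinct packets that happen to carry the same data.
by_compare = receive_by_compare(["X", "X"])

# One packet retransmitted after a lost ACK, then a new packet with
# the same data.
by_seq_bit = receive_by_seq_bit([(0, "X"), (0, "X"), (1, "X")])

print(len(by_compare), "delivered by comparison")
print(len(by_seq_bit), "delivered with a sequence bit")
```

The comparing receiver wrongly drops the second legitimate packet,
while the sequence bit lets the receiver discard the retransmission and
still deliver both real packets.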
 
So minimally you need 1 bit in each direction for a sequence number.
AX.25 uses 3 bits, which really makes no difference; the amount of
state to be maintained in AX.25 per connection is no different
(give or take a byte or two) from that required for the "bare bones"
protocol.  The reason to argue against AX.25 is not the amount of
state info required but rather the unnecessary complexity and the
errors in the protocol.
 
73,
	paul koning, pa0pkg/w1