Path: utzoo!attcan!uunet!unisoft!hoptoad!cfcl!dwh
From: dwh@cfcl.UUCP (Dave Hamaker)
Newsgroups: comp.protocols.misc
Subject: Re: About Protocols for File Transfer
Summary: Yes, a protocol transfer can be as fast as non-protocol
Message-ID: <304@cfcl.UUCP>
Date: 31 May 88 05:40:57 GMT
References: <303@cfcl.UUCP> <698@lakesys.UUCP> <692@ncrcce.StPaul.NCR.COM> <9295@eddie.MIT.EDU> <8WbMLYy00Vs8EzltB4@andrew.cmu.edu>
Reply-To: dwh@cfcl.UUCP (Dave Hamaker)
Organization: Canta Forda Computer Laboratory, Pacifica, CA
Lines: 110

In article <303@cfcl.UUCP> I asked the trick question:
>
>Using ordinary asynchronous RS-232 full-duplex serial communications, is an
>error-detecting/correcting file-transfer protocol possible which is as fast
>as or faster than non-protocol transfer?  If not, why not?  If so, how?

In article <698@lakesys.UUCP> mikep@lakesys.UUCP (Mike Pluta) suggests the
use of MNP modems and the like.  This is a pretty good answer.  MNP modems
can, in fact, outperform non-protocol transfer.  However, they do this by
using synchronous communications, saving enough in start/stop bits to more
than offset the added protocol overhead.  This violates the stipulation in my
question for asynchronous communications.  To restore a raw speed advantage
to non-protocol transfer, just allow it the use of synchronous communications.
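
As a rough illustration of that trade-off, here is a small Python sketch.
The 2400 bps line rate, the 8N1 framing (one start bit, eight data bits, one
stop bit), and the 10% protocol overhead are assumptions picked for the
example, not figures taken from MNP; the function names are made up.

```python
# Assumed framing: async sends 10 line bits per data byte (8N1),
# sync sends 8 line bits per data byte.
ASYNC_BITS_PER_BYTE = 10
SYNC_BITS_PER_BYTE = 8


def async_bytes_per_sec(bps):
    """Raw async throughput: every byte carries a start and a stop bit."""
    return bps / ASYNC_BITS_PER_BYTE


def sync_bytes_per_sec(bps, overhead):
    """Sync throughput after giving up `overhead` (0..1) of the link to protocol."""
    return bps / SYNC_BITS_PER_BYTE * (1.0 - overhead)


def break_even_overhead():
    """Protocol overhead at which sync and async deliver the same data rate."""
    return 1.0 - SYNC_BITS_PER_BYTE / ASYNC_BITS_PER_BYTE


if __name__ == "__main__":
    bps = 2400        # assumed line rate
    overhead = 0.10   # assumed protocol overhead, purely illustrative
    print("async, no protocol:", async_bytes_per_sec(bps), "bytes/sec")
    print("sync, with protocol:", sync_bytes_per_sec(bps, overhead), "bytes/sec")
    print("break-even overhead:", break_even_overhead())
```

Under these assumptions the synchronous link comes out ahead as long as the
protocol uses up less than about one fifth of the link.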

In article <692@ncrcce.StPaul.NCR.COM> cavanaug@ncrcce.StPaul.NCR.COM (John
David Cavanaugh) writes that non-protocol transfers are slower when there are
errors because it takes so much time to send the whole thing over again.  This
was also the message of two people who sent me email.  It's a good point, but
it does not answer the question I was trying to ask.

In article <9295@eddie.MIT.EDU> jbs@fenchurch.MIT.EDU (Jeff Siegal) answers
no, because additional error-checking/correcting information must accompany
the data for error-detection/correction to be possible.  Another person who
sent me email came to the same conclusion.  This is the answer I expected.
Jeff Siegal deserves credit for putting it forward; there was a hint I had
something up my sleeve.

In article <8WbMLYy00Vs8EzltB4@andrew.cmu.edu> jk3k+@ANDREW.CMU.EDU (Joe
Keane) answers yes, because a windowed protocol can overlap returning ACKs
with forward transmission.  Maybe he interpreted "as fast as" to mean
sending characters at the same rate.  He does not mention the need for
additional checking information, and thus provides the receiver no way to
determine the correctness of the data.  Yet the receiver is supposed to send
back "block n OK" messages.

Protocol transfers and non-protocol transfers have no speed advantage over
each other in terms of the raw data to be sent.  Since the non-protocol
transfer sends nothing else, protocol transfers cannot be faster.  If it
is true that additional information must be communicated for checking to
be possible, and it is, how can protocol transfers avoid being slower?
THE CHECKING INFORMATION CAN TAKE THE FORM OF FEEDBACK; IT CAN TRAVEL IN
THE OTHER DIRECTION (full duplex, remember?)!  This by itself will not
quite reach "as fast as" performance.  Some "attention" signal is needed
which is distinct from data.  Asynchronous RS-232 even provides this, in
the form of the "break."

An example protocol should make this clear:

The data is sent non-stop.  It is divided, conceptually, into packets of
some specific size.  Packets are numbered consecutively modulo 256.  A
16-bit CRC is calculated for each packet, prefixed by its sequence number
(the sequence number byte is pumped into the CRC generator before the
first data byte).  As each packet is received, four bytes of check
information are returned in the opposite direction (overlapping the receipt of
the next packet).  The first byte of the check sequence is something
specific (like SOH) for framing.  The second byte is the sequence number
of the associated packet.  The third and fourth bytes are the CRC.
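
The check information described above might be computed along these lines.
This is only a sketch: no particular 16-bit CRC is specified here, so
Python's binascii.crc_hqx (the CRC-CCITT polynomial 0x1021, starting from
zero) stands in for it, and sending the CRC high byte first is likewise an
assumption.  The function names are hypothetical.

```python
import binascii

SOH = 0x01  # framing byte for the returned check record ("something like SOH")


def packet_crc(seq, data):
    """CRC over the sequence-number byte followed by the packet's data bytes."""
    crc = binascii.crc_hqx(bytes([seq & 0xFF]), 0)  # sequence byte goes in first
    return binascii.crc_hqx(data, crc)              # then the data


def check_record(seq, data):
    """Four bytes the receiver returns: SOH, sequence number, CRC high, CRC low."""
    crc = packet_crc(seq, data)
    return bytes([SOH, seq & 0xFF, crc >> 8, crc & 0xFF])


def sender_accepts(record, seq, data):
    """Sender side: compare the returned record with one computed locally."""
    return record == check_record(seq, data)


if __name__ == "__main__":
    sent = b"some packet data"
    received = b"some packet dbta"  # one corrupted byte on the line
    print(sender_accepts(check_record(5, sent), 5, sent))      # clean packet
    print(sender_accepts(check_record(5, received), 5, sent))  # damaged packet
```

Note that nothing extra travels with the data itself; the only added bytes
are the four that come back on the reverse channel.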

A break signal causes the receiver to return the check information for the
current short packet (if any), return its own break signal, and react to
the next input character.  It will either be another break, signaling the
end of the transmission; or a sequence-number byte followed by two CRC
bytes on the sequence number, followed by retransmitted data (at this
point the receiver is back to normal).  If the CRC of the sequence number
received is wrong, the receiver sends back a break and ignores data until
another break is received.  A break is also the receiver's final response
to the break which marks the end of the transmission.
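
The receiver's side of this break handshake could be sketched roughly as
follows, picking up just after the receiver has returned its own break.  The
BREAK marker, the event list, the function names, and the return values are
all inventions for the example.  The CRC is again binascii.crc_hqx as a
stand-in, and treating a break that lands inside the three header bytes as a
bad header is a simplifying assumption.

```python
import binascii

BREAK = object()  # hypothetical stand-in for the line's break condition


def resync_header(seq):
    """Sequence-number byte followed by two CRC bytes computed on that byte."""
    crc = binascii.crc_hqx(bytes([seq & 0xFF]), 0)
    return bytes([seq & 0xFF, crc >> 8, crc & 0xFF])


def react_after_break(events):
    """Consume BREAK markers / byte values; return (outcome, seq, replies).

    outcome is "end" (transfer finished), "resume" (retransmitted data
    follows, starting at packet `seq`), or "stalled" (input ran out while
    waiting for another break).  `replies` lists what the receiver sends back.
    """
    it = iter(events)
    replies = []
    while True:
        first = next(it)
        if first is BREAK:
            replies.append(BREAK)      # final response to the end-of-transfer break
            return "end", None, replies
        header = [first, next(it), next(it)]
        if BREAK not in header and bytes(header) == resync_header(first):
            return "resume", first, replies
        replies.append(BREAK)          # header CRC wrong: complain with a break
        for ev in it:                  # ...and ignore data until another break
            if ev is BREAK:
                break
        else:
            return "stalled", None, replies
        replies.append(BREAK)          # answer the new break, then start over


if __name__ == "__main__":
    print(react_after_break([BREAK])[0])
    print(react_after_break(list(resync_header(7)) + [65, 66])[:2])
```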

A "transmission window" is also needed.  This extends some number of
packets beyond the first packet which is waiting for check information to
be returned. The window is advanced as check information is verified.  The
sender may not send data outside the window and the receiver may deduce
from the highest packet received which earlier packets may be presumed
correct (and written to disk, etc.).  There is no framing information with
the data, and the receiver should act as if the window is a little larger
than it actually is.  This prevents things like phantom characters
caused by line noise from advancing the receiver's view of the window to
the point where the receiver would be misled to accept an unverified packet.
The window needs to be large enough to smooth out the effect of transmission
delays.
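
A minimal sketch of the window bookkeeping, under some assumptions of my
choosing: the window covers `size` packets starting at the oldest unverified
one, check records come back in order, and the receiver's safety margin
defaults to one packet.  The class and function names are hypothetical, and
the receiver's side uses plain packet counts rather than modulo-256 numbers
to keep the arithmetic obvious.

```python
MOD = 256  # packets are numbered consecutively modulo 256


class SendWindow:
    """Sender-side window: which packet may go out, and when the window moves."""

    def __init__(self, size):
        if not 0 < size < MOD:
            raise ValueError("window size must be between 1 and 255")
        self.size = size
        self.base = 0      # oldest packet still waiting for its check record
        self.next_seq = 0  # next packet to transmit

    def may_send(self):
        """True while the next packet still falls inside the window."""
        return (self.next_seq - self.base) % MOD < self.size

    def sent(self):
        """Record that the next packet went out; return its sequence number."""
        seq = self.next_seq
        self.next_seq = (seq + 1) % MOD
        return seq

    def verified(self, seq):
        """Advance the window if `seq` is the packet we were waiting on."""
        if seq != self.base:
            return False
        self.base = (self.base + 1) % MOD
        return True


def packets_safe_to_commit(packets_received, window, margin=1):
    """How many leading packets the receiver may presume verified.

    The sender cannot have started packet n until packet n - window was
    verified; the margin makes the receiver act as if the window were larger.
    """
    return max(0, packets_received - (window + margin))


if __name__ == "__main__":
    w = SendWindow(3)
    while w.may_send():
        print("sending packet", w.sent())
    print("window full; verified(0):", w.verified(0), "may_send:", w.may_send())
    print("safe to write:", packets_safe_to_commit(10, window=3))
```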

I'm sure everyone will agree that this protocol meets the "as fast as"
requirement, although it is possible to quibble.  A non-protocol transfer
doesn't need the two break signals at the end.  For that matter, the sender
has to wait for all outstanding check data before sending the final break
(as was pointed out in email from a person who knew more or less what I was
up to, from prior contact with my ideas).  But, then, how DO you tell when
a non-protocol transfer is finished?  How long does THAT process take?

The example protocol makes a point, but I wouldn't propose it as a practical
protocol.  Facilities for handling break signals are often deficient, for
one thing.  There's no flow control, no timeout for lost data, and so forth.
Why bring the subject up then?  Well, I had two reasons:

    1. I think it's good to be reminded that looking at problems in
       unconventional ways can result in solutions to seemingly
       impossible problems.
    2. I wanted to give some publicity to my design for a file-transfer
       protocol which takes advantage of these ideas, but which is
       intended for real world use.  I'll email the specification
       document to anyone who asks (I'm reluctant to post it: it's
       about fifty pages long, would require a multi-part posting,
       and I'm not sure there is a good place to post something like
       it).  Ideally, I'm hoping to hook up with anyone out there in
       netland who might be interested in prototyping the design.

-Dave Hamaker
...!ucbvax!ucsfcgl!hoptoad!cfcl!dwh