Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!ucbvax!GATEWAY.MITRE.ORG!barns
From: barns@GATEWAY.MITRE.ORG
Newsgroups: comp.protocols.tcp-ip
Subject: Re:  partial transfer recovery in RFC and OSI protocols
Message-ID: <8912181942.AA10029@arcturus.mitre.org>
Date: 18 Dec 89 19:42:08 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 59

You raise several distinct issues, only some of which I can respond to.

Restart capability is defined for the FTP protocol (RFC 959) as an
optional feature and has been so since at least RFC 542, 12 August
1973, which is as far back as my collection goes.  This design works
between divergent system types.  The written specs had some unclear
areas which I hope have been fixed by text in RFC 1123, section
4.1.3.4.  Implementations are few and far between, but it was hoped
that the clarified, corrected and expanded writeup in RFC 1123 would
encourage people to take this on.  It is not very hard to do.
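
As a rough illustration of why it is not very hard, here is a minimal
sketch of the block-mode framing from RFC 959, section 3.4.2, which is
what carries the restart markers.  The header layout (one descriptor
byte, then a 16-bit byte count) and the descriptor codes are from the
RFC.  The function names are hypothetical, and using a decimal byte
offset as the marker text is only an assumption for the example: the
RFC leaves the marker contents up to the sender, as long as they are
printable characters.

```python
import struct

# Descriptor codes from RFC 959, section 3.4.2 (block mode).
DESC_EOR = 128      # end of data block is end of record
DESC_EOF = 64       # end of data block is end of file
DESC_SUSPECT = 32   # suspected errors in data block
DESC_RESTART = 16   # data block is a restart marker


def data_block(payload: bytes, descriptor: int = 0) -> bytes:
    """Frame one block: 1-byte descriptor, 2-byte count, then the data."""
    if len(payload) > 0xFFFF:
        raise ValueError("block payload exceeds the 16-bit count field")
    return struct.pack("!BH", descriptor, len(payload)) + payload


def restart_marker(marker: str) -> bytes:
    """Frame a restart marker; the marker text must be printable."""
    # Assumption for this sketch: the marker is plain ASCII text chosen
    # by the sender (here it will be a decimal byte offset).
    return data_block(marker.encode("ascii"), DESC_RESTART)


def parse_blocks(stream: bytes):
    """Yield (descriptor, payload) pairs from a buffer of whole blocks."""
    pos = 0
    while pos < len(stream):
        descriptor, count = struct.unpack_from("!BH", stream, pos)
        pos += 3
        yield descriptor, stream[pos:pos + count]
        pos += count


# Sender side: five bytes of data, a marker, then an empty EOF block.
wire = data_block(b"hello") + restart_marker("5") + data_block(b"", DESC_EOF)
blocks = list(parse_blocks(wire))
```

After a failure, the user side would send REST with the last marker it
was told about on the control connection, followed by the original
transfer command, and the sender resumes from whatever position that
marker means to it.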

There is a different FTP restart scheme by Rick Adams which I have
heard will be in 4.4BSD(?).  It is not compatible with the one defined in
the RFCs, though in principle they can coexist in a single
implementation.  It fits better with a typical UNIX I/O architecture
and thus perhaps gives better throughput and almost certainly is more
CPU-efficient on a number of real world platforms, but it does not try
to handle the more difficult cases of operation between divergent
systems.
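
For contrast, a sketch of the kind of restart that fits a UNIX
byte-stream file naturally.  The details of the Adams scheme are not
given here, so this assumes only that the restart point is a plain
byte offset into the file, which is the obvious fit for seek-style
I/O.  The function name and the chunk size are hypothetical.  Note that
a bare byte offset only makes sense when both ends agree on what a byte
of the file is, which is exactly what divergent systems cannot promise.

```python
import io


def resume_copy(src, dst, offset, chunk=8192):
    """Copy src to dst starting at byte `offset`; return bytes copied.

    Both arguments are seekable binary file objects.  Anything dst holds
    past the restart point is discarded before copying resumes.
    """
    src.seek(offset)
    dst.seek(offset)
    dst.truncate()          # truncates at the current position
    copied = 0
    while True:
        buf = src.read(chunk)
        if not buf:
            break
        dst.write(buf)
        copied += len(buf)
    return copied


# An interrupted transfer left the first four bytes at the receiver.
source = io.BytesIO(b"0123456789")
partial = io.BytesIO(b"0123")
sent = resume_copy(source, partial, offset=4)
```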

The throughput issue regarding the two methods is architecture-dependent
and involves both software and hardware design issues.  It seems that
making the RFC version work using the normal interface routines one
finds on a UNIX box probably means either sending more packets or doing
more data copying.  In some cases it might be possible to just add
library routines to eliminate this, if the hardware has a suitable
design (e.g., scatter/gather DMA).  Also, in some systems, some of the
feared data copying may be happening already anyway.  On some machine
architectures including IBM Big Iron (but there are others too), a
relatively small amount of code together with some smart choices of
block sizes might allow you to do the RFC-style approach with
infinitesimal impact on CPU utilization or network throughput.  The
upshot of all this seems to be that neither version of Restart
maximizes interoperability, portability, and (CPU) efficiency
simultaneously.  I think OSI is trapped in the same solution space.
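
The copying-versus-packets tradeoff can be sketched at the socket
level.  The example assumes a POSIX system where Python's
socket.sendmsg() is available, and the two function names are
hypothetical.  The gather version hands the 3-byte block header and the
payload to the kernel as separate buffers in one call, so user code
never builds a combined copy.  Whether the kernel or the hardware then
avoids further copying is platform-dependent and is not something this
sketch can show.

```python
import socket
import struct


def send_block_copy(sock, payload, descriptor=0):
    """Naive version: concatenating copies the whole payload in user space."""
    header = struct.pack("!BH", descriptor, len(payload))
    sock.sendall(header + payload)


def send_block_gather(sock, payload, descriptor=0):
    """Gather version: header and payload go out in one call, no concatenation."""
    header = struct.pack("!BH", descriptor, len(payload))
    # sendmsg() may send fewer bytes than requested; real code must loop
    # on the return value.  This sketch just returns it.
    return sock.sendmsg([header, payload])


# Loopback demonstration over a connected socket pair.
a, b = socket.socketpair()
n = send_block_gather(a, memoryview(b"abcdef"))
received = b""
while len(received) < n:
    received += b.recv(64)
a.close()
b.close()
```

Sending the header and the payload with two separate send() calls is
the third option, and is where the "more packets" risk comes from.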

In either version, the Restart mechanism is basically provided for
recovery from cataclysmic disruptions (disk full, network died, host
died, impatient user blasted the client program into oblivion) and NOT
to deal with bit corruption (noisy links).  Both TCP/IP and OSI hold
that lower layers should do most of the work of protecting against the
latter problem.  I don't find this unreasonable, even on slow serial
point-to-point links.  Data link protocols should be chosen to fit the
error characteristics of the links, and TCP and TP4 can cope with some
residual glitches.  This leaves only the problem of recovery from
higher-layer aspects of service interruptions as the proper domain of
"session" recovery schemes.

Regarding the life and death of TCP/IP family protocols and their
enhancement, I agree that they aren't dead yet and can't be ignored.
However, I suggest that there is only one pool of expertise available
for working on generic application domain problems in either TCP/IP or
OSI.  Seems to me that most of the experts with strong feelings are
spending most of their time in the OSI arena.  There are evidently not
enough people to go around, and those that exist evidently see OSI as a
better investment of their time right now (I presume on the theory of
broader impact).

Bill Barns