Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!tut.cis.ohio-state.edu!ucbvax!GATEWAY.MITRE.ORG!barns
From: barns@GATEWAY.MITRE.ORG (Bill Barns)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: partial transfer recovery in RFC and OSI protocols
Message-ID: <8912210357.AA06303@gateway.mitre.org>
Date: 21 Dec 89 03:57:04 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 66

I disagree, because I don't believe that the byte number in the transfer
stream is, in every imaginable case, sufficient information to determine
how to join the data sent during the restarted transfer with the data
sent the first time.

There would be no problem if the bytes in the transfer stream were
literally stored in the file.  This is the case in image transfers
between 8-bit-byte machines, so Rick's method should be implementable
for that case on any type of system.  There is not much of a problem
either if the bytes are stored according to some transformation of bit
sequences which can be reliably inverted.  This
is pretty much true of ASCII transfers between UNIX systems and also
many others, although I'm not so sure it is strictly true if the file
being transferred contained a "naked LF" in the part that made it the
first time.  I defer to people who know the code better than I, but
this is my impression.  Suppose a client on a non-UNIX system does a
STOR to a UNIX server of a file containing a naked LF, and the session
dies somewhere after the naked LF is stored but before the end of the
file.  When the client tries to restart later, it must use the SIZE
command to get the value to put into the REST command.  The server
cannot tell the naked LF from LFs that were created out of CR LF
sequences, so it will return a size one higher than the actual byte
count received over the data connection. (?)
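
To make the suspected off-by-one concrete, here is a small Python
sketch.  The function names are mine, and the SIZE calculation is an
assumption about how a UNIX server might estimate the ASCII-mode size
(count every stored LF as CR LF); it is not taken from any actual
server code.

```python
def netascii_to_unix(wire: bytes) -> bytes:
    """Receiving side: store each CR LF pair as one LF, pass other bytes through."""
    return wire.replace(b"\r\n", b"\n")


def ascii_size_estimate(stored: bytes) -> int:
    """Assumed SIZE calculation: treat every stored LF as CR LF on the wire."""
    return len(stored) + stored.count(b"\n")


# Bytes that made it over the data connection before the session died.
# The LF after "na" is a naked LF; the other two line ends are CR LF.
wire = b"line one\r\nna\nked\r\npartial"
stored = netascii_to_unix(wire)

# The estimate exceeds the true stream byte count by one per naked LF.
print(len(wire), ascii_size_estimate(stored))
```

If the client uses the estimate as its REST value, it skips one byte of
the stream that the server never actually received.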

Besides non-invertibility problems, I suspect the existence of
situations where the state of the receiving FTP's data transformation
state machine cannot be recreated for points in mid-file in a new
session.  With image mode I think this cannot be a problem, but for
other modes it is possible that transformations such as the end-of-line
transforms used by various systems may result in the server having
state information not represented on disk.  Probably in most cases, the
state information can be synthesized at least for some points in the
file.  If so, then fudging the answer to the SIZE command (if the file
was being STORed on the server) or backing up the REST value based on
scanning the local file (if it was being RETRieved to the client) would
enable this method to work, provided you can identify some such safe
point in the partial file.
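
As an illustration of "backing up to a safe point" on the retrieving
side, here is a minimal Python sketch.  It assumes a UNIX-style client
that stored each CR LF from the stream as a single LF, and it assumes
the partial file contains no naked LFs; the function name and the
choice of "just after the last LF" as the safe point are mine.

```python
from typing import Tuple


def safe_restart_point(local: bytes) -> Tuple[int, int]:
    """Return (local_length_to_keep, stream_offset_for_REST).

    Just after the last stored LF, the end-of-line state machine is
    idle, so nothing is pending that the disk fails to show (such as a
    CR that was received but not yet followed by LF).
    """
    keep = local.rfind(b"\n") + 1          # 0 when there is no LF at all
    prefix = local[:keep]
    # Assumption: every LF kept on disk was CR LF in the transfer stream.
    stream_offset = keep + prefix.count(b"\n")
    return keep, stream_offset


# Partial file on disk; the receiver may or may not have been holding a CR.
keep, rest = safe_restart_point(b"abc\ndef")
print(keep, rest)
```

The client would truncate its local file to the first value and send
the second in the REST command, re-fetching the incomplete last line.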

A pragmatic concern for an implementor is to understand the system's
behavior when it crashes while a file is being stored.  If the file's
recorded byte count can be left out of sync with the data actually
written, a restart might give bad results.  If the data is always made
non-volatile before the byte count is updated, this will not be a
problem.  This sounds like something an OS "ought to do right", but
some don't.  (They
can also be helped to screw up by hyper-clever disk latency
optimizations or misbegotten network file systems that handle caching
in some way that might reorder these writes.)  I know of no way to
avoid all such problems, but it is probably easier to hack around known
misbehavior with the explicit restart marker method than with implicit
markers.  For example, a server might delay sending its 110 replies by
some interval and then return the byte count in the marker.  This
knowledge would then be sitting on the client end where a server crash
could not clobber it.  For a client crash while retrieving, I suppose
that the client just has to restart at some earlier point than the
filesystem claims it needs to.  This should work equally well (badly)
with either restart scheme.  For really strong assurance of integrity,
you would probably need to run checksums over the files at both ends.
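
A rough Python sketch of the two ideas above: only report a marker
value after the data has been pushed toward stable storage, and compare
checksums of the finished files.  The function names are hypothetical,
SHA-256 is my choice of checksum, and os.fsync() is only as trustworthy
as the OS and filesystem underneath it, which is exactly the caveat
above.

```python
import hashlib
import os


def append_then_mark(data_path: str, chunk: bytes, offset_so_far: int) -> int:
    """Append chunk, ask the OS to make it durable, then return the new
    byte count that a restart marker could safely carry."""
    with open(data_path, "ab") as f:
        f.write(chunk)
        f.flush()                 # push Python's buffer to the OS
        os.fsync(f.fileno())      # ask the OS to push it to the device
    return offset_so_far + len(chunk)


def file_checksum(path: str) -> str:
    """Checksum of the whole file, to be compared across both ends."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()
```

The marker value is computed only after fsync() returns, so a marker
already sent to the client never claims more than was handed to the
device.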

I hope no one will construe this discussion as some sort of "disproof"
of Rick Adams's approach; it isn't one.  It's meant to be an
illustration that the method in the RFCs has relative advantages in
some situations, as Rick's has in many others.  Neither one seems to be
perfect or dominant in every way; either we haven't gotten smart enough
yet to do that, or the problem has no such solution.

/Bill Barns