Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!aplcomm!uunet!mcsun!corton!chorus!nocturne.chorus.fr!jloup
From: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Newsgroups: comp.compression
Subject: Re: Proposed standard: Replies to criticisms.
Keywords: data compression proposed interface standard criticisms replies
Message-ID: <11228@chorus.fr>
Date: 28 Jun 91 12:34:23 GMT
References: <899@spam.ua.oz>
Sender: jloup@chorus.fr
Reply-To: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Organization: Chorus systemes, Saint Quentin en Yvelines, France
Lines: 57

ross=ross@spam.ua.oz.au (Ross Williams)
dan=brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

dan> [ streams versus blocks ]
dan> I think the real argument here is over whether the compressor should
dan> drive the I/O or vice versa. In modem applications, your standard is
dan> fine: the I/O has to drive the compressor. But in most applications it's
dan> much easier to work the other way round.

ross> This may be so, but placing the compressor in charge means that you
ross> have to tell it how to do IO, which is so messy in most programming
ross> languages (requiring Ada-like generic superroutines or C pointers to
ross> IO functions) as to completely destroy the standard.

I agree with Dan. When the compressor drives the I/O, it keeps its internal
state when it outputs a block of compressed data and continues with the rest of
the input data. If the compressor is forced to stop when the single output
buffer is full, it has to save its internal state explicitly in the MEMORY
parameter suggested by Ross. If the compressor is in the middle of a deeply
nested loop, it becomes messy to restore this state at the next invocation of
the compression routine. (The same arguments apply to decompression.)
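
To make the state-saving problem concrete, here is a minimal C sketch. It is
not Ross's proposed interface: a toy run-length encoder stands in for a real
compressor, and the names rle_state, rle_step and rle_drive are hypothetical.
The point is only that a routine which must return whenever its output buffer
fills has to record, in a caller-held structure, exactly where it stopped
(here, how much of a half-written pair is still owed).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a MEMORY parameter: everything the encoder
   must remember between two invocations. */
struct rle_state {
    int prev;               /* byte of the current run, -1 if none */
    unsigned count;         /* length of the current run */
    int pending;            /* bytes of an interrupted pair still owed (0..2) */
    unsigned char pair[2];  /* the (count, byte) pair being written */
};

/* Encode in[0..in_len) into out[0..out_cap) as (count, byte) pairs.
   Returns the number of bytes written and stores the number of input
   bytes consumed in *in_used. Stops early when out is full. */
static size_t rle_step(struct rle_state *s,
                       const unsigned char *in, size_t in_len, size_t *in_used,
                       unsigned char *out, size_t out_cap, int finish)
{
    size_t i = 0, o = 0;
    for (;;) {
        /* Resume (or continue) a pair cut short by a full output buffer. */
        while (s->pending > 0) {
            if (o == out_cap)
                goto done;
            out[o++] = s->pair[2 - s->pending];
            s->pending--;
        }
        if (i == in_len) {
            if (finish && s->count > 0) {   /* flush the final run */
                s->pair[0] = (unsigned char)s->count;
                s->pair[1] = (unsigned char)s->prev;
                s->pending = 2;
                s->count = 0;
                s->prev = -1;
                continue;
            }
            break;
        }
        int c = in[i++];
        if (c == s->prev && s->count < 255) {
            s->count++;
        } else {
            if (s->count > 0) {             /* queue the run that just ended */
                s->pair[0] = (unsigned char)s->count;
                s->pair[1] = (unsigned char)s->prev;
                s->pending = 2;
            }
            s->prev = c;
            s->count = 1;
        }
    }
done:
    *in_used = i;
    return o;
}

/* Caller-driven loop: call rle_step repeatedly with a small output buffer
   and collect the pieces in dst. Returns the total number of bytes. */
static size_t rle_drive(const unsigned char *in, size_t in_len,
                        unsigned char *dst, size_t out_cap)
{
    struct rle_state s = { -1, 0, 0, { 0, 0 } };
    unsigned char out[16];
    size_t pos = 0, total = 0, used, n;

    assert(out_cap > 0 && out_cap <= sizeof out);
    do {
        n = rle_step(&s, in + pos, in_len - pos, &used, out, out_cap, 1);
        memcpy(dst + total, out, n);
        total += n;
        pos += used;
    } while (n == out_cap);   /* a full buffer means there may be more */
    return total;
}
```

Even for an algorithm this small, the pending and pair fields exist only so
that the routine can be re-entered in the middle of an output operation. A
compressor with several nested loops would need correspondingly more
bookkeeping.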

Even for modems, I don't see why the compressor cannot drive the I/O. This may
require some buffering of the incoming modem data that is to be compressed, but
the OS usually takes care of this. If the compressor is so slow that it cannot
keep up with the input rate on average, you are in trouble anyway.  If we ignore
the details about setting the tty modem line correctly and determining the end
of the input data, a command such as

  compress < /dev/ttyxxx > foo

should work. In this example, compress drives the I/O.

You don't even need pointers to IO functions. The compression/decompression
algorithm can directly call functions readblock/writeblock which would have an
interface similar to that of the Unix read/write calls and would be well
specified by the compression standard. The compressor can easily construct
macros (sorry, inline functions) on top of these functions, similar to the
getc/putc macros.  For a modem, readblock would return whatever bytes have
already been received. For many algorithms, this interface is much easier to
use than that proposed by Ross.
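
As an illustration only, the following C sketch shows what such an interface
might look like. The exact signatures of readblock/writeblock are my
assumption (simplified to return a byte count, 0 at end of input), the
in-memory source and sink are hypothetical stand-ins for a file or modem, and
GETBYTE/PUTBYTE, rle_compress and run_rle are made-up names. A toy run-length
encoder plays the role of the compressor, which drives the I/O itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source and sink, so the sketch is self-contained.
   A real readblock/writeblock would sit on top of read(2)/write(2). */
static const unsigned char *src;
static size_t src_len, src_pos;
static unsigned char sink[64];
static size_t sink_len;

/* Copy up to n bytes of input into buf; return the count, 0 at end of input. */
static size_t readblock(unsigned char *buf, size_t n)
{
    size_t avail = src_len - src_pos;
    if (n > avail) n = avail;
    memcpy(buf, src + src_pos, n);
    src_pos += n;
    return n;
}

/* Append n bytes to the sink; return the number of bytes accepted. */
static size_t writeblock(const unsigned char *buf, size_t n)
{
    size_t room = sizeof sink - sink_len;
    if (n > room) n = room;
    memcpy(sink + sink_len, buf, n);
    sink_len += n;
    return n;
}

/* Small buffers with getc/putc-style macros layered on the block calls. */
#define BUFSZ 4
static unsigned char inbuf[BUFSZ], outbuf[BUFSZ];
static size_t in_pos, in_end, out_pos;

static int fill_inbuf(void)
{
    in_end = readblock(inbuf, BUFSZ);
    in_pos = 0;
    return in_end == 0 ? -1 : (int)inbuf[in_pos++];   /* -1 means end of input */
}

static void flush_outbuf(void)
{
    writeblock(outbuf, out_pos);
    out_pos = 0;
}

#define GETBYTE()  (in_pos < in_end ? (int)inbuf[in_pos++] : fill_inbuf())
#define PUTBYTE(c) do { if (out_pos == BUFSZ) flush_outbuf(); \
                        outbuf[out_pos++] = (unsigned char)(c); } while (0)

/* Toy compressor: emits (count, byte) pairs. All of its state lives in
   ordinary local variables because it never has to return mid-stream. */
static void rle_compress(void)
{
    int c, prev = -1;
    unsigned count = 0;

    while ((c = GETBYTE()) != -1) {
        if (c == prev && count < 255) {
            count++;
        } else {
            if (count > 0) { PUTBYTE(count); PUTBYTE(prev); }
            prev = c;
            count = 1;
        }
    }
    if (count > 0) { PUTBYTE(count); PUTBYTE(prev); }
    flush_outbuf();
}

/* Reset the hypothetical source/sink, run the compressor, return output size. */
static size_t run_rle(const unsigned char *data, size_t len)
{
    assert(2 * len <= sizeof sink);   /* worst case: every byte becomes a pair */
    src = data; src_len = len; src_pos = 0;
    sink_len = 0;
    in_pos = in_end = out_pos = 0;
    rle_compress();
    return sink_len;
}
```

For example, run_rle on the 7 bytes "aaabccd" leaves the pairs (3,'a'),
(1,'b'), (2,'c'), (1,'d') in sink. For a modem, only readblock would change:
it would return whatever bytes have arrived so far instead of copying from
memory.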

Also, Ross justifies his use of a single function for both compression and
decompression by appealing to arrays of function pointers: "this is a situation
where programs may wish to keep arrays of these algorithms". Either you do or
you do not have pointers to functions, but you cannot have it both ways.

Although I said above that it is preferable to have the compressor drive the
I/O, this does make it more difficult to try multiple compression algorithms
concurrently. If your language does not give you concurrent threads or
coroutines, the easiest solution is to read the input file once per algorithm.
But this is costly for very high speed algorithms which are IO bound. Also,
this is not applicable if the input data must be read only once (modem or
pipe). Dan, do you have a suggestion for this?

Jean-loup Gailly
jloup@chorus.fr