Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!aplcomm!uunet!mcsun!corton!chorus!nocturne.chorus.fr!jloup
From: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Newsgroups: comp.compression
Subject: Re: Proposed standard: Replies to criticisms.
Keywords: data compression proposed interface standard criticisms replies
Message-ID: <11228@chorus.fr>
Date: 28 Jun 91 12:34:23 GMT
References: <899@spam.ua.oz>
Sender: jloup@chorus.fr
Reply-To: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Organization: Chorus systemes, Saint Quentin en Yvelines, France
Lines: 57

ross=ross@spam.ua.oz.au (Ross Williams)
dan=brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

dan> [ streams versus blocks ]
dan> I think the real argument here is over whether the compressor should
dan> drive the I/O or vice versa. In modem applications, your standard is
dan> fine: the I/O has to drive the compressor. But in most applications it's
dan> much easier to work the other way round.

ross> This may be so, but placing the compressor in charge means that you
ross> have to tell it how to do IO, which is so messy in most programming
ross> languages (requiring Ada like generic superroutines or C pointers to
ross> IO functions) as to completely destroy the standard.

I agree with Dan. When the compressor drives the I/O, it keeps its
internal state when it outputs a block of compressed data and simply
continues with the rest of the input data. If the compressor is forced
to stop whenever the single output buffer is full, it has to save its
internal state explicitly in the MEMORY parameter suggested by Ross.
If the compressor is in the middle of a deeply nested loop at that
point, restoring this state at the next invocation of the compression
routine becomes messy. (The same arguments apply to decompression.)

Even for modems, I don't see why the compressor cannot drive the I/O.
This may require some buffering of the data sent by the modem for
compression, but the OS usually takes care of this. If the compressor
is so slow that it cannot keep up with the input rate on average, you
are in trouble anyway. If we ignore the details of setting up the tty
modem line correctly and determining the end of the input data, a
command such as

    compress < /dev/ttyxxx > foo

should work. In this example, compress drives the I/O.

You don't even need pointers to IO functions. The compression or
decompression algorithm can directly call functions readblock and
writeblock, which would have an interface similar to that of the Unix
read/write calls and would be well specified by the compression
standard. The compressor can easily construct macros (sorry, inline
functions) on top of these functions, similar to the getc/putc macros.
For a modem, readblock would return whatever bytes have already been
received. For many algorithms, this interface is much easier to use
than that proposed by Ross.

Also, Ross justifies his single function for compression and
decompression with an argument about arrays of function pointers:
"this is a situation where programs may wish to keep arrays of these
algorithms". Either the language has pointers to functions or it does
not; you cannot have it both ways.

Although I said above that it is preferable to have the compressor
drive the I/O, this does make it more difficult to try multiple
compression algorithms concurrently. If your language does not give
you concurrent threads or coroutines, the easiest solution is to read
the input file once per algorithm. But this is costly for very high
speed algorithms, which are IO bound. It is also not applicable if the
input data can be read only once (modem or pipe). Dan, do you have a
suggestion for this?

Jean-loup Gailly
jloup@chorus.fr
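
To make the readblock/writeblock idea above concrete, here is a rough
sketch in C. Everything in it is an assumption made for illustration
and not part of any proposed standard: the exact signatures, the
in-memory source and sink that stand in for a file or tty, the helper
names next_byte, put_byte, flush_output and set_input, and the trivial
run-length coder used as a stand-in for a real compressor. The source
hands out at most 3 bytes per call to mimic a modem returning whatever
has been received so far.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source and sink, standing in for a file or tty. */
static const unsigned char *src;
static size_t src_len, src_pos;
static unsigned char sink[256];
static size_t sink_len;

/* Small buffers used by the getc/putc-like layer below. */
static unsigned char inbuf[8], outbuf[8];
static size_t in_pos, in_len, out_len;

/* Like read(2): store up to size bytes in buf and return the count,
 * 0 at end of input. Short reads (at most 3 bytes) mimic a modem. */
static long readblock(unsigned char *buf, size_t size)
{
    size_t n = src_len - src_pos;
    if (n > size) n = size;
    if (n > 3) n = 3;
    if (n > 0) memcpy(buf, src + src_pos, n);
    src_pos += n;
    return (long)n;
}

/* Like write(2): return the number of bytes accepted, or -1 on error. */
static long writeblock(const unsigned char *buf, size_t size)
{
    if (size > sizeof sink - sink_len) return -1;
    memcpy(sink + sink_len, buf, size);
    sink_len += size;
    return (long)size;
}

/* getc-like: return the next input byte, or -1 at end of input. */
static inline int next_byte(void)
{
    if (in_pos == in_len) {
        long n = readblock(inbuf, sizeof inbuf);
        if (n <= 0) return -1;
        in_len = (size_t)n;
        in_pos = 0;
    }
    return inbuf[in_pos++];
}

/* Hand the buffered output to writeblock; return 0 on success. */
static int flush_output(void)
{
    if (writeblock(outbuf, out_len) != (long)out_len) return -1;
    out_len = 0;
    return 0;
}

/* putc-like: buffer one output byte, flushing first if the buffer is full. */
static inline int put_byte(int c)
{
    if (out_len == sizeof outbuf && flush_output() < 0) return -1;
    assert(out_len < sizeof outbuf);
    outbuf[out_len++] = (unsigned char)c;
    return 0;
}

/* Point the hypothetical source at a buffer and reset all I/O state. */
void set_input(const unsigned char *data, size_t len)
{
    src = data; src_len = len; src_pos = 0;
    in_pos = in_len = out_len = sink_len = 0;
}

/* Stand-in compressor: emits (run length, byte) pairs. It drives the I/O,
 * so c, d and run stay in local variables across every read and write. */
int rle_compress(void)
{
    int c = next_byte();
    while (c >= 0) {
        int run = 1, d;
        while ((d = next_byte()) == c && run < 255) run++;
        if (put_byte(run) < 0 || put_byte(c) < 0) return -1;
        c = d;   /* first byte of the next run, or -1 at end of input */
    }
    return flush_output();
}
```

Note that rle_compress never has to save or restore anything: a short
read or a full output buffer is handled inside next_byte and put_byte,
and the nested loops carry on where they were. A real implementation
would presumably back readblock and writeblock with read and write on
file descriptors.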