Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!emory!att!ucbvax!ANDREW.CMU.EDU!ww0n+
From: ww0n+@ANDREW.CMU.EDU (Walter Lloyd Wimer III)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Message compression
Message-ID:
Date: 5 Apr 91 04:50:55 GMT
References:
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 36

> Excerpts from internet.tcp-ip: 4-Apr-91 Re: Message compression Frank T.
> Solensky@ucsd.e (1666)

> One of the problems with a number of data compression algorithms
> (eg: Lempel-Ziv encoding, the one used by the Unix 'compress' command)
> is that they need to be able to look at the entire data stream before
> being able to compress any part of it. In this case, it would be about
> the same as running compress yourself and then FTPing the resulting file.

From empirical evidence, I don't believe this is true. The 'compress'
program can read from a pipe. I've used this feature to create a
compressed tar file of a directory tree in a single step:

    tar -cf - somedirectory | compress -c > somedirectory.tar.Z

Granted, it could be buffering quite a bit of data in virtual memory,
but I doubt it was buffering the 90 megabytes worth of data from one
particular tar I remember. (My system only has 64 megs of swap
space. . . .)

Recently, there was also an excellent posting to the comp.sources.unix
newsgroup concerning compression techniques. It included a draft of a
paper on the workings of various compression algorithms, including a
newly-invented variant of Lempel-Ziv which the author seems to claim is
free of patent (he calls it "Y-coding"). While I know and understand
very little about compression techniques, a brief reading of the paper
seems to suggest that these compression techniques work quite well even
applied to (relatively small) finite-sized chunks of data. An
implementation of the new algorithm is included in the posting.

Walt Wimer
Network Development
Carnegie Mellon University
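The point above about 'compress' reading from a pipe can be illustrated with a small sketch. What follows is not the actual 'compress' source; it is a minimal textbook LZW encoder in Python, written under the assumption that the dictionary is unbounded and that codes are emitted as plain integers (the real program packs variable-width codes into bits and can reset its dictionary, which is omitted here). The names lzw_encode_stream, lzw_decode, and demo are hypothetical, chosen only for this illustration. It shows that the encoder emits its first code after reading only the first chunk of input, without needing to see the whole stream.

```python
from typing import Iterable, Iterator, List, Tuple


def lzw_encode_stream(chunks: Iterable[bytes]) -> Iterator[int]:
    """Yield LZW codes incrementally while consuming input chunks.

    Illustrative sketch only: unbounded dictionary, integer codes.
    """
    # The dictionary starts with every single-byte string (codes 0..255).
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    for chunk in chunks:
        for value in chunk:
            candidate = current + bytes([value])
            if candidate in table:
                # Keep extending the current match.
                current = candidate
            else:
                # Longest match found: emit its code right away,
                # then learn the new, longer string.
                yield table[current]
                table[candidate] = next_code
                next_code += 1
                current = bytes([value])
    if current:
        # Flush whatever match is pending at end of input.
        yield table[current]


def lzw_decode(codes: Iterable[int]) -> bytes:
    """Rebuild the original bytes from the codes produced above."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    out: List[bytes] = []
    previous = None
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == next_code and previous is not None:
            # Code refers to the string the encoder had just defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        if previous is not None:
            table[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return b"".join(out)


def demo() -> Tuple[int, int, List[int]]:
    """Return (first code, chunks read when it appeared, all codes)."""
    consumed: List[int] = []

    def source() -> Iterator[bytes]:
        # Stand-in for a pipe: four chunks delivered one at a time.
        for i in range(4):
            consumed.append(i)
            yield b"abababab"

    codes = lzw_encode_stream(source())
    first = next(codes)
    chunks_read = len(consumed)  # how much input had been pulled so far
    return first, chunks_read, [first] + list(codes)


if __name__ == "__main__":
    first, chunks_read, all_codes = demo()
    print("first code:", first, "after", chunks_read, "chunk(s)")
    print("codes emitted:", len(all_codes), "for 32 input bytes")
```

The encoder keeps only its dictionary and the current partial match in memory, so its working state grows with the dictionary rather than with the total amount of data already passed through. That is consistent with the observation above that a 90 megabyte tar stream did not need to be buffered in full.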