Xref: utzoo comp.sources.d:2164 comp.binaries.ibm.pc.d:303
Path: utzoo!attcan!uunet!husc6!bloom-beacon!tut.cis.ohio-state.edu!mailrus!um-math!hyc
From: hyc@math.lsa.umich.edu (Howard Chu)
Newsgroups: comp.sources.d,comp.binaries.ibm.pc.d
Subject: Re: compressing compressed stuff
Message-ID: <336@clio.math.lsa.umich.edu>
Date: 24 May 88 20:33:21 GMT
References: <292@cullsj.UUCP> <696@fig.bbn.com> <4744@teddy.UUCP> <5198@umn-cs.cs.umn.edu>
Sender: usenet@math.lsa.umich.edu
Reply-To: hyc@math.lsa.umich.edu (Howard Chu)
Organization: University of Michigan Math Dept., Ann Arbor
Lines: 48
UUCP-Path: {mailrus,umix}!um-math!hyc

Just thought I'd play around a bit and see what all this meant...

The following summarizes a few minutes of messing around with uuencode,
compress, and compact on a Sun 3/260. While I'm only testing a single file,
I'm sure it makes a pretty convincing worst-case test...

For reference, compress uses a 16-bit Lempel-Ziv-Welch compression scheme,
and compact uses an optimized Huffman Squeeze algorithm (which doesn't store
the decoding tree in the compacted file). This can almost be directly related
to the ARC program, with the exception that ARC performs run-length encoding
on input data before feeding it to any of the other compression algorithms.
(PKARC doesn't do this, by the way.)

582556 May 24 15:28 vmunix            plain binary file
472774 May 24 15:36 vmunix.C          compacted (Huffman squeeze)
661987 May 24 15:46 vmunix.C.uue      compacted, then uuencoded
571778 May 24 15:46 vmunix.C.uue.Z    compacted, uuencoded, compressed
365675 May 24 15:28 vmunix.Z          compressed (16 bit)
358631 May 24 15:39 vmunix.Z.C        compressed, then compacted
502186 May 24 16:01 vmunix.Z.C.uue    compressed, compacted, uuencoded
449395 May 24 16:01 vmunix.Z.C.uue.Z  compressed, compacted, uuencoded, compressed
512047 May 24 15:30 vmunix.Z.uue      uuencoded after compression
445229 May 24 15:30 vmunix.Z.uue.Z    compressed, uuencoded, compressed again
815678 May 24 15:28 vmunix.uue        uuencoded, no compression
462100 May 24 15:31 vmunix.uue.Z      compressed after uuencoding
460239 May 24 15:43 vmunix.uue.Z.C    uuencoded, compressed, then compacted

A few things worth noting:

- while the results aren't always dramatic (and they certainly aren't in
  this case: 365675 down to 358631 bytes), a Huffman Squeeze will always
  reduce the size of a file already compressed by some form of Lempel-Ziv
  compression.
- compressing, then uuencoding, is obviously better than just uuencoding
  (512047 vs. 815678 bytes here).
- since Lempel-Ziv compression typically yields 50% compression, and
  uuencoding gives about 33% expansion, the result will still be smaller
  than the original file.
- if your news software also tries to perform compression, it's still a
  good idea to compress, then uuencode. Compare:
      445229 May 24 15:30 vmunix.Z.uue.Z
      462100 May 24 15:31 vmunix.uue.Z
- there is no vmunix.Z.Z or vmunix.C.Z in the above list. Immediately
  recompressing a compressed file is always a bad idea.

Your mileage will vary....
--
  /
 /_ , ,_.                      Howard Chu
/ /(_/(__                University of Michigan
    /           Computing Center          College of LS&A
   '              Unix Project          Information Systems
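
For anyone who wants the listing above as ratios rather than raw byte counts, here is a minimal sketch in Python. The sizes are copied straight from the listing; the names SIZES, ratio, and report are made up for illustration and are not part of compress, compact, or uuencode. Worth noting: on this file the measured uuencode expansion comes out nearer 40% than 33%, since the 33% figure covers only the 3-bytes-to-4-characters encoding and not the per-line overhead.

```python
# Byte counts copied from the directory listing in the post.
SIZES = {
    "vmunix": 582556,
    "vmunix.C": 472774,
    "vmunix.C.uue": 661987,
    "vmunix.C.uue.Z": 571778,
    "vmunix.Z": 365675,
    "vmunix.Z.C": 358631,
    "vmunix.Z.C.uue": 502186,
    "vmunix.Z.C.uue.Z": 449395,
    "vmunix.Z.uue": 512047,
    "vmunix.Z.uue.Z": 445229,
    "vmunix.uue": 815678,
    "vmunix.uue.Z": 462100,
    "vmunix.uue.Z.C": 460239,
}


def ratio(name, base="vmunix"):
    """Return the size of `name` as a fraction of the size of `base`."""
    return SIZES[name] / SIZES[base]


def report():
    """Return one formatted line per file, smallest file first."""
    rows = sorted(SIZES.items(), key=lambda item: item[1])
    return [f"{size:>7}  {ratio(name):7.1%}  {name}" for name, size in rows]


if __name__ == "__main__":
    # Size of every variant relative to the plain binary.
    print("\n".join(report()))
    # Expansion caused by uuencode alone, measured two ways.
    print(f"uuencode of plain file:      x{ratio('vmunix.uue'):.3f}")
    print(f"uuencode of compressed file: x{ratio('vmunix.Z.uue', 'vmunix.Z'):.3f}")
```

Running it prints each variant with its size as a percentage of the plain vmunix, which makes the comparisons in the notes above easier to eyeball.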