Xref: utzoo comp.binaries.ibm.pc.d:1359 comp.sys.ibm.pc:21311 Path: utzoo!attcan!uunet!yale!spolsky-joel From: spolsky-joel@CS.YALE.EDU (Joel Spolsky) Newsgroups: comp.binaries.ibm.pc.d,comp.sys.ibm.pc Subject: Re: WARNING! Vicious bug in GSARC Keywords: GSARC, Archivers, Compression, Bugs Message-ID: <43332@yale-celray.yale.UUCP> Date: 17 Nov 88 03:53:58 GMT References: <6627@watcgl.waterloo.edu> <42285@yale-celray.yale.UUCP> <6634@watcgl.waterloo.edu> <1413@neoucom.UUCP> Sender: root@yale.UUCP Reply-To: spolsky-joel@CS.YALE.EDU (Joel Spolsky) Organization: Yale University Computer Science Dept, New Haven CT 06520-2158 Lines: 35 In article <1413@neoucom.UUCP> wtm@neoucom.UUCP (Bill Mayhew) writes: | | I don't know what GSARC uses for its coding scheme. If it uses | huffman coding to compress ASCII files, the output can be bigger | than the input on a short file. Don't be absurd. Huffman encoding went out with pet rocks :-) | Huffman coding uses a variable | number of bits to encode characters based upon the frequency with | which characters appear. Characters that appear frequently are | encoded with short bit patters, which infrequently encountered | characters are longer. A table is prepended to such a file so that | the decoding algorithm knows which is which. OK, you get an A+ in CS100 :-) | On a short file where all characters appear with about the same | frequency, huffman coding is inefficient. You are also penalized | by the fact that the lookup table takes some space. Well, I guess that's why nobody uses Huffman encoding :-) | Unix compress, for instance, uses huffman coding. False. Unix compress uses Ziv-Lempel-Welch encoding, which achieves much higher compression rates than (snicker) Huffman encoding. ARC also uses ZLW. GSARC uses a modified form of ZLW with variable length codes, which makes GSARC perform better on very short files. +----------------+----------------------------------------------------------+ | Joel Spolsky | bitnet: spolsky@yalecs.bitnet uucp: ...!yale!spolsky | | | internet: spolsky@cs.yale.edu voicenet: 203-436-1483 | +----------------+----------------------------------------------------------+ #include