Newsgroups: comp.compression
Path: utzoo!utgpu!watserv1!watmath!looking!brad
From: brad@looking.on.ca (Brad Templeton)
Subject: Re: Modeling vs encoding (Re: Lempel-Ziv v/s huffman encoding)
Organization: Looking Glass Software Ltd.
Date: Tue, 02 Apr 91 05:50:23 GMT
Message-ID: <1991Apr02.055023.27834@looking.on.ca>
References: <1991Mar31.000653.4598@zorch.SF-Bay.ORG> <12546@pt.cs.cmu.edu> <1991Apr01.053616.3665@looking.on.ca> <1991Apr1.191751.4211@nntp-server.caltech.edu>

If PKZIP does do that, I am at a loss to explain why it sends the full tree in the file (unless it was really forward looking, which by and large it was not). More to the point, why does it build a complete temp file for the input and do a full two-pass compress? If you heard it from Katz himself, I believe it, but it seems odd that it does the full two passes.

My two-pass compressor has a mode where it will scan the first N bytes of the file and build tables on those, doing a one-pass compress from then on. (You need such a mode to compress arbitrarily sized stdin -- or you can use the "retrain huffer" mode, which outputs new tables every N bytes, for this.) This semi-one-pass mode is only about one or two percent worse on a typical large file, which is not bad. On large files that vary in data composition, such as general tar files, the retraining method can give the best compression.

BTW, I am looking for a good name for my compressor. Current name is "ISP" for the "Incredible Shrinking Program", but I think it sounds a bit wimpy, and too much like LISP. Other choices:

    ZAP - but it may be too close to ZIP
    Scrunch/Scrinch
    YACS - (Yet another compression synonym)
    Pressor - the name of its inner subroutine

What is your pick?
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
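
For anyone who wants to play with the three modes mentioned in the post (full two-pass, tables from the first N bytes, and new tables every N bytes), here is a rough Python sketch. It is not the poster's code. All function names are made up for illustration, it only counts output bits instead of writing a bitstream, it ignores the cost of sending the tables, and it assumes every byte value gets a count of at least one so that bytes missing from the training sample can still be coded.

```python
import heapq
from collections import Counter


def code_lengths(sample: bytes) -> list[int]:
    """Huffman code length for each of the 256 byte values, trained on `sample`."""
    counts = Counter(sample)
    # Assumption (not from the post): add 1 to every count so that bytes
    # absent from the training sample still receive a code.
    # Heap entries are (weight, tiebreak, symbols); the tiebreak is the
    # smallest symbol in the group, so it is unique and lists are never compared.
    heap = [(counts.get(s, 0) + 1, s, [s]) for s in range(256)]
    heapq.heapify(heap)
    lengths = [0] * 256
    while len(heap) > 1:
        w1, t1, syms1 = heapq.heappop(heap)
        w2, t2, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node ends up one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, min(t1, t2), syms1 + syms2))
    return lengths


def encoded_bits(data: bytes, lengths: list[int]) -> int:
    """Size of `data` in bits when coded with the given code lengths."""
    return sum(lengths[b] for b in data)


def two_pass_bits(data: bytes) -> int:
    """Full two-pass: build tables from the whole input, then encode it."""
    return encoded_bits(data, code_lengths(data))


def semi_one_pass_bits(data: bytes, n: int) -> int:
    """Build tables from the first n bytes only, then encode everything."""
    return encoded_bits(data, code_lengths(data[:n]))


def retrain_bits(data: bytes, n: int) -> int:
    """Build fresh tables for every block of n bytes (table cost ignored)."""
    total = 0
    for i in range(0, len(data), n):
        block = data[i:i + n]
        total += encoded_bits(block, code_lengths(block))
    return total


if __name__ == "__main__":
    # Input whose composition changes halfway through, as in a mixed tar file.
    data = b"a" * 4000 + b"z" * 4000
    print("two-pass     :", two_pass_bits(data))
    print("semi-one-pass:", semi_one_pass_bits(data, 1000))
    print("retrain      :", retrain_bits(data, 4000))
```

On input whose composition shifts like this, the retraining variant should come out smallest and the first-N-bytes variant largest, which matches the point about tar files. A real comparison would also have to charge each retrained block for its table.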