Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!zaphod.mps.ohio-state.edu!caen!umich!gumby!wmu-coyote!campbell
From: campbell@sol.cs.wmich.edu (Paul Campbell)
Newsgroups: comp.compression
Subject: Re: Is there better lossless compression than LZW?
Message-ID: <1991May30.134838.6653@sol.cs.wmich.edu>
Date: 30 May 91 13:48:38 GMT
References: <2722@pdxgate.UUCP> <1991May28.231022.11197@unislc.uucp> <1991May29.155853.13716@looking.on.ca>
Reply-To: campbell@coyote.cs.wmich.edu
Organization: Western Michigan Univ. Comp. Sci. Dept.
Lines: 13

Please note the Reply-To line: my postings from here get mismarked.

I'm suggesting that you extend the default dictionary (which ordinarily
contains just the initial character set) to contain other things, and that you
find some way to pick the entries which would work 'best' in it. Since you now
have to store at least part of that dictionary as a preamble to the LZ
compressed file, the entries have to save more than they cost to store, but
there should be a way to gain at least some savings. For instance, store any
string which is repeated over x% of the document (yep, you will have to
pre-scan the document first to find these). Another example was
where a friend and I were looking at adaptive and fixed arithmetic coding.
We tried to make a better method by first doing a frequency scan of the
incoming text, which generates the data needed for a good fixed model. Then we
ran an adaptive coder seeded with those counts, for a small savings, since it
had a very good initial weighting to start with.
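
A minimal sketch of the seeded-dictionary idea, in Python. Everything named
here (pick_seeds, initial_table, the fixed substring length, the min_count
threshold standing in for "x%") is an assumption made for illustration, not
something from an existing LZW implementation. One detail the sketch has to
handle: plain LZW grows a match one character at a time, so a seed is only
reachable if its prefixes are in the table too. Both sides therefore add every
prefix of each seed, and only the seeds themselves would need to go in the
preamble.

```python
from collections import Counter

def pick_seeds(data: bytes, length: int = 4, min_count: int = 3, limit: int = 16):
    """Pre-scan: the most frequent substrings of one fixed length (a crude heuristic)."""
    counts = Counter(data[i:i + length] for i in range(len(data) - length + 1))
    return [s for s, c in counts.most_common(limit) if c >= min_count]

def initial_table(seeds=()):
    """The 256 single bytes, then every prefix (length >= 2) of every seed."""
    table = {bytes([i]): i for i in range(256)}
    for s in seeds:
        for n in range(2, len(s) + 1):
            table.setdefault(s[:n], len(table))
    return table

def lzw_encode(data: bytes, seeds=()):
    table = initial_table(seeds)
    out, w = [], b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc                      # keep extending the current match
        else:
            out.append(table[w])        # emit the longest match found
            table[wc] = len(table)      # learn the new string
            w = bytes([b])
    if w:
        out.append(table[w])
    return out

def lzw_decode(codes, seeds=()):
    # Dict keys come out in insertion order, which is also code order.
    table = list(initial_table(seeds))
    if not codes:
        return b""
    w = table[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k < len(table):
            entry = table[k]
        elif k == len(table):           # code refers to the entry being built
            entry = w + w[:1]
        else:
            raise ValueError("bad LZW code")
        out.append(entry)
        table.append(w + entry[:1])
        w = entry
    return b"".join(out)
```

The decoder has to be given the same seeds, in the same order, as the encoder.
Whether this wins overall depends on how many codes the seeds save against the
bytes spent storing them in the preamble, which the sketch does not measure.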
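
The frequency-scan idea can be sketched the same way. This is not an
arithmetic coder: it only adds up the ideal code length (-log2 of each
symbol's probability) that an order-0 adaptive model would assign, which is
roughly what an arithmetic coder driven by that model would output. The
order-0 byte model, the function names, and the scale parameter are all
assumptions made for the example.

```python
import math
from collections import Counter

def primed_counts(data: bytes, scale: int = 32):
    """Frequency scan, scaled down to small counts; every symbol stays >= 1."""
    freq = Counter(data)
    peak = max(freq.values(), default=1)
    return [1 + (freq.get(s, 0) * scale) // peak for s in range(256)]

def adaptive_cost_bits(data: bytes, initial_counts=None) -> float:
    """Ideal code length in bits for an adaptive order-0 model over bytes."""
    # A flat start gives every byte value a count of 1.
    counts = [1] * 256 if initial_counts is None else list(initial_counts)
    total = sum(counts)
    bits = 0.0
    for b in data:
        bits -= math.log2(counts[b] / total)  # cost of coding b right now
        counts[b] += 1                        # then adapt, as the decoder would
        total += 1
    return bits

text = b"the cat and the hat and the bat " * 8
flat = adaptive_cost_bits(text)
primed = adaptive_cost_bits(text, primed_counts(text))
print(f"flat start: {flat:.0f} bits, primed start: {primed:.0f} bits")
```

The primed figure leaves out the cost of sending the initial count table to
the decoder, so the real saving is smaller than the gap printed here.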