Xref: utzoo alt.comp.compression:161 comp.compression:21
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!uwm.edu!convex.csd.uwm.edu!anthony
From: anthony@convex.csd.uwm.edu (Anthony J Stieber)
Newsgroups: alt.comp.compression,comp.compression
Subject: Re: Trying to get maximum compression
Message-ID: <10484@uwm.edu>
Date: 25 Mar 91 02:02:57 GMT
References: <1991Mar24.152106.6333@pegasus.com>
Sender: news@uwm.edu
Followup-To: alt.comp.compression
Organization: University of Wisconsin - Milwaukee
Lines: 30

In article <1991Mar24.152106.6333@pegasus.com> shaw@pegasus.com (Sandy Shaw) writes:
>I am trying to get the maximum compression possible of the following
>32 byte hex file(in ASC): 
>
>f3e9 ec5c 8bec ecdb ece9 ec12 ec3f ecec
>0cbb 8bec 5cdb ecdb 5c9c bbec 8bdb 9cec
>

First do some simple analysis of the data.  There are 32 bytes, of
which 11 are unique.  Replace each data byte with a 4 bit pointer to
one of the 11 bytes.  Now instead of 256 (32*8) bits of uncompressed
data, or 200 (25*8) Huffman compressed bits, there are 216 (11*8+32*4)
bits, which isn't as good.  Now sort the table and compute the
differences between successive entries, starting from zero.  There are
11 differences, and each is less than 64, so they can be stored as 6
bit values.  Thus the table can be compressed from 88 (11*8) bits down
to 66 (11*6).  Total bits is 194 (128+66).
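
The arithmetic above can be checked with a short Python sketch.  This
is only an illustration, not a tuned implementation: the function names
(analyse, encode, decode) are made up for this example, and the table
is assumed to be stored in ascending order so that every difference is
non-negative.

```python
# The 32 data bytes from the quoted article.
DATA = bytes.fromhex(
    "f3e9ec5c8bececdbece9ec12ec3fecec"
    "0cbb8bec5cdbecdb5c9cbbec8bdb9cec"
)


def delta_table(table):
    """Differences between successive sorted entries, starting from zero."""
    return [b - a for a, b in zip([0] + table[:-1], table)]


def analyse(data):
    """Return the bit counts for the table-plus-pointer scheme."""
    table = sorted(set(data))                            # unique byte values
    index_bits = max(1, (len(table) - 1).bit_length())   # bits per pointer
    deltas = delta_table(table)
    delta_bits = max(deltas).bit_length()                # bits per difference
    pointer_total = len(data) * index_bits
    return {
        "unique": len(table),
        "index_bits": index_bits,
        "deltas": deltas,
        "delta_bits": delta_bits,
        "plain_table_total": len(table) * 8 + pointer_total,
        "delta_table_total": len(table) * delta_bits + pointer_total,
    }


def encode(data):
    """Split the data into a delta-coded table and a list of pointers."""
    table = sorted(set(data))
    pointers = [table.index(b) for b in data]
    return delta_table(table), pointers


def decode(deltas, pointers):
    """Rebuild the table by summing the differences, then follow pointers."""
    table, value = [], 0
    for d in deltas:
        value += d
        table.append(value)
    return bytes(table[p] for p in pointers)


stats = analyse(DATA)
print(stats["unique"], stats["index_bits"], stats["delta_bits"])  # 11 4 6
print(stats["plain_table_total"])   # 216
print(stats["delta_table_total"])   # 194
print(decode(*encode(DATA)) == DATA)  # True
```

Like the bit counts in the text, this ignores the few extra bits a real
decoder would need in order to learn how many table entries there are.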

This compression is of course totally dependent on the distribution of
the data, and is probably worthless in real-life situations.
The data has 11 unique bytes and 12 unique nybbles, so a nybble
pointer would also need 4 bits and would save nothing.  Even if there
were only 4 unique nybbles, the 64 nybbles would still take 128 (64*2)
bits to encode.  Thus a nybble table would not generally be more
efficient.
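
A minimal Python sketch of the nybble comparison follows.  It counts
the distinct nybble values directly from the data and prices a table
plus fixed-width pointers for both the byte and the nybble view.  The
helper names (nybbles, index_cost) are invented for this example, and
the cost model (one table entry per unique symbol, fixed-width
pointers) is an assumption matching the scheme described in the post.

```python
# The 32 data bytes from the quoted article.
DATA = bytes.fromhex(
    "f3e9ec5c8bececdbece9ec12ec3fecec"
    "0cbb8bec5cdbecdb5c9cbbec8bdb9cec"
)


def nybbles(data):
    """Split each byte into its high and low 4-bit halves."""
    out = []
    for b in data:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out


def index_cost(symbols, symbol_bits):
    """Bits for an uncompressed table plus fixed-width pointers into it."""
    unique = len(set(symbols))
    index_bits = max(1, (unique - 1).bit_length())
    return unique * symbol_bits + len(symbols) * index_bits


nyb = nybbles(DATA)
print("nybbles:", len(nyb), "unique:", len(set(nyb)))
print("byte table + pointers:  ", index_cost(DATA, 8), "bits")
print("nybble table + pointers:", index_cost(nyb, 4), "bits")
```

With more than 8 distinct nybble values, each pointer is as wide as the
nybble it replaces, so the 64 pointers alone already take 256 bits
before the table is counted.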

However, what's the point?  The code to do any sort of decompression
like this is going to be much bigger than the data.  What are you
actually trying to do?
--
<-:(= Anthony Stieber	anthony@csd4.csd.uwm.edu   uwm!uwmcsd4!anthony