Xref: utzoo alt.comp.compression:186 comp.compression:101 Newsgroups: alt.comp.compression,comp.compression Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!sarah!bingnews!kym From: kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell) Subject: Re: Trying to get maximum compression Message-ID: <1991Mar27.132216.8132@bingvaxu.cc.binghamton.edu> Organization: State University of New York at Binghamton References: <1991Mar24.152106.6333@pegasus.com> Date: Wed, 27 Mar 1991 13:22:16 GMT In article philip@beeblebrox.dle.dg.com (Philip Gladstone) writes: >>>>>> On 24 Mar 91 15:21:06 GMT, shaw@pegasus.com (Sandy Shaw) said: >Sandy> I am trying to get the maximum compression possible of the following >Sandy> 32 byte hex file(in ASC): >Sandy> f3e9 ec5c 8bec ecdb ece9 ec12 ec3f ecec >Sandy> 0cbb 8bec 5cdb ecdb 5c9c bbec 8bdb 9cec >Sandy> The best results I get are using the compact utility(adaptive Huffman). This >Sandy> gives 21.88% compression or 7 bytes saved. Does anyone out there have a >Sandy> scheme that can improve on this? >Yes: >I can compress this file down to 1 byte (one bit really) with the following >algorithm ... While I must have reservations similar to Philip regarding coming up with an encoding of a _single example_ rather than a _class_ of strings, it _is_ a fun intellectual exercise. To get the ball rolling (a bit faster), I've managed a compression of 30% using the appended program (provided you don't count the program as part of the encoding! :-). That is, almost 10 bytes are saved (9.4 to be exact). Tony Warnock at lanl has pointed out to me there are algorithms that turn bitstrings into shift-register machines. Essentially such machines require a very small number of parameters and might prove useful. -kym BTW, LZ can't compress the output file further, although this would seem simple to do. ====================program==================== #include #include int bytes[32]; int nbr; int nbw; int pos; int lastpos; outnum(x) { putchar('0'+(x&1)), nbw++; x>>=1; putchar('0'+(x&1)), nbw++; x>>=1; if(x) putchar('0'+(x&1)), nbw++; } outbit(b){ if(b&1) outnum(pos-lastpos),lastpos=pos+1; ++pos; } main(){ int i,j; nbr=0; nbw=0; pos=0; lastpos=0; for(i=0;i<32;i++)scanf("%2x",&bytes[i]),nbr+=8; for(i=0;i<32;i++)printf("%x ",bytes[i]); putchar('\n'); for(i=0;i<32;i++){ unsigned diff = bytes[i]^0xec; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); diff>>=1; outbit(diff); } putchar('\n'); printf("%d bits read; %d bits written; compression=%g\n", nbr,nbw,(double)nbw/nbr ); exit(0); } ====================log file==================== f3 e9 ec 5c 8b ec ec db ec e9 ec 12 ec 3f ec ec c bb 8b ec 5c db ec db 5c 9c bb ec 8b db 9c ec 000000000011101000010000000010010000001000010100110000000000000000001100010100000000001010100000010010100100000001000010000010000110010001000010000010101000000010010000010000110000 256 bits read; 180 bits written; compression=0.703125 ====================end end end====================