Xref: utzoo comp.compression:33 alt.comp.compression:166
Newsgroups: comp.compression,alt.comp.compression
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!sarah!bingnews!kym
From: kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell)
Subject: Re: theoretical compression factor
Message-ID: <1991Mar25.054838.15588@bingvaxu.cc.binghamton.edu>
Keywords: random
Organization: State University of New York at Binghamton
References: <1991Mar25.031214.25696@bingvaxu.cc.binghamton.edu>
Date: Mon, 25 Mar 1991 05:48:38 GMT

I found one little problem with my algebra and understanding: delta modulation takes N bits into O(N) bits, NOT fewer.

The theoretical compression factor for my binary-strings example would therefore be 2p(1-p), where p is the proportion of 0's (or 1's; it doesn't matter, since the expression is symmetric in p and 1-p).

For /usr/dict/words the proportion of 1's is about 56% (considering only the 7 least significant bits of each char). This should give a compression factor of about 49% with the method outlined.

Tables of random ASCII digits contain a proportion of about .38 1's (again taking only the 7 least significant bits), which leads to a similar compression factor of .47.

If we massage digit strings into 4-bit groups (by subtracting the '0') we find only about 6% 1's, and therefore the compression factor would be about .1 (i.e. a 1Mb file of random digits -> 100Kb).

Again, if there are any glaring errors, please comment.

-kym
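
For anyone who wants to check the arithmetic, here is a minimal Python sketch. The names compression_factor and proportion_of_ones are illustrative labels, not anything from the post. The sketch only evaluates the 2p(1-p) expression and counts 1 bits in the low-order bits of each byte; it does not implement the compression method itself.

```python
def compression_factor(p: float) -> float:
    """Evaluate 2p(1-p), where p is the proportion of 1 bits (or of 0 bits)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def proportion_of_ones(data: bytes, bits_per_symbol: int = 7) -> float:
    """Fraction of 1 bits, counting only the low-order bits_per_symbol bits of each byte."""
    if not data:
        raise ValueError("data must be non-empty")
    if not 1 <= bits_per_symbol <= 8:
        raise ValueError("bits_per_symbol must be between 1 and 8")
    mask = (1 << bits_per_symbol) - 1  # e.g. 0x7F for 7 bits
    ones = sum(bin(b & mask).count("1") for b in data)
    return ones / (len(data) * bits_per_symbol)


if __name__ == "__main__":
    # Proportions quoted in the post; they are inputs here, not re-measured.
    quoted = [
        ("/usr/dict/words, 7 bits", 0.56),
        ("random ASCII digits, 7 bits", 0.38),
        ("digits as 4-bit groups", 0.06),
    ]
    for label, p in quoted:
        print(f"{label}: p = {p:.2f}, factor = {compression_factor(p):.2f}")
```

With the quoted proportions, the formula evaluates to roughly 0.49, 0.47 and 0.11. Because 2p(1-p) is symmetric, compression_factor(p) and compression_factor(1 - p) agree, so it makes no difference whether 0's or 1's are counted. To measure p on a real file, something like proportion_of_ones(open(path, "rb").read()) would do, with path being whatever file is at hand.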