Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!ncar!csn!boulder!spot.Colorado.EDU!weverka From: weverka@spot.Colorado.EDU (Robert T. Weverka) Newsgroups: comp.compression Subject: Re: theoretical compression factor Message-ID: <1991Apr2.122910.21829@colorado.edu> Date: 2 Apr 91 12:29:10 GMT References: <1991Mar29.031127.9128@bingvaxu.cc.binghamton.edu> <20144@alice.att.com> <1991Apr2.034441.28170@bingvaxu.cc.binghamton.edu> Sender: news@colorado.edu (The Daily Planet) Organization: University of Colorado, Boulder Lines: 22 Nntp-Posting-Host: spot.colorado.edu In article <1991Apr2.034441.28170@bingvaxu.cc.binghamton.edu> kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell) writes: ...heavily edited... >The little program I used previously to demonstrate that `p lg p' could be >beaten by an historyless technique does not, to my mind, build any model of >the data. > >In article <20135@alice.att.com> jj@alice.att.com (jj, like it or not) writes: >>First, the question of "how much can I compress X" must >>be qualified. If you have a historyless compression, >>you can compress up to the entropy ( sum of -p log p) bound, > ^^^^^^ >>where p is the probability of each token. > >Is p lg p a hard-and-fast bound or not? I still think it's an average. > >-kym It is an Expectation value. log(1/p) is the number of bits required for each symbol. The Expectation value of this for different probabilities is E(log(1/pi)) (Read pi as p sub i) E(log(1/pi))= ( sum of -p log p)