Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!caen!kuhub.cc.ukans.edu!maverick.ksu.ksu.edu!iowasp.physics.uiowa.edu!ns-mx!pyrite.cs.uiowa.edu From: jones@pyrite.cs.uiowa.edu (Douglas W. Jones,201H MLH,3193350740,3193382879) Newsgroups: comp.compression Subject: Re: theoretical compression factor Message-ID: <5239@ns-mx.uiowa.edu> Date: 3 Apr 91 19:46:35 GMT References: <46618@ut-emx.uucp> Sender: news@ns-mx.uiowa.edu Lines: 36 From article <46618@ut-emx.uucp>, by readdm@ccwf.cc.utexas.edu (David M. Read): >>Is p lg p a hard-and-fast bound or not? I still think it's an average. > > You can get arbitrarily close to that sum, but you may not go under > it, unless you *increase* entropy somewhere else. Right, but, speaking as a Computer Scientist with a degree in Physics ... Information theoretic entropy is mathematically identical to quantum mechanical entropy. I think that this is one of the biggest surprises in 20th century science, given that Shannon didn't set out to apply statistical thermodynamics to information flow problems. Once he recognized that the mathematics was the same, he was quite happy to borrow names like entropy for the information theoretic quantities that he needed to name. Back to the subject, the entropy of a message, is a hard and fast bound on the compression only in a statistical sense. If your compression algorithm compresses some message to fewer bits than the entropy indicates, then your algorithm must compress some other message to more than p log p bits. This is the tradeoff that parallels the second law of thermodynamics. In information theory, the entropy of a message can only be measured relative to a statistical model of the source. For example, we can use observed letter frequencies to build a model for English, but having done so, we have no guarantee that the language used from that point on will correspond to the frequencies we measured. Also, better source models, for example, letter pair frequencies or word frequencies, lead to lower entropy values. Doug Jones jones@cs.uiowa.edu