Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!samsung!uunet!ns-mx!pyrite.cs.uiowa.edu
From: jones@pyrite.cs.uiowa.edu (Douglas W. Jones,201H MLH,3193350740,3193382879)
Newsgroups: comp.compression
Subject: Re: Modeling vs encoding (Re: Lempel-Ziv v/s huffman encoding)
Message-ID: <5173@ns-mx.uiowa.edu>
Date: 1 Apr 91 15:45:56 GMT
References: <12546@pt.cs.cmu.edu>
Sender: news@ns-mx.uiowa.edu
Lines: 28

From article <12546@pt.cs.cmu.edu>, by tgl@g.gp.cs.cmu.edu (Tom Lane):
> 
> A fully adaptive model starts from innocuous assumptions (say, all symbols
> equally probable).

Since the compresser and the expander must be primed with identical
assumptions about the source alphabet, the initial assumptions may
be used as a key for encryption (I first made this comment in my CACM
article on splay-tree based encryption, cited in the recent bibliographic
posting on this newsgroup).

> 
> A zeroth-order model just remembers the (independent) probability of each
> symbol, and codes accordingly.
> 
> A first-order model keeps track of the probability of each symbol *given the
> previous symbol*.

And as I commented in the same CACM article, fractional orders are quite
practical.  A less-than-first-order model that is better than zeroth-order
doesn't remember the previous symbol, but it remembers something about it.
For example, remembering if the previous symbol was a letter, space, or
punctuation, or remembering the least-significant few bits of the character
code.  As the data in my CACM article shows, retaining a small amount of
context information can give a compressor a big boost in performance.

				Doug Jones
				jones@cs.uiowa.edu