Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!uakari.primate.wisc.edu!polyslo!sdsu!crash!pnet01!uzun
From: uzun@pnet01.cts.com (Roger Uzun)
Newsgroups: comp.sys.amiga.tech
Subject: Re: huffman encoding
Message-ID: <486@crash.cts.com>
Date: 5 Oct 89 14:56:29 GMT
Sender: root@crash.cts.com
Organization: People-Net [pnet01], El Cajon CA
Lines: 25

I had misunderstood huffman encoding then.  I thought that regardless of
the actual frequency of a token in a file, the bit pattern and bit length
used to encode that token was dependent only on its relative frequency to
other tokens in the file.  Say file A has 10 unique tokens in it and
the most probable token occurs 90% of the time.  File B also has 10 unqie
tokens in it and the most probable token occurs 30% of the time.  It was
my understanding of huffman that the bit patterns used to encode the
most probable token in each of these cases would be the same. 
Apparently you are saying this is not true, perhaps I am describing
a "subset" of huffman encoding that would use constant trees, and
of course be less efficient.  Basically in my scheme of things any file
that has all 256 possible bytes represented in it (the file contains
256 unique tokens) would have the same bit patterns/bit lengths used
to encode its tokens.  The tokens associated with the bit patterns/bit
lengths would vary depending on their relative probabilities in that one
file, but the encoding bit patterns/bit lengths would remain the same
between the 2 files.
HMMM I thought huffman only considered the RELATIVE probabilities of
tokens in a file, perhaps you are thinking of a more advanced algorithm
based on huffman encoding.
-Roger Uzun

UUCP: {hplabs!hp-sdd ucsd nosc}!crash!pnet01!uzun
ARPA: crash!pnet01!uzun@nosc.mil
INET: uzun@pnet01.cts.com