Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!ginosko!usc!apple!oliveb!tymix!tardis!jms From: jms@tardis.Tymnet.COM (Joe Smith) Newsgroups: comp.sys.amiga.tech Subject: Re: huffman encoding Summary: Please explain what you're asking Message-ID: <664@tardis.Tymnet.COM> Date: 5 Oct 89 00:36:58 GMT References: <477@crash.cts.com> Reply-To: jms@tardis.Tymnet.COM (Joe Smith) Organization: McDonnell Douglas Field Service Co, San Jose CA Lines: 43 In article <477@crash.cts.com> uzun@pnet01.cts.com (Roger Uzun) writes: >What I was saying was that the bit patterns/bit lengths of the codes used >to represent the tokens in a file where there are 256 unique tokens is a >constant and will work for all files. ^^^^^^^ No, it's not. If you have 8-bit bytes (256 possible tokens), they can be encoded as 3 bits each, if only 8 of the possible 256 tokens are actually used and have equal probability. If one token is used 99% of the time and the remaining 255 tokens have equal distribution, then huffman encoding will use a single bit for the majority token, and 9 bits for all the rest. There's no way you can call that a constant. After reading all the previous followups, I *still* don't understand what you are asking for. Here is my guess as to you are asking: Q1) Given a file that is several thousand bytes long and uses 8-bit bytes, what is the best huffman tree for encoding it? A1) Using a "constant tree" will never be optimal, you have to analyze the data, count the number of times each character occurs, (such as "how many A's, how many B's, etc) and then build the tree. Q2) If the file is very short, such as 256 bytes or less in length, can a constant tree be used? A2) I say "no", you still have to analyze the data. I'd say the table you are asking for cannot exist, by definition. >It is true that the bit patterns and bit lengths used to represent tokens >varies depending on # of unique tokens in a file, but if that # is assumed >to be 256, you have no problem. Non sequitur. I understand that at all. -- Joe Smith (408)922-6220 | SMTP: JMS@F74.TYMNET.COM or jms@tymix.tymnet.com McDonnell Douglas FSCO | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms PO Box 49019, MS-D21 | PDP-10 support: My car's license plate is "POPJ P," San Jose, CA 95161-9019 | narrator.device: "I didn't say that, my Amiga did!"