Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!rpi!uupsi!sunic!kth.se!cyklop.nada.kth.se!news
From: d88-jwa@byse.nada.kth.se (Jon W{tte)
Newsgroups: comp.compression
Subject: Re: How StuffIt works
Message-ID: <D88-JWA.91Jun28193953@byse.nada.kth.se>
Date: 28 Jun 91 17:39:57 GMT
References: <8824@jhunix.HCF.JHU.EDU>
Sender: news@nada.kth.se (Mr News)
Organization: Royal Institute of Technology, Stockholm, Sweden
Lines: 28
In-reply-to: hsu_wh@jhunix.HCF.JHU.EDU's message of 26 Jun 91 01:12:59 GMT

hsu_wh@jhunix.HCF.JHU.EDU (William H Hsu) writes:

	   First of all, why is Stuffit so relatively inefficient?  I can't see

	   According to the October 1990 MacTutor article on LZW on the Mac,
   Stuffit uses the UNIX compress 14-bit scheme.  From the graph recently

Well, 14 bits is less than UNIX compress uses - it goes up to 16 bits,
with a hash table that takes about half a meg! StuffIt may use the same
algorithm, but you have to spend some memory to make it behave, too!

	   Another question: how does Stuffit (and other compression programs),
   in Ray Lau's words, "determine the characteristics of the input data"?  Can

StuffIt tries several different methods and chooses the one that works
best. Trying Huffman doesn't require actually doing the coding; collecting
the frequencies and building the tree is enough to know how large the
coded output will be.
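
To make the "no need to actually code it" point concrete, here is a minimal
sketch in Python. It is not StuffIt's code; the names huffman_coded_bits and
worth_huffman are made up for illustration, and the estimate ignores the cost
of storing the code table. It relies on the fact that the total coded length
equals the sum of the weights of all internal nodes in the Huffman tree.

```python
import heapq
from collections import Counter


def huffman_coded_bits(data: bytes) -> int:
    """Estimate the Huffman-coded payload size of data, in bits.

    Only the byte frequencies are needed; no codes are assigned and
    nothing is encoded. Code table overhead is NOT included.
    """
    freqs = Counter(data)
    if not freqs:
        return 0
    if len(freqs) == 1:
        # A single distinct symbol still costs one bit per occurrence.
        return len(data)

    heap = list(freqs.values())
    heapq.heapify(heap)
    total_bits = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol below them one level
        # deeper, which adds (a + b) bits to the coded output.
        total_bits += a + b
        heapq.heappush(heap, a + b)
    return total_bits


def worth_huffman(data: bytes) -> bool:
    """Hypothetical decision rule: is the estimate smaller than the raw size?"""
    estimated_bytes = (huffman_coded_bits(data) + 7) // 8
    return estimated_bytes < len(data)
```

A program that tries several methods could compute an estimate like this for
each candidate and then run only the winner for real.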

Getting the data "type" from a set of data is shaky, but you can set up
some rules for what characterizes different kinds of data. Some formats
include magic numbers/headers which can be recognized. On the Mac this is
easy, since it has a typed file system; the info's already there.
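
A rough sketch of the magic number approach, in Python. The table and the
function name guess_kind are illustrative assumptions, not what StuffIt
does; the table lists only a few well-known signatures, and the fallback
"mostly printable" rule with its 95% threshold is just one example of the
kind of shaky heuristic mentioned above.

```python
# Illustrative table of (leading bytes, label) pairs; far from complete.
MAGIC = [
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\x1f\x9d", "compress"),   # UNIX compress (.Z) output
    (b"%!PS", "postscript"),
]


def guess_kind(head: bytes) -> str:
    """Guess a data kind from the first bytes of a file."""
    for magic, kind in MAGIC:
        if head.startswith(magic):
            return kind

    # Fallback heuristic: if nearly all bytes are printable ASCII or
    # common whitespace, call it text. The threshold is arbitrary.
    if head:
        printable = sum(32 <= b < 127 or b in (9, 10, 13) for b in head)
        if printable / len(head) > 0.95:
            return "text"
    return "unknown"
```

Passing the first few hundred bytes of a file is normally enough for rules
like these.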

--
						Jon W{tte
						h+@nada.kth.se
						- Speed !