Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!rpi!uupsi!sunic!kth.se!cyklop.nada.kth.se!news
From: d88-jwa@byse.nada.kth.se (Jon W{tte)
Newsgroups: comp.compression
Subject: Re: How StuffIt works
Message-ID:
Date: 28 Jun 91 17:39:57 GMT
References: <8824@jhunix.HCF.JHU.EDU>
Sender: news@nada.kth.se (Mr News)
Organization: Royal Institute of Technology, Stockholm, Sweden
Lines: 28
In-reply-to: hsu_wh@jhunix.HCF.JHU.EDU's message of 26 Jun 91 01:12:59 GMT

hsu_wh@jhunix.HCF.JHU.EDU (William H Hsu) writes:

> First of all, why is Stuffit so relatively inefficient? I can't see

> According to the October 1990 MacTutor article on LZW on the Mac,
> Stuffit uses the UNIX compress 14-bit scheme. From the graph recently

Well, 14 bits is less than UNIX compress uses: it goes up to 16-bit
codes, with a hash table that takes about half a meg! StuffIt may use
the same algorithm, but you have to spend some memory to make it
behave, too!

> Another question: how does Stuffit (and other compression programs),
> in Ray Lau's words, "determine the characteristics of the input data"? Can

StuffIt tries several different methods and chooses the one that works
best. Trying Huffman doesn't mean actually doing the coding; just
collecting frequencies and building the tree is enough to know how
large the coded output will be.

Getting the data "type" from a set of data is shaky, but you can set
up some rules for what characterizes different kinds of data. Some
formats include magic numbers or headers that can be recognized. On
the Mac this is easy, since it has a typed file system; the info's
already there.

--
Jon W{tte h+@nada.kth.se - Speed !
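
To illustrate the Huffman point in the post above, here is a rough Python sketch of sizing a Huffman coding without ever emitting a code word. It relies on a standard property: the total coded length in bits equals the sum of the weights of all internal nodes created while building the tree. The names huffman_payload_bits and pick_method, and the 256-byte table overhead, are assumptions made up for this example. This is not StuffIt's actual logic, only one way the "try it and see how big it gets" idea could look.

```python
import heapq
from collections import Counter


def huffman_payload_bits(data: bytes) -> int:
    """Estimate the Huffman-coded size of data in bits, without encoding it.

    Only the byte frequencies are needed. Each merge of the two lightest
    subtrees adds one bit to every symbol beneath them, so summing the
    merged weights gives the total coded length.
    """
    freq = Counter(data)
    if len(freq) <= 1:
        # Empty input costs nothing; a single distinct symbol still
        # needs one bit per occurrence.
        return len(data)
    heap = list(freq.values())
    heapq.heapify(heap)
    total_bits = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total_bits += merged
        heapq.heappush(heap, merged)
    return total_bits


def pick_method(data: bytes, table_overhead_bytes: int = 256) -> str:
    """Hypothetical chooser: use Huffman only if it beats storing raw.

    table_overhead_bytes is an assumed cost for shipping the code table
    along with the data; a real archiver would know its own figure.
    """
    huffman_bytes = (huffman_payload_bits(data) + 7) // 8 + table_overhead_bytes
    return "huffman" if huffman_bytes < len(data) else "store"


if __name__ == "__main__":
    sample = b"abracadabra"
    print(huffman_payload_bits(sample), "bits for", len(sample), "bytes of input")
    print(pick_method(b"a" * 1000))
```

The same pattern extends to more candidates: estimate (or actually run) each method, add its fixed overhead, and keep the smallest result.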