Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2 9/18/84; site cca.UUCP Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!cca!g-rh From: g-rh@cca.UUCP (Richard Harter) Newsgroups: net.lang.c,net.unix Subject: Re: Re: Compaction Algorithm (pack vs compress) Message-ID: <6448@cca.UUCP> Date: Fri, 28-Feb-86 23:06:37 EST Article-I.D.: cca.6448 Posted: Fri Feb 28 23:06:37 1986 Date-Received: Sat, 1-Mar-86 22:39:45 EST References: <207@pierce.UUCP> <3261@sun.uucp> <> Reply-To: g-rh@cca.UUCP (Richard Harter) Organization: Computer Corp. of America, Cambridge Lines: 19 Xref: watmath net.lang.c:8020 net.unix:7247 Summary: Just a note on packing. Recently people in one of the newsgroups were speculating about one character predictor packing. We implemented it and found that it works as well as pack (for text files). The idea is that you make a first pass through the file and determine, for each character, its most probable successor. You make a second pass through the file and record one bit for each character, F if the character is not followed by its most probable successor, and T if it is. You also record the incorrect guesses. You store the file as a bit array followed by the wrongly predicted characters. The point of these shenanigans is that the unpacking is fast. In our context the determination of the predicted characters only has to be done once so packing is also fast. Our results are that this style of packing is as efficient as PACK for source code and English text. The bit array is an automatic 12.5% penalty. Prediction rate is around 50% for source code (depending on language and style -- higher for operating systems that store files with trailing blanks in fixed records). Not bad for quick and dirty. However it is of no use for object code. Richard Harter, SMDS Inc.