Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site cca.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!cca!g-rh
From: g-rh@cca.UUCP (Richard Harter)
Newsgroups: net.lang.c,net.unix
Subject: Re: Re: Compaction Algorithm (pack vs compress)
Message-ID: <6448@cca.UUCP>
Date: Fri, 28-Feb-86 23:06:37 EST
Article-I.D.: cca.6448
Posted: Fri Feb 28 23:06:37 1986
Date-Received: Sat, 1-Mar-86 22:39:45 EST
References: <207@pierce.UUCP> <3261@sun.uucp> <>
Reply-To: g-rh@cca.UUCP (Richard Harter)
Organization: Computer Corp. of America, Cambridge
Lines: 19
Xref: watmath net.lang.c:8020 net.unix:7247
Summary: 

Just a note on packing.  Recently people in one of the newsgroups were
speculating about one character predictor packing.  We implemented it
and found that it works as well as pack (for text files).  The idea is
that you make a first pass through the file and determine, for each
character, its most probable successor.  You make a second pass through
the file and record one bit for each character, F if the character is not
followed by its most probable successor, and T if it is.  You also record
the incorrect guesses.  You store the file as a bit array followed by the
wrongly predicted characters.  The point of these shenanigans is that the
unpacking is fast.  In our context the determination of the predicted
characters only has to be done once so packing is also fast.  Our results
are that this style of packing is as efficient as PACK for source code
and English text.  The bit array is an automatic 12.5% penalty.  Prediction
rate is around 50% for source code (depending on language and style --
higher for operating systems that store files with trailing blanks in
fixed records).  Not bad for quick and dirty.  However it is of no use
for object code.

	Richard Harter, SMDS Inc.