Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!talcott!harvard!seismo!rochester!ritcv!mjl
From: mjl@ritcv.UUCP (Mike Lutz)
Newsgroups: net.unix
Subject: Re: Compaction Algorithm (pack vs compress; block savings)
Message-ID: <9410@ritcv.UUCP>
Date: Sat, 1-Mar-86 13:35:38 EST
Article-I.D.: ritcv.9410
Posted: Sat Mar 1 13:35:38 1986
Date-Received: Mon, 3-Mar-86 00:36:56 EST
References: <207@pierce.UUCP> <3261@sun.uucp> <226@uvacs.UUCP>
Reply-To: mjl@ritcv.UUCP (Michael Lutz)
Distribution: net
Organization: Rochester Institute of Technology, Rochester, NY
Lines: 51
Keywords: pack compress compact

In article <226@uvacs.UUCP> rwl@uvacs.UUCP (Ray Lubinsky) writes:
>I don't know about your system, but my Vax running 4.2 BSD permits
>internal fragmentation, so it's disk block savings that count.

Agreed that block savings are what we want, but wait a bit:

>Now, I'm not entirely familiar with ``compress'', but I can compare with
>``compact''.  When I created a file of 2000 bytes (one identical character
>per line plus a newline), ``compress'' boasted of > 85% compression, while
>pack only claimed 50% compression, but each of the results consumed the same
>amount of blocks.  Hence the same effective compression.

So who goes about compressing 2K files as a matter of course?  With a
1K/8K file system, your experiment is limited by the granularity of the
file allocation to the point where the results are meaningless.  You're
limited to only 3 possible outcomes:

	0 blocks (if there is no information content; highly unlikely).
	1 block  (if compression gets the byte size down to 1-1024 bytes).
	2 blocks (if compression gets the byte size down to 1025-2000 bytes).

Basically, you get either 50% compression or 0% compression, no matter
what algorithm you use.

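
The granularity argument above can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, and it assumes the small file is stored entirely in 1K fragments (the 1K/8K file system mentioned above), ignoring how larger files mix full blocks and fragments.

```python
import math

# Assumption: 1024-byte fragments, as on the 1K/8K file system discussed above.
FRAGMENT_BYTES = 1024


def fragments_used(size_bytes, fragment_bytes=FRAGMENT_BYTES):
    """Number of fragments a small file of size_bytes occupies on disk."""
    return math.ceil(size_bytes / fragment_bytes)


def block_savings(original_bytes, compressed_bytes):
    """Fraction of disk fragments saved, as opposed to fraction of bytes saved."""
    before = fragments_used(original_bytes)
    after = fragments_used(compressed_bytes)
    return (before - after) / before


# A 2000-byte file cut by 85% (to 300 bytes) or by 50% (to 1000 bytes)
# ends up in a single fragment either way.
for compressed in (300, 1000, 1500):
    print(compressed, fragments_used(compressed), block_savings(2000, compressed))
```

Both the 85% and the 50% byte reductions come out as the same 50% saving in fragments, while anything that leaves more than 1024 bytes saves nothing.
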
Let's look at a binary file with more meat: libc.a

The following list shows the size in blocks of our 4.2 libc.a, and the
results of compressing it using compress, pack, and compact (the latter
being a notable CPU hog):

	112 libc.a
	 80 libc.a.pack
	 80 libc.a.compact
	 64 libc.a.compress

Note that libc.a.compress saves an additional 16 blocks over the two
Huffman-based programs.  Given that compress takes only about 24% more
CPU time than pack, and the compressed file requires only 80% of the
space of the packed file (64 blocks versus 80), I choose compress with
no hesitation for archival applications (which is why we compress, not
pack, our {net,mod}.sources archives).

			*	*	*	*	*

On a related note, try encrypting a file (using crypt(1)) followed by
your favorite compression filter, and watch the file expand.  Not
surprising, as crypt spreads the information content evenly over the
entire file, but interesting nonetheless.  Of course, compression
followed by encryption does not suffer from this problem.

-- 
Mike Lutz	Rochester Institute of Technology, Rochester NY
UUCP:		{allegra,seismo}!rochester!ritcv!mjl
CSNET:		mjl%rit@csnet-relay.ARPA
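
The closing remark about crypt(1) can be illustrated with a rough sketch. Two substitutions here are assumptions and not what the article used: Python's zlib (DEFLATE) stands in for compress(1), and seeded pseudo-random bytes stand in for crypt output, on the premise that well-encrypted data looks statistically like noise. The helper name is hypothetical.

```python
import random
import zlib


def pseudo_encrypted(n, seed=1986):
    """Stand-in for encrypted data: n pseudo-random bytes (an assumption, not crypt(1))."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


# Highly redundant plain text versus noise of the same length.
text = b"the quick brown fox jumps over the lazy dog\n" * 1000
noise = pseudo_encrypted(len(text))

text_packed = zlib.compress(text)    # redundancy is available, so this shrinks
noise_packed = zlib.compress(noise)  # no redundancy left, so only overhead is added

print("text: ", len(text), "->", len(text_packed))
print("noise:", len(noise), "->", len(noise_packed))
```

The redundant text shrinks substantially, while the noise comes out slightly larger than it went in, which is the expansion described above. Compressing first and encrypting afterwards avoids this, since the compressor sees the data while its redundancy is still visible.
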