Path: utzoo!attcan!uunet!husc6!mailrus!uflorida!gatech!emcard!stiatl!john
From: john@stiatl.UUCP (John DeArmond)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: Is uncompression faster than disk I/O?
Message-ID: <2863@stiatl.UUCP>
Date: 23 Jan 89 07:28:26 GMT
References: <14227@princeton.Princeton.EDU> <929@novavax.UUCP>
Reply-To: john@stiatl.UUCP (John DeArmond)
Organization: Sales Technologies Inc., Atlanta, GA
Lines: 59

In article <929@novavax.UUCP> nanook@novavax.UUCP (Keith Dickinson) writes:
>in article <14227@princeton.Princeton.EDU>, nr@notecnirp.Princeton.EDU (Norman Ramsey) says:
>> Someone suggested to me that it might pay off to store my data files
>> in compressed format, then uncompress them when I get ready to use
>> them.  
>> 
>> Norman Ramsey
>> nr@princeton.edu
>
>Norman. If you were running your software off of floppy, I'd say it's possible.
>But even then I'd suggest that you find some way to obtain either the quick
>compress routines from PKPAK or from SEA (ARC). 
>
>Your inherent problem is that data compression takes up more CPU cycles than
>writing the file probably ever could. If you're writing to a hard disk, I'd say
>that there was NO way you could compress faster than you could write.
>
It really does not matter that data compression takes more machine 
cycles than writes.  What does matter is whether or not a given
block of data can be compressed during the interval a program would
wait for disk I/O.
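
To put rough numbers on that, here is a minimal sketch of the break-even
arithmetic in Python.  The function name, parameters, and every rate used
below are hypothetical placeholders, not measurements from any machine;
plug in your own.

```python
def load_time(raw_bytes, disk_rate, ratio=1.0, decode_rate=None, overlapped=False):
    """Estimated seconds to get raw_bytes of usable data into memory.

    disk_rate   -- bytes/second the disk delivers (assumed constant)
    ratio       -- compression ratio, raw size / stored size (1.0 = uncompressed)
    decode_rate -- bytes/second of OUTPUT the decompressor produces, or None
    overlapped  -- True if decoding runs while the next block is being read
    """
    read = raw_bytes / ratio / disk_rate      # fewer bytes come off the disk
    if decode_rate is None:
        return read                           # plain read, no CPU work
    decode = raw_bytes / decode_rate          # CPU cost of expanding the data
    # Overlapped: the slower of the two dominates.  Otherwise they add up.
    return max(read, decode) if overlapped else read + decode


# Hypothetical figures: 220,000-byte file, 100,000 bytes/s disk,
# 4:1 compression, decoder producing 1,000,000 bytes/s.
raw = load_time(220_000, 100_000)
packed = load_time(220_000, 100_000, ratio=4.0, decode_rate=1_000_000)
hidden = load_time(220_000, 100_000, ratio=4.0, decode_rate=1_000_000, overlapped=True)
```

With those made-up rates the compressed load wins even without overlap.
Compression loses only when the decode time exceeds the read time it saves.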

A concrete example:

About a year ago a friend and I wrote a high resolution Mandelbrot map
generator.  This program calculates maps to 8-bit resolution and thus
stores a single pixel per byte.  An EGA resolution map occupies about
220k bytes.  Aside from the obvious hassles and waste in storing
and transmitting such maps, the load time into the display program
is significant.

I implemented a simple run-length (RLE) compression scheme.  Depending on
the complexity of the map, the data file shrinks by anywhere from
50% (a factor of 2) to a factor of 5 or more.
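
For anyone who has not written one, a byte-oriented run-length coder can
look something like the sketch below.  This is not the format the Mandelbrot
program used; the (count, value) pair layout and the function names are
assumptions chosen for illustration.

```python
def rle_encode(data: bytes) -> bytes:
    """Pack data as (count, value) byte pairs; each run is capped at 255."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        value = data[i]
        run = 1
        # Extend the run while the next byte matches and the count fits in a byte.
        while i + run < n and run < 255 and data[i + run] == value:
            run += 1
        out.append(run)
        out.append(value)
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for j in range(0, len(packed), 2):
        count, value = packed[j], packed[j + 1]
        out.extend(bytes([value]) * count)
    return bytes(out)


pixels = bytes([7] * 300 + [9] * 20)   # long flat regions, as in a smooth map
packed = rle_encode(pixels)            # 3 pairs: (255,7) (45,7) (20,9)
assert rle_decode(packed) == pixels
```

Note the worst case: data with no repeated neighbors doubles in size under
this layout, which is why the gain depends so much on the complexity of
the map.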

I've found that on my Compaq 386, it is MUCH faster to decompress
on the fly than to read the raw data.  The CPU is so much faster than
the I/O system that there is really no contest.  It's interesting to 
note that even with the image file totally in disk cache space,
the decompressing program runs faster than the program that uses raw
pixel maps.  The difference is not large but nonetheless significant.


So Norman, the answer to your question is: it depends!  The
general purpose algorithms such as LZW (ARC, compress, etc) and
Huffman do a pretty good job in the general case, but you can do
much better if you can exploit some characteristic of your data
set.  KISS principles apply fully here.  Many times a very 
simple algorithm will achieve a high fraction of the compression
of more complicated routines but with a vastly smaller implementation
and execution time.  So, take your compiler in one hand and an editor in the 
other and EXPERIMENT!
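
One way to start experimenting is to compare a trivial coder against a
general purpose one on your own data.  The sketch below only counts the
bytes a (count, value) run-length coder would emit and sets that beside
Python's zlib.  Assumptions: zlib is DEFLATE, standing in here for a general
purpose compressor (it is not the LZW used by ARC or compress); the banded
test image and all function names are made up for illustration.

```python
import zlib

WIDTH, HEIGHT = 640, 350          # EGA high resolution, one byte per pixel


def banded_image(band=40):
    """Synthetic stand-in for a smooth map: vertical bands of equal pixels."""
    row = bytes((x // band) % 256 for x in range(WIDTH))
    return row * HEIGHT


def rle_size(data, max_run=255):
    """Bytes a (count, value) run-length coder would emit for data."""
    size = 0
    i = 0
    n = len(data)
    while i < n:
        run = 1
        while i + run < n and run < max_run and data[i + run] == data[i]:
            run += 1
        size += 2                 # one count byte plus one value byte per run
        i += run
    return size


def report(data):
    """Raw, run-length, and zlib sizes in bytes for the same input."""
    return {"raw": len(data), "rle": rle_size(data), "zlib": len(zlib.compress(data))}


sizes = report(banded_image())
for name, size in sizes.items():
    print(f"{name:5s} {size:7d} bytes")
```

Swap banded_image() for the contents of a real data file and time both
decoders as well; size is only half of the question.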

John

-- 
John De Armond, WD4OQC                     | "I can't drive 85!"
Sales Technologies, Inc.    Atlanta, GA    | Sammy Hagar driving 
...!gatech!stiatl!john                     | thru Atlanta!