Path: utzoo!attcan!uunet!husc6!mailrus!uflorida!gatech!emcard!stiatl!john
From: john@stiatl.UUCP (John DeArmond)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: Is uncompression faster than disk I/O?
Message-ID: <2863@stiatl.UUCP>
Date: 23 Jan 89 07:28:26 GMT
References: <14227@princeton.Princeton.EDU> <929@novavax.UUCP>
Reply-To: john@stiatl.UUCP (John DeArmond)
Organization: Sales Technologies Inc., Atlanta, GA
Lines: 59

In article <929@novavax.UUCP> nanook@novavax.UUCP (Keith Dickinson) writes:
>in article <14227@princeton.Princeton.EDU>, nr@notecnirp.Princeton.EDU (Norman Ramsey) says:
>> Someone suggested to me that it might pay off to store my data files
>> in compressed format, then uncompress them when I get ready to use
>> them.
>>
>> Norman Ramsey
>> nr@princeton.edu
>
>Norman. If you were running your software off of floppy, I'd say it's possible.
>But even then I'd suggest that you find some way to obtain either the quick
>compress routines from PKPAK or fron Sea (Arc).
>
>You inherant problem is that data compression takes up more cpu cycles than
>writing the file probably ever could. If your writing to a Hard disk. I'd say
>that there was NO way you could compress faster than you could write.
>

It really does not matter that data compression takes more machine
cycles than a disk write does. What does matter is whether a given
block of data can be compressed during the interval in which the
program would otherwise be waiting for disk I/O.

A concrete example: About a year ago a friend and I wrote a
high-resolution Mandelbrot map generator. This program calculates maps
to 8-bit resolution and thus stores a single pixel per byte. An
EGA-resolution map occupies about 220k bytes. Aside from the obvious
hassles and waste in storing and transmitting such maps, the load time
into the display program is significant. I implemented a simple
run-length (RLE) compression. Depending on the complexity of the map,
the data file shrinks by anywhere from 50% to a factor of 5 or more.
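
For readers who want something to experiment with, here is a minimal sketch of a byte-oriented run-length scheme of the general kind described above. It is an illustration only: the post does not describe the actual file format of the Mandelbrot program, so the (count, value) pair layout, the 255-byte run cap, and the names `rle_encode` and `rle_decode` are all assumptions.

```python
def rle_encode(data: bytes) -> bytes:
    """Pack data as (count, value) byte pairs. Assumed format, not the post's."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        value = data[i]
        run = 1
        # Extend the run while the next byte matches; cap at 255 so the
        # count fits in a single byte.
        while i + run < n and run < 255 and data[i + run] == value:
            run += 1
        out.append(run)
        out.append(value)
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the original bytes."""
    if len(packed) % 2 != 0:
        raise ValueError("packed data must be a whole number of pairs")
    out = bytearray()
    for k in range(0, len(packed), 2):
        count, value = packed[k], packed[k + 1]
        out.extend(bytes([value]) * count)
    return bytes(out)


if __name__ == "__main__":
    # A made-up scan line: one long flat region, a short band, one odd pixel.
    row = bytes([0] * 300 + [7] * 20 + [3])
    packed = rle_encode(row)
    assert rle_decode(packed) == row
    print(len(row), len(packed))  # prints "321 8"
```

Note that this naive pair format doubles the size of data with no repeated bytes at all, so how well it does depends entirely on how long the runs in the image are.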

I've found that on my Compaq 386, it is MUCH faster to decompress on
the fly than to read the raw data. The CPU is so much faster than the
I/O system that there is really no contest. Interestingly, even with
the image file held entirely in the disk cache, the decompressing
program runs faster than the one that reads raw pixel maps. The
difference is not large, but it is nonetheless significant.

So Norman, the answer to your question is: IT DEPENDS! The
general-purpose algorithms such as LZW (ARC, compress, etc.) and
Huffman do a pretty good job in the general case, but you can do much
better if you can exploit some characteristic of your data set. KISS
principles apply fully here. Many times a very simple algorithm will
achieve a high fraction of the compression of more complicated
routines, but with a vastly smaller implementation and execution time.

So, take your compiler in one hand and an editor in the other and
EXPERIMENT!

John

-- 
John De Armond, WD4OQC               | "I can't drive 85!"
Sales Technologies, Inc. Atlanta, GA | Sammy Hagar driving
...!gatech!stiatl!john               | thru Atlanta!