Xref: utzoo comp.sys.amiga.datacomm:141 alt.flame:27705
Path: utzoo!utgpu!cs.utexas.edu!sun-barr!ames!vsi1!zorch!xanthian
From: xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan)
Newsgroups: comp.sys.amiga.datacomm,alt.flame
Subject: Re: A more memory efficient compress
Message-ID: <1991Jan28.190108.13993@zorch.SF-Bay.ORG>
Date: 28 Jan 91 19:01:08 GMT
References: <20680@know.pws.bull.com>
Followup-To: comp.sys.amiga.datacomm
Organization: SF-Bay Public-Access Unix
Lines: 62

C506634@UMCVMB.MISSOURI.EDU (Eric Edwards) writes:

> Does such a beast exist? The current version of compress (4.0) still
> wants a big chunk of contiguous ram. On my system, the largest block
> available after I boot up and start a shell is 372k. This is still not
> enough!

> Surely if lharc and Zip can run under such conditions under the same
> conditions using the same compression algorithm, compress ought to be
> able to. Actually, I wouldn't really mind if compress took 550k to
> run, just so long as it doesn't have to be contiguous.

> So. Any pointers?

Well, I used to compress files under Unix and uncompress them on my
512K A1000 all the time; the secret is the "-b" flag. When compressing
_for_ a small memory system, or when compressing _on_ a small memory
system, use "-b14", "-b13", or even "-b12", until you get a size that
works.

I'm a bit fuzzy on this, but I think the ## in "-b##" is the maximum
number of bits in the codes compress writes, which makes it the power
of two size of the string table compress keeps: the table of strings it
already knows and can emit a short code for instead of copying them to
the output. That table is the big buffer. Obviously, the bigger the
table, the better the chance of finding a really long string match, and
so the better the potential compression. As a result, designing a _big_
buffer into compress is a Good Thing.

However, it turns out that even "-b12" is pretty efficient compared to
the default "-b16" on a Unix system or "-b14" in the Amiga
implementation I use.

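To put rough numbers on the "-b" settings above, here is a small
sketch. It assumes that ## is the maximum code width in bits, so the
table holds at most 2^## entries, and it assumes a cost of 6 bytes per
entry. That per-entry figure is a guess for illustration only, not
something measured from any particular compress build, and the function
names are made up for this example.

```python
# Rough estimate only: a real compress also needs hash-table slack,
# I/O buffers, and a decode stack, so actual figures will be higher.

def table_entries(bits):
    """Maximum number of codes (known strings) for a given -b value."""
    return 1 << bits


def table_bytes(bits, bytes_per_entry=6):
    """Approximate table size; bytes_per_entry is an assumed figure."""
    return table_entries(bits) * bytes_per_entry


if __name__ == "__main__":
    for bits in range(12, 17):
        kib = table_bytes(bits) / 1024
        print(f"-b{bits}: {table_entries(bits):6d} codes, about {kib:4.0f}K")
```

Each step down in "-b" halves the table, which is why dropping from
"-b16" to "-b14" or "-b12" makes such a difference on a small machine.
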
If you get a file compressed by someone _else_, your best bet is to
uncompress it and recompress it with "compress -b14" on your host site
before transferring the data down, just to be on the safe side.

By the way, you've mostly identified the problem you're having: the
Amiga memory is "hunky" from memory manager fragmentation, while a Unix
process gets its own clean 16M of virtual memory in which to allocate
its work buffers. Naturally compress, being a Unix utility designed for
speed, doesn't take into account that its big buffer might need to be
allocated as a linked list of smaller contiguous pieces, which would
slow down access and compression speed a lot. Better to use the "-b12"
flag than to rewrite compress to run more slowly.

Oh, yeah, the "-b" flag isn't needed to uncompress the data; the "-b"
value is recorded in the header of the compressed data file.

As to lharc and zip, I don't know whether they inherently use smaller
buffers than the compress default (probably, though, since both had
their origins within MS-DOS's 640K address space), though obviously
they have to use buffers at least as big as the ones on the system that
created the archive you are unpacking. It is alternatively possible
(much less likely) that they know how to do "hunky" buffers.

In general, your "using the same algorithm" ignores the fact that the
compress algorithm has a scaling factor controlled by the "-b" flag, and
so is really a family of algorithms with different buffer needs. That's
where the magic is that makes things work.

Kent, the man from xanth.
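
A footnote on the header remark above: if you want to check what "-b"
value a file was compressed with before downloading it, the sketch
below reads it from the first three bytes. It relies on the usual .Z
layout as I understand it (two magic bytes, 1F 9D hex, followed by one
flags byte whose low five bits hold the bit limit); the post itself
does not spell that layout out, and the function name
read_compress_header is made up for this example.

```python
def read_compress_header(data):
    """Return the -b value stored in the first 3 bytes of a .Z file."""
    if len(data) < 3 or data[0] != 0x1F or data[1] != 0x9D:
        raise ValueError("not a compress (.Z) file")
    flags = data[2]
    return {
        "max_bits": flags & 0x1F,          # the "-b" value used to compress
        "block_mode": bool(flags & 0x80),  # set by compress 3.0 and later
    }


if __name__ == "__main__":
    import sys

    # Usage: python zheader.py somefile.Z  (the script name is arbitrary)
    with open(sys.argv[1], "rb") as f:
        print(read_compress_header(f.read(3)))
```

If max_bits comes back higher than your machine can handle, that is the
cue to uncompress and recompress with a smaller "-b" on the host first.
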