Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!sdd.hp.com!mips!prls!pyramid!octopus!vsi1!zorch!xanthian From: xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) Newsgroups: alt.sources.d Subject: Re: Compressing alt.sex.pictures files Message-ID: <1990Aug30.090711.13185@zorch.SF-Bay.ORG> Date: 30 Aug 90 09:07:11 GMT References: <1990Aug27.154419.25882@mcs.anl.gov> <1990Aug28.192024.22435@zorch.SF-Bay.ORG> <6467@sugar.hackercorp.com> Organization: SF Bay Public-Access Unix Lines: 36 peter@sugar.hackercorp.com (Peter da Silva) writes: > xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes: >> Hmmm. Good idea. Here's one suggestion for an experiment. Somewhere on >> the net, a month or two back, there was a suggestion that a good way to >> compress realistic images was to XOR scan lines after the first with their >> predecessor line, then LZW compress the output. > >I believe that XORing alternate scan-lines and run-length-encoding the result >would do a better job. I've heard of this algorithm being pretty effective >in past discussions. Should one do this with the image bitplane-mapped or >pixel-mapped? Not to suggest not running the experiment, but run length encoding (RLE) suffers from the same problem as LZ compression: the output data of the XOR is "speckley", with (sparse, one hopes) one bits scattered essentially at random in a long string of zero bits. This is not the kind of data on which RLE does well: long strings of one pattern followed by long strings of another. Comparitively, (taking 8 bit pixels as an easy example), the number of byte bit patterns with zero or one one-bit set is 9/256ths of the total patterns. If a large part of the output bytes of the XOR fall in this set, then this will strongly bias the "histogram" of byte frequencies, the right kind of data to make arithmetic encoding highly effective. To address your other issue, in realistic picture image data, I would expect pixel level RLE to be the better choice for the original image [I have _no_ concept of what would work better on the XOR output, and that might be worth trying in and of itself], since the count overhead is about the same but is paid to encode less data at the bitplane level. Runs at the bitplane level _probably_ correspond to runs at the pixel level, except occasionally by accident, so you encode about the same length run at either level, you just pay more overhead at the bitplane level. Kent, the man from xanth.