Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!rpi!dali.cs.montana.edu!milton!wiml From: wiml@milton.u.washington.edu (William Lewis) Newsgroups: alt.sources.d Subject: Re: Compressing pictures (LONG) Summary: GIF, LZW, Huffman, images, MagicFormat, pet peeves, ... Message-ID: <6873@milton.u.washington.edu> Date: 30 Aug 90 08:00:46 GMT References: <2781@network.ucsd.edu> <1990Aug29.155925.17597@zorch.SF-Bay.ORG> Organization: University of Washington, Seattle Lines: 95 In article sean@ms.uky.edu (Sean Casey) writes: >Doesn't GIF allow a "plug-in" compressor? Surely it doesn't _require_ one >to use LZW... does it? Yes, it *does* require you to use LZW, with no provision that I know of for other compressions, at least in the GIF87a spec (GIF90 or whatever it was that was just released may be different, but I doubt it, and besides, if we don't support the old GIF decoders we may as well go with a completely new format.) This is one of my gripes with the GIF format, and one reason I liked the .PKX format better (it supported pixel aspect ratios too =8) ) except that it died an early and local death... Anyway ... I'd like to add my own thoughts to this discussion. I've been thinking about something like this for some time, though I haven't done any actual experimentation yet. A couple of points however ... (Keep in mind that everything in this post should be prefaced with "I think", "IMHO", or "What is everyone else's opinion on"; but as a rudimentary form of article compression I have collapsed all of those notices into this one.) * Taking differences is probably much, much better than XORing, since the point is to take advantage of the vertical continuity of the image. The problem is that it doubles the number of values the compressor will have to deal with -- from max->min to min->max, which is 2*(max-min) values. The obvious solution is to wrap around, ie, use modulo(max-min) arithmetic, except that this overlays some of the least frequently used values (the extremes) on the most frequently used ones (small values), making the huffman/arithmatic coder less efficient. I don't know whether this outweighs the advantage gained from the smaller number of values though. I guess that would be determined by experiment. Or, since it is a fairly trivial-to-code change, it could be determined individually for each image, based on that image's characteristics. This would only take extra effort in the compression stage, not decompression, which is what we want. * Arithmetic coding (or Huffman) is probably better than Lempel-Ziv. LZW compresses common strings in the data, which will be rare in images (such as ones with large numbers of colors or grayscales) which have much `noise'. Huffman or arithmetic coding compresses frequently- ocurring single sample values, which should be nicely grouped already because of the differencing in the previous step. For those less familiar with arithmetic coding: Arithmetic coding and Huffman coding are very similar. Both attempt to compress the data by representing more common values with shorter bitstrings. However, arithmetic coding has a number of advantages over Huffman; see below. * Pixel data are not the entire image; obviously, any format needs some sort of header, if only to include a magic number and image dimensions. Palette, title, pixel aspect ratio, output device of preference, etc., may also be useful. However, all of these should have some value which indicates "I don't know" to the decoder; pictures converted from .GIF wouldn't have a wellknown aspect ratio, for instance. The decoder could use the most convenient ratio or attempt to guess from the image's dimansions and its knowledge of popular scanners and displays. Or even (gasp!) ask the user. =8) * Since this format is intended to be used on the Usenet, which is not exactly a transparent medium (8 bit or otherwise...) it should probably have some uuencode-like format. Ideally this would be done similar to PBMPLUS' "packed" vs. "normal" formats, except "packed" would be the norm for storage and transmission on 8-bit-clean paths (like FTP). A different "magic number" in the header could let the decoder differentiate. It would be a good idea to take some hints from xxdecode and ABE (if that's the right program; I haven't really looked at either) as to error tolerance and detection. One of the problems in a.s.p. seems to be that signatures and the like in multi-part postings get interpreted by uudecode, inserting spurious data in the decoded file, which throws the LZW off; maybe the encoder could directly support splitting into parts (this gets more complex with every sentence ... =8) ). Arithmetic encoding would be able to handle this fairly easily by setting the output character set to some carefully selected subset of ASCII. * Arithmetic coding has been proved by someone to be "at least as good as" Huffman, and in many cases better, because output bitstreams don't have to be an integer number of bits. Also, adaptive arithmetic seems to be a lot easier to implement than adaptive Huffman, since you can just readjust the intervals. However, since the input data has already been somewhat processed, it might not be necessary to use adaptive compressing (although it would make the postings smaller, it would make the compressor bulkier, slower and more bug-prone...) Again, something for experiment to determine. ... looks like this has expanded into more than "a couple of ideas" ... comments, anyone? opinions? flames? I think I'll try experimenting with this; does anyone have a source of good large-color-palette images (I'd use a.s.p, but the other people in the lab would look askance =8) )? -- wiml@blake.acs.washington.edu Seattle, Washington | No sig under (William Lewis) | 47 41' 15" N 122 42' 58" W |||||||| construction