Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!usc!apple!oliveb!amiga!jimm From: jimm@amiga.UUCP (Jim Mackraz) Newsgroups: comp.graphics Subject: Re: Laplacian Pyramid Image Compression? Message-ID: <5532@amiga.UUCP> Date: 10 Apr 90 20:20:28 GMT References: <8795@pt.cs.cmu.edu> Reply-To: jimm@superman.UUCP (Jim Mackraz) Organization: Commodore-Amiga Inc, Los Gatos CA Lines: 80 In article <8795@pt.cs.cmu.edu> mlm@nl.cs.cmu.edu (Michael L. Mauldin) writes: )Has anyone in UseNetLand implemented the Laplacian Pyramid image )compression algorithm? I believe it was developed at RCA -- I've toyed with it; sometime's it's called the Burt pyramid. I don't have the specific reference (sorry), but a paper appears in a book on "Multi-resolution image processing" or something, edited by Rosenthal, I think. yeesh. )The researchers were claiming 8 to 1 compression for greyscale images. It's not a compression routine, per se, rather a frequency band separation which has lower sample rates for lower frequency bands. The highest frequency band has the same number of elements as the original picture, but now they can be signed (!) so it's a priori larger. This band typically has values concentrated around 0, though, so a huffman scheme gets a big payoff. You mentioned this. Another potentially useful factor is that in compressing motion video, you can often degrade the high frequencies (detail) for moving objects, so the pyramid gives you a good handle on that. You might, for example, transmit the highest band only every few frames or something cheezy like that. For bringing a still image off of disk, it "looks recognizable most quickly" to bring up the lows first, and details later, which you mention. )Synopsis: )To encode an image, you repeat the following process: ) 1. Generate a 1/4 size image which is the 'smooth' of ) the full image by taking a Laplacian of the original. A "subsampled" version of the original after the low frequencies are filtered out. Is it really the Laplacian? I'm rusty. It's convolving with a Gaussian, or plainly taking a weighted average of the neighbors for "every other" point. ) 2. Blow up the 1/4 size image to full size (how was not ) clear), and then take the difference of the original ) and the enlarged, smooth image. Stick the values from step one on a lattice filled with zeros and average again. I think the result is not different from something you could have done in place, without the shrink-then-zoom steps, but you want the small version anyway, and the operations are supposed to be easy to implement in hardware this way (see the reference). I might be remembering poorly on the mechanics of this step. ) For regions of low detail, the difference image is ) mostly zeroes and small integers. *signed* integers. ) 3. Recursively endcode the 1/4 size Laplacian by repeating ) steps 1 & 2 on the smaller image, until you get down to ) a single pixel (or some trivially small image). Nah, just do it two or three times ... ;^) )You transmit an image by starting with the lowest resolution images, so )the image builds up in resolution. They showed this happening in color )for a real estate photo library on the show. I don't know for sure if this technique is part of the RCA/GE/Sarnoff/Intel DVI technology, but I think it came from the same people. There are other things you can do with a cheap and multi-rate sampled frequency band breakdown, such as object recognition, and so on. See the article. This stuff gives great demo (I can't distribute the demos I wrote, sorry). jimm -- -------------------------------------------------- - opinions by me "This voice console is a *must*. I press Execute. `Hello, I know that you've been feeling tired. I bring you love and deeper understanding.' " -lyrics by Kate Bush