Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!cs.utexas.edu!uwm.edu!csd4.csd.uwm.edu!info-high-audio-request
From: tonyb@juliet.ll.mit.edu
Newsgroups: rec.audio.high-end
Subject: Re: Data Compression
Message-ID: <12205@uwm.edu>
Date: 17 May 91 13:52:16 GMT
Sender: news@uwm.edu
Lines: 59
Approved: tjk@csd4.csd.uwm.edu
Originator: tjk@csd4.csd.uwm.edu

In article <12182@uwm.edu> sethb@fid.Morgan.COM (Seth Breidbart) writes:
>...
>Proof that most files cannot be compressed:
>There are 2**8000000 different files 1 megabyte long.  There are
>2**7999999-1 different files with lengths less than 1 megabyte-11 bytes.
>The ratio of these is approximately 0.001.  Therefore, under 0.1% of
>all 1 megabyte files can be compressed by at least 1 bytes (which is
>0.000011% compression).

I'm not going to argue with the arithmetic here, only the logic of
applying it to music.  Human activity (outside of the White House,
anyway) is *very* far from random.  Our puny brains couldn't possibly
process all the stimuli they receive if there were not strong repeated
elements in everything around us.

In other words, I think it's misleading to scare people by showing
them math that implies that a 1 hour CD with 600 megabytes or so of
data is coming from the state-space that includes every possible
combination of 300 million random 16 bit words.

Earlier in the discussion the issue of encode/decode 'horsepower'
came up.  If we want to assume *lots* of it (especially at the encode
end), I think it's reasonable to assume that redundancies in the data
could be found and exploited quite handily.

Let's start with something in between a sine-wave test signal and a
live orchestral performance: say, a solo piano piece played on a
MIDI-equipped keyboard.  The keyboard might have (without compression)
2 or so megabytes of samples, and a MIDI playback sequence for an
hour's worth of piano playing might be another 2 megabytes (just
guessing -- I bet it's smaller).
Add another couple of kilobytes for parameters to set up an ambience
processor, and you've just accomplished 150-to-1 data compression
without trying too hard at all!

Progressing to the "very high horsepower" realm, I can imagine a very
smart encoder listening to a *real* piano performance, sorting the
piano notes out from the hall ambience, and working backwards towards
the MIDI file I was just talking about.  Add a few cough and
program-wrinkling samples, and voila, there's your 150-to-1
compression again.  For that matter, using our currently available
lossless file compression techniques on the MIDI and sample files
would probably up this to 1000:1.

OK, so this is beyond our current state of the art, and it has nothing
to do with the compression algorithms that DCC will use.  But it does
have something to say about how non-random music is.

Back in the world of current technology, even adaptive delta
modulation schemes can deliver significant compression ratios on music
without loss (although apparently not significant enough to get things
down to DCC bandwidth).  That in itself should be a pretty clear
indicator that the sound we want to hear comes from well within that
set of ".1% of all files" that you mentioned.

All this said, I'd still rather have a lossless compression scheme for
DCC.

Tony Berke
(tonyb@juliet.ll.mit.edu)
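
The 150-to-1 and 1000:1 figures above can be sanity-checked with a few
lines of arithmetic.  What follows is only a rough sketch in Python
built on the post's own guesses, not on measurements.  Decimal
megabytes and reading "a couple of kilobytes" as 2 KB are assumptions,
and the variable and function names are made up for illustration.

```python
# Back-of-the-envelope check of the compression ratios guessed at above.
# Every size here is a rough guess taken from the post, not a measurement.

MB = 1_000_000  # assumption: decimal megabytes
KB = 1_000      # assumption: decimal kilobytes

cd_bytes = 600 * MB      # "600 megabytes or so" for a 1 hour CD
sample_bytes = 2 * MB    # keyboard sample set, "2 or so megabytes"
midi_bytes = 2 * MB      # hour-long MIDI sequence, "just guessing"
ambience_bytes = 2 * KB  # "a couple of kilobytes"; 2 KB is an assumption


def compression_ratio(original_bytes, encoded_bytes):
    """Return the original size divided by the encoded size."""
    return original_bytes / encoded_bytes


# Total size of the hypothetical samples + MIDI + ambience encoding.
encoded_bytes = sample_bytes + midi_bytes + ambience_bytes
ratio = compression_ratio(cd_bytes, encoded_bytes)

# Further lossless factor needed on the encoded files to reach 1000:1.
extra_factor = 1000 / ratio

print(f"samples + MIDI + ambience: about {ratio:.0f}:1")
print(f"extra lossless factor needed for 1000:1: about {extra_factor:.1f}x")
```

With these guesses the piano encoding comes out near 150:1, and a
general-purpose lossless compressor would have to shrink the MIDI and
sample files by a further factor of roughly 6.7 to reach 1000:1
overall.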