Path: utzoo!attcan!uunet!portal!cup.portal.com!doug-merritt
From: doug-merritt@cup.portal.com
Newsgroups: comp.sys.amiga
Subject: Re: Sampling at 29KHz (long)
Message-ID: <5637@cup.portal.com>
Date: 19 May 88 18:22:14 GMT
References: <2845@polya.STANFORD.EDU> <734@eos.UUCP> <53788@sun.uucp>
Organization: The Portal System (TM)
Lines: 180
XPortal-User-Id: 1.1001.4407

A number of questions have been raised about sampled sound by several
people, and the following posting is long because it answers all of the
ones that weren't answered previously, as well as correcting some errors
in statements by previous posters. If you don't care about high quality
sound generation, don't read it.

Since Tom got a mild flame for saying he was not an expert, I guess I
should say that I *am* an expert, as far as the following goes. Trust
me. 1/2 :-)

Tom Rokicki wrote:
> Now let's sample our 10KHz sine wave at 20.2KHz.  We
>are now slightly off our frequency, and we will see a 10KHz
>tone modulated by a 200 Hz carrier; this will sound like two
>narrowly separated frequencies beating against one another.
>Ugly as sin.

The beat frequencies are the sum and difference of the sampling
frequency and the highest frequency component in the sample. There will
always be a beat frequency; the question is what to do with it. If you
have the lowpass filter turned on (it is always on in the 1000, btw),
then you aim to put the beat frequency above the range of the lowpass
filter (see pg 156, Fig 5-7 of the Hardware Reference Manual).
Otherwise:

There are two ways to remove aliasing of the sampling frequency. One is
to remove it via an add-on hardware filter (a 200 Hz notch filter, for
instance). The other is to transform the energy of the beat frequency
into random noise, which is far less annoying to the ear, and
distributes the energy across the entire spectrum, so that any given
component of the noise will be very low energy.
This is actually the *preferred* method (in the absence of a low pass
filter), according to the best minds in sampling theory (not me; but
I've been to talks on the subject). It's better than a notch filter
because you don't lose the original signal at the notch frequency.

The way that you transform the alias into noise is to randomly vary the
sampling frequency (actually ideally it should follow a Poisson
distribution, but random works pretty well). In other words, instead of
sampling exactly every N microseconds, you sample every N+random()
microseconds, where 0 <= random() <= N. Thus you still get an
alias/beat, but its frequency varies randomly on each sample.

(BTW the equivalent method for graphics is to sample on a randomly
disturbed grid, and Pixar uses this method for improving ray tracing.
The human eye does the same thing: outside the fovea, the rods are
placed by a Poisson distribution, to eliminate aliasing.)

As far as I can tell from a quick glance at the Hardware Reference
Manual, this is perfectly feasible if you use Direct (Non-DMA) output,
with a maximum resolution of 280 nanoseconds. See page 161, The Audio
State Machine.

As far as DMA is concerned, intuitively you would think you're stuck
with aliasing due to a fixed period, but you *might* be able to pull the
same trick if you period-modulate one channel with random data in
another. See pg 151, Table 5-5.

Tom again:
> Oh, so you want to know how to create the smaller samples from
>the larger ones.  I'll leave that to the experts out there.

It's easy...you just do a (weighted) average down. If you go from 256 to
32 samples then you just average each 8 adjacent samples (weight of 1
since 256 = 8*32) in the source into one sample in the destination. If
you wanted to go from 256 to say 49 samples, then you'd average each
5.224 samples (5.224 = 256/49) together.
In other words, the first destination sample equals the sum of the first
five source samples, plus 0.224 times the sixth source sample, divided
by 5.224. This leaves you with 0.776 worth of the sixth sample left
over, so for the second destination sample you take that, plus the next
4 samples, plus 0.448 (= 5.224-4-0.776) times the sample after those
(the eleventh source sample), all divided by 5.224. And continue.

The reason this works is that averaging is the time domain equivalent of
a frequency domain low pass filter.

Now as to Phil's posting:

Phil Stone then posts to critique Tom's perfectly good article, in
particular about its length and unrealistic examples. Just for the
record, Phil's article was almost as long as Tom's (66 vs 84 lines), and
used even less realistic examples, because he concentrated on what he
wanted to see added to the system, where everyone else has been talking
mostly about what *is*. People who live in glass houses...

Phil also writes:
>Compute a 256-point sine wave and put it in memory (32-byte sine waves
>don't cut it in my book - even a tin ear can hear the interpolation
>noise in fixed, jagged steps that big).  With a fixed sample-playback
>increment and a maximum rate of 29 KHz, the highest frequency you can
>generate is 29000/256 = 113 Hz! - just about TWO OCTAVES below A440(!)

That's why Tom (or was it Chuck?) talked about 32 byte samples...they
were being realistic about the current hardware, you see. Besides, what
you're talking about has to do with accuracy of reproduction of the
higher frequency harmonics needed to synthesize, say, a musical
instrument's timbre, not "the highest frequency you can generate". The
highest frequency component is strictly a function of sample rate. What
you're talking about is the highest frequency *fundamental*. And 113 Hz
doesn't give much range, now does it?
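
For anyone who wants to try the weighted average-down described earlier
(256 samples to 32, or to 49), here is a minimal sketch in Python. It
is an illustration only, not code from any Amiga library; the name
average_down and the use of floating point are my own assumptions. Each
destination sample covers len(src)/n_out source samples, and a source
sample that straddles a boundary is split between its two neighbours in
proportion to the overlap.

```python
def average_down(src, n_out):
    """Shrink src to n_out samples by weighted averaging (hypothetical helper)."""
    if not 0 < n_out <= len(src):
        raise ValueError("n_out must be between 1 and len(src)")
    ratio = len(src) / n_out              # e.g. 256/49 = 5.224...
    out = []
    for k in range(n_out):
        # Destination sample k covers the source interval [start, end).
        start, end = k * ratio, (k + 1) * ratio
        total = 0.0
        i = int(start)
        while i < len(src) and i < end:
            # Source sample i covers [i, i+1); weight it by how much of
            # that interval falls inside [start, end).
            weight = min(i + 1, end) - max(i, start)
            total += weight * src[i]
            i += 1
        out.append(total / ratio)
    return out
```

With 256 source samples and n_out = 32 the ratio is exactly 8, so each
output is a plain mean of 8 neighbours. With n_out = 49 the first output
is five whole samples plus about 0.224 of the sixth, all divided by
about 5.224, matching the arithmetic in the text.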

Nonetheless I'd have to agree that it would be nice if you could sample
faster so as to raise the frequency of the highest achievable
fundamental, while still accurately reproducing timbre. But note that
you *have* the following feature:

>With this same sine wave and a variable (intergral.fractional) sampling
>increment, one could generate a maximum frequency of 14 KHz!

My guess is that you can do this already, as I discussed above.

>Your postscript tosses off a reference to *making* the other octaves of
>sound - this is called interpolation, a process which has earned many Ph.D's
>over the last ten years.  Try interpolating 4 octaves from one sample *live*
>sometime (can you say "Cray killer?")

Gross exaggeration. The papers on the topic of interpolating down from
larger to smaller waveforms were published a lot more than 10 years ago,
and a 68000 with a math chip is perfectly capable of keeping up with the
demands. A Cray could do a full orchestra in real time.

Then Tom posts again:
>Yes, but because of the integral.fractional nature of your incrementing,
>you are guaranteed to introduce some amount of garbage into your
>sound; all your peaks won't peak at the same point, etc.  Although
>this effect probably won't become really bothersome until you are up
>to about 1/8 of the maximum frequency or so.

Quite accurate, but there will ALWAYS be some garbage of SOME sort
introduced. The integral.fractional notion does a better job of
minimizing it than if you just truncate!!!

The main problem introduced by that method is that you can no longer
have a fixed length sample that is exactly one period long, which
nominally forces you to use a non-DMA method. Except that 1) Phil was
making a wish list for something new, and you could conceivably add
hardware features to restart the sample at the appropriate point beyond
the beginning, and 2) my suggestion about using a DMA channel for period
modulation could probably fix this, too, if that in fact works.
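
To make the integral.fractional increment concrete, here is a minimal
software sketch in Python. It says nothing about how the Amiga hardware
would do it; the 16.16 fixed-point format and the names FRAC_BITS,
make_sine_table and play are assumptions chosen for illustration. The
accumulator keeps the fractional part of the position from sample to
sample, and only the table index used for each lookup is an integer.

```python
import math

FRAC_BITS = 16   # assumed 16.16 fixed point: integral.fractional

def make_sine_table(size=256, amplitude=127):
    """One period of a sine wave as signed integer samples."""
    return [round(amplitude * math.sin(2 * math.pi * i / size))
            for i in range(size)]

def play(table, freq_hz, rate_hz, n_samples):
    """Step through one stored period at freq_hz, output rate rate_hz."""
    size = len(table)
    # Table entries to advance per output sample, in fixed point.
    # 29000/256 Hz gives exactly 1.0; 440 Hz at 29 KHz gives about 3.88.
    step = round(freq_hz * size / rate_hz * (1 << FRAC_BITS))
    phase = 0
    out = []
    for _ in range(n_samples):
        out.append(table[phase >> FRAC_BITS])          # integer part indexes
        phase = (phase + step) % (size << FRAC_BITS)   # fraction carries over
    return out
```

With a 256-point table at 29 KHz, an increment of 1.0 gives Phil's
113 Hz, and anything up to just under 128.0 (half the table per output
sample) stays below half the sample rate. Because the increment is
rarely a whole number, successive passes through the table land on
different entries, which is the "peaks won't peak at the same point"
effect Tom describes.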

>But it only needs be done once for each sample . . . even a full FFT,
>filter, and reverse FFT doesn't take long for a 256 byte sample.  Not
>real time, true, but for synthesis?  Faster than you can figure parameters
>for the sample.

I guess everyone is assuming you need to do a full FFT; that's why they
have problems with speed. But the type of filtering required doesn't
need a full FFT; it's just a question of averaging.

>Well, I'm trying to learn myself, so I presented what I think I
>understand to be the case, and I hope to be corrected where I am
>wrong.

I hope Phil and Chuck feel the same way, or I'm gonna get flamed! :-)

And finally, Chuck McManis writes:
> The Amigas low pass filters start cutting of
>frequencies above 7Khz and pretty much eliminate everything above
>14Khz.

That should be 5Khz and 7Khz, respectively. See pg 154-157, Aliasing
Distortion.

>That means that even your golden ear may have difficulty
>in hearing the differences. (You will always get .4% distortion
>because that is the monotonic difference between to 'points' in
>space.)

Exactly. Except that again, this is where the trick of adding a random
jitter to the sampling frequency wins, because it distributes the
distortion throughout the audio range, rather than consistently creating
the same error on every tone.

Wow, made it all the way to the end, did you? Must be real interested in
the topic!

	Doug
---
      Doug Merritt        ucbvax!sun.com!cup.portal.com!doug-merritt
                      or  ucbvax!eris!doug (doug@eris.berkeley.edu)
                      or  ucbvax!unisoft!certes!doug