Path: utzoo!attcan!uunet!ginosko!usc!apple!wass
From: wass@Apple.COM (Steve Wasserman)
Newsgroups: comp.dsp
Subject: Re: Adjust-Speed CD player?
Message-ID: <4320@internal.Apple.COM>
Date: 22 Sep 89 22:02:43 GMT
References: <61860@tut.cis.ohio-state.edu> <4653@orca.WV.TEK.COM>
Organization: Apple Computer Inc, Cupertino, CA
Lines: 110

In article <4653@orca.WV.TEK.COM> mhorne%ka7axd.wv.tek.com@relay.cs.net writes:
>> I know nothing about DSP other than what I've figured out

... much stuff deleted ...

>source, however, I suggest interpolating between the samples by simply
>convolving the data stream with a sinc function.
>

There are two separate problems to be solved here. First: if you spin
the CD faster than usual, what do you do with the extra samples? For
example: if a CD is sped up so that 52.9 ksamples/second are read (a
speed increase of 20%, i.e. a ratio of 6/5), 8.8 "extra" ksamples
accumulate every second.

I am assuming, of course, that the sound will be reconstructed by
circuitry which operates at a constant 44.1 kHz (or some oversampling
multiple thereof, I suppose). I make this assumption because it would
be difficult to construct a variable analog reconstruction filter able
to handle a large range of possible sampling speeds (say plus or minus
five times the original sampling frequency). This problem is called
"sample rate conversion" or something similar in textbooks.

I don't think that it can be said that any one sample is "more
important" than any other sample because it is a local minimum or
maximum. In fact, the suggested method of never dropping these samples
would introduce random noise into the signal.

In general, it is easy to convert between two sampling rates that are
rational multiples of each other (hence, I chose 6/5 in my example).
The first step is to interpolate the signal by the numerator, that is,
to convert it to a sampling rate six times the original in my example.
This is simply done by inserting five zeros after every sample and then
applying a digital filter. Zero padding has the effect of replicating
the original spectrum a number of times. A filter is used to remove the
unwanted copies of the original spectrum. Sorry I can't think of a good
way to draw spectra using text only, but diagrams would be helpful here.

The next step is to filter out all (or most) spectral energy which
would be "aliased" when throwing away the unneeded samples. This
involves applying another filter to quiet the components above the
Nyquist frequency (half the sampling rate) of the signal after the
extras are thrown out. After this has been done, four of every five
samples can be safely thrown out without distorting the signal (always
the same four out of the five).

In practice, the two filters can be combined so the procedure is:
zero-pad, filter, and then throw away the unneeded samples. People have
found more clever ways of doing this in some circumstances, but in
theory, this way is as good as any. Obviously, if you want to change
the sampling rate by 7724/137, you have a problem.

>All this said, I don't think this is the optimal method for tone shifting,
>however it might work for `fast/slow-forward' effects. If you wish to shift
>the tones while retaining the same sample rate, I would suggest some sort of
>frequency scaling algorithm, perhaps by doing a digital mix with a reference
>(digital) carrier (i.e. ref = 100 Hz for a 100 Hz shift upward in frequency),

The second problem is: once you've thrown away the right number of
samples, how do you make the pitch sound right? Note that a mere
frequency translation by digitally mixing in a reference frequency is
not exactly what's required to make everything right again. Spinning
the CD faster EXPANDS the spectrum of the original sound in frequency
-- it doesn't just shift it.
(unless, of course, you are looking on a log scale :-)

To prove this to yourself, imagine a recording of two notes: concert A
(440 Hz) and the A one octave above it (880 Hz). When we increase the
CD speed by 20%, these two frequencies are changed to 528 and 1056 Hz.
Assume that we've thrown away the proper number of samples from the
original recording. Now, if we mix the resultant signal with an 88 Hz
signal (528 - 440 = 88) and do the proper filtering, we'll get 440 Hz
and 968 Hz ... oops, they are no longer an octave apart.

Theoretically, what needs to be done is to compress the spectrum of the
sped-up sound down to its original size. This can be done by applying
the previously discussed interpolation/decimation method to the FFT
samples of the signal (I think) and then inverse transforming and
playing the signal out at the original sampling rate. I'm sure that
somebody has come up with a computationally superior method to the one
I have suggested.

(note: invert this discussion if you want to talk about slowing a
recording down.)

>> Also, if you're going to remove samples, I think you
>> shouldn't use a simple kill-every-nth-sample procedure...
>
>If you want to throw away samples, you *really* need to filter the data before
>doing so, otherwise you will see (hear) aliasing of the data, depending upon
>the spectra of the input and how often you are throwing away samples. When
>you decimate any sampled data set, you must low-pass filter the data at half
>the new sample rate (Nyquist rule) unless you are sure that the data has
>no spectral components above half the new sample rate.
>
>followed by a carrier and lower sideband suppression (Hilbert transform filters
>are very easy to implement digitally). At a fast glance, I think this might
>work well for moving the spectra of an audio source up/down some arbitrary
>frequency, and should be doable with some of the common DSP chips currently
>available.
>
>Mike Horne
>Visual Systems Group
>Tektronix, Inc.
>mhorne@ka7axd.wv.tek.com

--
swass@apple.com
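
The zero-pad, filter, and throw-away procedure described in the post can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not a tuned implementation: the function names, the Hamming-windowed sinc filter design, and the `taps_per_phase` parameter are all assumptions made for this sketch, since the post does not say how the filter should be designed.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass FIR (an assumed design choice).

    cutoff is in cycles per sample, 0 < cutoff < 0.5.
    """
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        # Ideal low-pass impulse response, with the t == 0 limit handled.
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]  # normalize to unity gain at DC

def resample_rational(x, up, down, taps_per_phase=16):
    """Change the sample rate of x by the ratio up/down."""
    # Step 1: zero-pad, inserting (up - 1) zeros after every input sample.
    padded = []
    for s in x:
        padded.append(s)
        padded.extend([0.0] * (up - 1))

    # Step 2: a single combined filter. Its cutoff is the lower of the
    # original and the final Nyquist frequency, expressed at the padded
    # rate. The gain of `up` restores the amplitude lost to zero padding.
    cutoff = 0.5 / max(up, down)
    num_taps = 2 * taps_per_phase * max(up, down) + 1
    h = [up * c for c in lowpass_taps(num_taps, cutoff)]
    delay = (num_taps - 1) // 2  # group delay of the symmetric filter

    # Step 3: keep every `down`-th filtered sample. Only the kept
    # outputs are computed; the discarded ones are never evaluated.
    out = []
    for m in range(0, len(padded), down):
        center = m + delay  # compensate for the filter delay
        acc = 0.0
        for k, hk in enumerate(h):
            i = center - k
            if 0 <= i < len(padded):
                acc += hk * padded[i]
        out.append(acc)
    return out
```

For instance, `resample_rational(x, 6, 5)` yields six output samples for every five input samples, and swapping the two arguments gives the inverse ratio. The first and last few output samples are affected by the edges of the input and should not be trusted.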