Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!casbah.acns.nwu.edu!ils.nwu.edu!sandell
From: sandell@ils.nwu.edu (Greg Sandell)
Newsgroups: comp.music
Subject: Re: Timbre Perception and Orchestration
Message-ID: <2139@anaxagoras.ils.nwu.edu>
Date: 17 Jun 91 18:11:31 GMT
References: <2118@anaxagoras.ils.nwu.edu> <1991Jun17.030934.499@cynic.wimsey.bc.ca>
Sender: news@ils.nwu.edu
Reply-To: sandell@ils.nwu.edu (Greg Sandell)
Organization: The Institute for the Learning Sciences
Lines: 150

In article <1991Jun17.030934.499@cynic.wimsey.bc.ca>, curt@cynic.wimsey.bc.ca (Curt Sampson) writes:
>
> One thing that I am struck by, however, is the rather unquantified
> nature of your descriptions of the timbre of various instruments.
> For example, you call a clarinet "dark" and an oboe "bright."

The posting itself never said anything about the clarinet being
"dark," but I get your point.

>
> Not having seen any of the details of your research, it could well
> be that you have quantified the timbres much better there than in
> your summary.

The scale that I used for brightness and darkness was "centroid."
Centroid summarizes the distribution of spectral energy in a complex
sound.  You calculate it by multiplying each component's frequency by
its amplitude, summing those products, and dividing the result by the
sum of the amplitudes alone.  The division step factors out the
overall amplitude and leaves a single frequency, the amplitude-weighted
mean, which identifies the balance point of the spectral energy.

Consider the following 4-harmonic spectra, each with a fundamental of
100 Hz.  The amplitude scale shown is linear.  The spectra are
identical except for the third harmonic, which is stronger in the
second spectrum (amplitude 8 rather than 4).  As a result, the second
spectrum's distribution of spectral energy is shifted slightly higher
in frequency.
8                             8         8
|                             |         |
|                             |         |
|    6                        |    6    |
|    |                        |    |    |
|    |                        |    |    |
|    |    4                   |    |    |
|    |    |                   |    |    |
|    |    |                   |    |    |
|    |    |    2              |    |    |    2
|    |    |    |              |    |    |    |
|    |    |    |              |    |    |    |
|____|____|____|___________   |____|____|____|___________
100  200  300  400            100  200  300  400

The first spectrum's centroid is calculated as:

    ((8*100)+(6*200)+(4*300)+(2*400)) / (8+6+4+2) = 4000/20 = 200 Hz.

The second spectrum, if you work it out, yields a higher centroid
(216.7 Hz).

This measure has been used with great success in perceptual
experiments on timbre; that is to say, listeners' ratings of how
bright or dark different timbres sound are frequently paralleled by
(correlated with) the centroids of those sounds.  Research showing
these results has been reported in Grey (1975), Grey & Gordon (1978)
and Wessel (1978).  I think that the first experiments to show its
perceptual significance were by Lichte (1941) and von Bismarck (1974).
Beauchamp (1982) provides the most explicit published definition of
centroid.  (Citations below.)

> different instruments often have very
> different distributions of the energy within those upper harmonics.
> Some instruments have a lot of energy concentrated in a few harmonics
> (such as the oboe--or so my ears tell me :-)) and some have their
> energy spread out more evenly over many harmonics (piano).  I suspect
> that these differing distributions would make quite a difference in the
> blending characteristics (and recognition characteristics, for that
> matter).

Right you are.  Centroid is a statistical convenience, but obviously
an impoverished representation of timbre.  I have experimented with
other ways of comparing spectra but haven't found any especially
effective ones yet.  One thing I haven't tried, which is suggested to
me by what you say here, is defining some upper frequency region and
taking *its* centroid.  I'll let you know what I learn.
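The centroid arithmetic above can be sketched in a few lines of
Python.  This is only an illustrative sketch, not the code used in the
study: the function name `spectral_centroid` and the `min_freq_hz`
cutoff are assumptions introduced here, the latter being just one
possible reading of the "upper frequency region" idea.

```python
def spectral_centroid(freqs_hz, amps, min_freq_hz=0.0):
    """Return the amplitude-weighted mean frequency of a spectrum.

    min_freq_hz is a hypothetical parameter (not from the post): only
    components at or above this frequency are included, which gives a
    centroid for an upper region of the spectrum.
    """
    # Keep only the (frequency, amplitude) pairs in the selected region.
    pairs = [(f, a) for f, a in zip(freqs_hz, amps) if f >= min_freq_hz]
    total_amp = sum(a for _, a in pairs)
    if total_amp == 0:
        raise ValueError("no spectral energy in the selected region")
    # Sum of frequency*amplitude, divided by the sum of amplitudes.
    return sum(f * a for f, a in pairs) / total_amp


freqs = [100, 200, 300, 400]                     # harmonics of 100 Hz
first = spectral_centroid(freqs, [8, 6, 4, 2])   # 4000 / 20
second = spectral_centroid(freqs, [8, 6, 8, 2])  # 5200 / 24
# Assumed cutoff of 300 Hz, chosen purely for illustration.
upper = spectral_centroid(freqs, [8, 6, 4, 2], min_freq_hz=300)

print(f"{first:.1f} Hz, {second:.1f} Hz")
```

This prints 200.0 Hz and 216.7 Hz, matching the values worked out by
hand.  With the assumed 300 Hz cutoff, the first spectrum's
upper-region centroid is 2000/6, or about 333.3 Hz.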
But there is another representation of spectrum which collapses it
down into three values (rather than one, as in centroid).  This is the
"tristimulus method" of Pollard & Jansson (1982).  They break up the
spectrum into: percentage of energy in the fundamental, percentage of
energy in harmonics 2-5, and percentage of energy in all harmonics
above 5.  I haven't yet found a great use for this measure, myself.

>
> Another thing to look at would be the amount and distribution of
> non-harmonic energy in an instrument's sound (the scraping of the bow,
> and the like).

In my study, I account for this by quantifying the amount of precedent
noise at attack time.  This turns out to be a pretty strong cue for
blend.  However, I found that a more general measure, the duration of
the attack time, matched the judgments more closely.

>
> This might lead to some interesting experiments with computer-generated
> tones of varying harmonic structure.  Synthesized tones created with a
> decent additive synthesizer would give you far more flexibility when
> testing blends of various kinds.

The "John Grey tones" that I used *were* additive synthesis
descriptions of the sound, by the way...that's what made it possible
for me to analyze higher-level acoustic properties such as harmonic
synchrony, inharmonicity, etc.

> It would also provide a good control
> in that one would expect that synthesized waveforms with characteristics
> similar to acoustic instruments would generate similar results when
> blended for listeners.  That is to say, if you have two waveforms with
> a concentrated peak in the upper harmonics and they don't blend, but
> two acoustic waveforms with a concentrated peak in the upper harmonics
> do blend, there's obviously something else we should be looking for as
> an important factor in blending.

Well, centroid comes in "first place" in my experiment, but of course
it's not the only acoustic factor.
It would not be hard to magnify the differences in attack
characteristics and envelopes enough to override what should be a
"good blend" from the perspective of spectral content.

> So perhaps you could do a few experiments in this area too.  It's only
> June, so I'm sure that you'll have plenty of time to research this
> whole area and fit that into a brief Appendix in your dissertation. :-)

If one of my committee members dies on me, you'll be the first person
I call.... :-)

Thanks for your response!

-- Greg Sandell
sandell@ils.nwu.edu

Here are the sources I cited:

Beauchamp, J.W. (1982).  Synthesis by spectral amplitude and
    'Brightness' matching of analyzed musical instrument tones.
    Journal of the Audio Engineering Society 30, 396-406.

Grey, J.M., & Gordon, J.W. (1978).  Perceptual effects of spectral
    modifications on musical timbres.  Journal of the Acoustical
    Society of America 63, 1493-1500.

Lichte, W.H. (1941).  Attributes of complex tones.  Journal of
    Experimental Psychology 28, 455-480.

Pollard, H.F., & Jansson, E.V. (1982).  A tristimulus method for the
    specification of musical timbre.  Acustica 51, 162-171.

von Bismarck, G. (1974a).  Timbre of steady sounds: a factorial
    investigation of its verbal attributes.  Acustica 30, 146.

Wessel, D.L. (1978).  Low dimensional control of musical timbre.
    Tech. Rept. 12, IRCAM, Paris.