Path: utzoo!attcan!uunet!super!udel!gatech!bloom-beacon!think!ames!oliveb!amdahl!kevin
From: kevin@amdahl.uts.amdahl.com (Kevin Clague)
Newsgroups: comp.sys.amiga.tech
Subject: Re: 2 More Questions
Keywords: fft, regression, time-series analysis
Message-ID:
Date: 9 Sep 88 14:48:12 GMT
References: <8809071942.AA05026@cory.Berkeley.EDU> <39616@linus.UUCP>
Reply-To: kevin@amdahl.uts.amdahl.com (Kevin Clague)
Organization: Amdahl Corporation, Sunnyvale, CA 94086
Lines: 76

In article <39616@linus.UUCP> eachus@mitre-bedford.arpa (Robert I. Eachus) writes:
>In article <8809071942.AA05026@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>>>> QUESTION#2) Is it possible with software and hardware capabilities
>>>>             to digitize a song and have the computer convert the
>>>>             wavelengths into notes/sheetmusic and then print it
>>>>             out?
>
>     This would certainly be doable on an Amiga (not in real-time of
>course), and it would be a lot of fun (and a lot of work).  First
>digitize the signal, then do a sequential FFT with a sliding window
>128 samples long.  Use this to decide when new notes begin.  (Look
>every 16 samples or so to see if successive power spectra are
>proportional (same note increasing or decreasing) or not.)

One problem is that FFT bins are spaced linearly, while the musical
scale is logarithmic.  When you do your 128-sample FFT, the 64 elements
of the result (each with a real and an imaginary part) are spaced
equally in the frequency domain.  The half steps you are trying to
quantize this stuff into are each spaced by a factor of 2**(1/12)
(sorry, Fortran notation here).  Given this, your FFT must be much
longer to resolve the close spacing at the low end (55 Hz for low,
low, low A).

With the sliding window FFT, you must also take into account the fact
that any wave (even one cycle) will appear in the spectra for as long
as it lies anywhere inside the FFT window.  Thus things appear to be
on longer than they truly are.
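
To put rough numbers on the resolution problem, here is a minimal
sketch in Python.  The 8000 Hz sampling rate is purely an assumption
for illustration (nobody in the thread states one), the function names
are made up, and "one bin per half step" is only a bare-minimum
criterion; window leakage means a real analysis would likely want more.

```python
import math

# Frequency ratio between adjacent half steps: 2**(1/12)
SEMITONE = 2.0 ** (1.0 / 12.0)


def bin_spacing(sample_rate, n):
    """Spacing in Hz between adjacent bins of an n-point FFT."""
    return sample_rate / n


def semitone_gap(freq):
    """Distance in Hz from freq up to the next half step."""
    return freq * (SEMITONE - 1.0)


def min_fft_size(sample_rate, lowest_freq):
    """Smallest power-of-two FFT length whose bin spacing is no wider
    than the half-step gap at lowest_freq (a bare-minimum criterion)."""
    needed = sample_rate / semitone_gap(lowest_freq)
    return 1 << math.ceil(math.log2(needed))


if __name__ == "__main__":
    fs = 8000.0     # ASSUMED sampling rate in Hz, not from the thread
    low_a = 55.0    # the "low, low, low A" mentioned above

    print("128-point bin spacing (Hz):", bin_spacing(fs, 128))
    print("half-step gap at 55 Hz (Hz):", round(semitone_gap(low_a), 3))

    n = min_fft_size(fs, low_a)
    print("minimum power-of-two FFT length:", n)
    # The window duration is also how long a short event stays
    # visible in successive spectra.
    print("window duration (s):", n / fs)
```

Under that assumed rate, the 128-point bins come out tens of Hz apart
while the half steps near 55 Hz are only a few Hz apart, and the longer
window needed to separate them smears note onsets over a
correspondingly longer time.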
>
>     Now transform each sample interval determined above using a
>larger number of samples.  Determine the lowest fundamental frequency,
>and any notes within an octave above it.  Use templates (transforms of
>samples of a single instrument playing a single note) to do a linear
>regression best fit to match notes being played to the instruments
>playing them.  Analyze the residues for notes in the next highest
>octave and refit if necessary.

Oh, I see.  Use the first (cheap) FFT to get a rough feel for whether
there is anything there at all.  I still don't see how this helps you
with chords whose notes are spaced (in the half step domain) more than
one octave apart.  I guess you are presuming monophonic music.  It is
not so easy for polyphony.  What do you do for two trumpets, or an oboe
and a flute?  Even worse, how about a whole orchestra?  Theoretically
possible, eh?

I don't mean to stifle your creative juices, or have you stop
commenting.  (I've just thought about this issue a lot and don't have
many people to talk to about it.)  You seem to know some about FFTs.
How does the phase of two instruments playing the same note affect the
result?

>
>     Do any ADSR (attack, decay, sustain, release) analysis necessary
>to determine when each particular note ends, and you have all the
>information necessary to print the music (and to reconstruct it and
>subtract the power spectra) to make sure you got it right.
>
>                                       Robert I. Eachus

Well, there are other issues: time signature recognition, tempo and
tempo changes, putting in measure bars and rests, key signature
recognition and accidentals (this one is relatively easy), volume
markings (piano, forte) and crescendos.

On the surface it seems simple.  I wish it was.  I've been thinking
about this subject for many years.  It's enough for a hobby of a
lifetime.

kev
--
UUCP: kevin@amdahl.uts.amdahl.com
  or: {sun,decwrl,hplabs,pyramid,seismo,oliveb}!amdahl!kevin
DDD:  408-737-5481
USPS: Amdahl Corp.  M/S 249, 1250 E. Arques Av, Sunnyvale, CA 94086

[  Any thoughts or opinions which may or may not have been expressed  ]
[  herein are my own.  They are not necessarily those of my employer. ]