Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!wuarchive!udel!haven.umd.edu!mimsy!leviathan.cs.umd.edu!ogata
From: ogata@leviathan.cs.umd.edu (Jefferson Ogata)
Newsgroups: comp.music
Subject: Re: Digitized Voices
Message-ID: <35107@mimsy.umd.edu>
Date: 31 May 91 16:31:53 GMT
References: <21576.9105302008@uk.ac.keele.seq1>
Sender: news@mimsy.umd.edu
Reply-To: ogata@leviathan.cs.umd.edu (Jefferson Ogata)
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 41

In article <21576.9105302008@uk.ac.keele.seq1> seq1!@seq1.keele.ac.uk (Rob Barth) writes:
|> Has anyone ever written a program that will convert text into speech ?
|>
|> I am thinking of writing something that will play digitized parts of
|> speech (i.e. the ~40 phonemes used in English), and follow the rules of
                                                                  ^^^^^^^^
|> English so that the words sound genuine.
   ^^^^^^^
What rules??

|> I have the program that will use the digitized data -
|> I am really looking for someone who has figured out all the rules, with
|> uncommon/ambiguous/exceptional words etc.

I've seen commercial software products for this...there was one for the
Mac a while back. It wasn't always right. In particular, vocal pitch
inflection is a difficult thing to model.

There is an old chipset for converting text to speech; Radio (S)Hack
used to sell it. There were a couple of commercial black boxes with
these chips inside.

This is not a trivial problem. To get many words right, you need to do
some parsing (e.g. permit, refuse). To get vocal inflection remotely
right, you need to do a lot of parsing.

As to phoneme sequences, it's much easier to just make a big dictionary
than to try to abstract phonological transformations for the English
language. You might be able to do it adequately if you restrict the
transformations to latinate words and use a dictionary for the rest.
But even the transformations for strictly latinate words are still
pretty complicated (e.g. sign, signature).
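
To make the dictionary-plus-parsing idea concrete, here is a rough sketch
of what the lookup might look like. Everything in it is an assumption for
illustration: the names LEXICON, HETERONYMS and pronounce are made up, and
the phoneme strings are rough ARPAbet-style transcriptions written by hand,
not taken from any real pronunciation dictionary.

```python
# Hypothetical mini-lexicon. Phoneme strings are rough, hand-written
# ARPAbet-style transcriptions (digits mark stress), for illustration only.
LEXICON = {
    "sign": "S AY1 N",
    "signature": "S IH1 G N AH0 CH ER0",
}

# Words whose pronunciation depends on part of speech. Something upstream
# (a parser, not shown here) has to supply the part-of-speech tag.
HETERONYMS = {
    "permit": {"noun": "P ER1 M IH0 T", "verb": "P ER0 M IH1 T"},
    "refuse": {"noun": "R EH1 F Y UW0 S", "verb": "R IH0 F Y UW1 Z"},
}


def pronounce(word, pos=None):
    """Return a phoneme string for word, or None if it is not listed.

    pos is a part-of-speech tag ("noun" or "verb") and is required only
    for entries in HETERONYMS.
    """
    key = word.lower()
    if key in HETERONYMS:
        variants = HETERONYMS[key]
        if pos not in variants:
            raise ValueError(f"need part of speech for {word!r}")
        return variants[pos]
    # None tells the caller to fall back to letter-to-sound rules.
    return LEXICON.get(key)


if __name__ == "__main__":
    for word, pos in [("permit", "noun"), ("permit", "verb"), ("sign", None)]:
        print(word, pos, pronounce(word, pos))
```

Note that sign and signature are simply separate entries here; the table
makes no attempt to derive one from the other, which is the part that
gets complicated if you try to do it by rule.
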
You definitely need a good syllabification rule to do any rule-based
phonology.

--
Jefferson Ogata
University of Maryland Computer Science Department
"Animals without backbones hid from each other or fell down.
Clamasaurs and oysterettes appeared as appetizers. Then came the
sponges, which sucked up about ten percent of all life."