Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!wuarchive!udel!haven.umd.edu!mimsy!leviathan.cs.umd.edu!ogata
From: ogata@leviathan.cs.umd.edu (Jefferson Ogata)
Newsgroups: comp.music
Subject: Re: Digitized Voices
Message-ID: <35107@mimsy.umd.edu>
Date: 31 May 91 16:31:53 GMT
References: <21576.9105302008@uk.ac.keele.seq1>
Sender: news@mimsy.umd.edu
Reply-To: ogata@leviathan.cs.umd.edu (Jefferson Ogata)
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 41

In article <21576.9105302008@uk.ac.keele.seq1> seq1!@seq1.keele.ac.uk (Rob Barth) writes:
|> Has anyone ever written a program that will convert text into speech ?
|> 
|> I am thinking of writing something that will play digitized parts of
|> speech (i.e. the ~40 phonemes used in English), and follow the rules of
                                                                  ^^^^^^^^
|> English so that the words sound genuine.
   ^^^^^^^

What rules??

|> I have the program that will use the digitized data - 
|> I am really looking for someone who has figured out all the rules, with
|> uncommon/ambiguous/exceptional words etc.

I've seen commercial software products for this; there was one for
the Mac a while back. It wasn't always right. In particular, vocal
pitch inflection is a difficult thing to model.

There is an old chipset for converting text to speech; Radio (S)Hack
used to sell it. There were a couple of commercial black boxes with
these chips inside.

This is not a trivial problem. To get many words right, you need to
do some parsing: "permit" and "refuse" are stressed differently as
nouns and as verbs. To get vocal inflection remotely right, you
need to do a lot of parsing.
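
To make the permit/refuse point concrete, here is a rough sketch in
Python. It is not a real parser: it guesses noun vs. verb from the
previous word only, and the table, the cue list, the respellings and
the function names are all made up for illustration.

```python
# Toy heteronym table: (word, part of speech) -> informal respelling.
# Capitals mark the stressed syllable. Entries are illustrative only.
HETERONYMS = {
    ("permit", "noun"): "PER-mit",
    ("permit", "verb"): "per-MIT",
    ("refuse", "noun"): "REF-yoos",
    ("refuse", "verb"): "ri-FYOOZ",
}

# Words that usually come right before a noun. This is a crude stand-in
# for real parsing and will be wrong on plenty of sentences.
NOUN_CUES = {"a", "an", "the", "this", "that", "my", "your", "our", "their"}


def guess_pos(prev_word):
    """Guess 'noun' after a determiner-like word, otherwise 'verb'."""
    if prev_word in NOUN_CUES:
        return "noun"
    return "verb"


def pronounce(sentence):
    """Return one entry per word; heteronyms get a stress-marked respelling."""
    out = []
    prev = None
    for token in sentence.lower().split():
        word = token.strip(".,;:!?")
        pos = guess_pos(prev)
        # Words not in the table are passed through unchanged.
        out.append(HETERONYMS.get((word, pos), word))
        prev = word
    return out


print(pronounce("They refuse to permit the refuse."))
```

Even this much needs context beyond the word itself, and a one-word
lookbehind breaks as soon as an adjective sits between the determiner
and the noun.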

As to phoneme sequences, it's much easier to just make a big
dictionary than to try to abstract phonological transformations
for the English language. You might be able to do it adequately
if you restrict the transformations to latinate words and use a
dictionary for the rest. But even the transformations for
strictly latinate words are still pretty complicated (e.g. the
"g" is silent in "sign" but pronounced in "signature"). You
definitely need a good syllabification rule to do any rule-based
phonology.
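
A minimal sketch of the dictionary-first approach, in Python. The
dictionary entries, the phoneme symbols and the letter-to-sound rules
are placeholders I made up for illustration; a usable system would need
a far larger dictionary and much better rules.

```python
# Exception dictionary: words the naive rules below would get wrong.
# Phoneme symbols are informal and only meant to be readable.
DICTIONARY = {
    "sign": "s ay n",
    "signature": "s ih g n ax ch er",
}

# Two-letter spellings checked before single letters.
DIGRAPHS = {"sh": "sh", "ch": "ch", "th": "th", "ee": "iy", "oo": "uw"}

# One crude sound per letter; English is nowhere near this regular.
LETTERS = {
    "a": "ae", "b": "b", "c": "k", "d": "d", "e": "eh", "f": "f",
    "g": "g", "h": "hh", "i": "ih", "j": "jh", "k": "k", "l": "l",
    "m": "m", "n": "n", "o": "aa", "p": "p", "q": "k", "r": "r",
    "s": "s", "t": "t", "u": "ah", "v": "v", "w": "w", "x": "k s",
    "y": "y", "z": "z",
}


def letter_to_sound(word):
    """Fallback: scan left to right, preferring digraphs over single letters."""
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
        else:
            # Characters with no rule (digits, punctuation) are skipped.
            if word[i] in LETTERS:
                phonemes.append(LETTERS[word[i]])
            i += 1
    return " ".join(phonemes)


def to_phonemes(word):
    """Look the word up first; use the rules only when it is missing."""
    word = word.lower()
    if word in DICTIONARY:
        return DICTIONARY[word]
    return letter_to_sound(word)


print(to_phonemes("sign"))       # taken from the dictionary
print(letter_to_sound("sign"))   # what the rules alone would produce
print(to_phonemes("ship"))       # not in the dictionary, so rules apply
```

The rules by themselves pronounce the "g" in "sign", which is exactly
the kind of case the dictionary is there to catch.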

--
Jefferson Ogata     University of Maryland      Computer Science Department
"Animals without backbones hid from each other or fell down. Clamasaurs and
 oysterettes appeared as appetizers. Then came the sponges, which sucked up
                     about ten percent of all life."