Xref: utzoo comp.sys.mac:25041 comp.cog-eng:876 sci.lang:3914
Path: utzoo!attcan!uunet!lll-winken!ames!mailrus!tut.cis.ohio-state.edu!bloom-beacon!bu-cs!bucasb!merrill
From: merrill@bucasb (John Merrill)
Newsgroups: comp.sys.mac,comp.cog-eng,sci.lang
Subject: Re: Why are there no Speech Recognition products for the Mac??
Keywords: Voice Recognition, Voice Synthesis, Speech, Voice Response
Message-ID: <600634966.15179@bucasb.bu.edu>
Date: 12 Jan 89 19:02:46 GMT
References: <2972@uhccux.uhcc.hawaii.edu> <1029@ditsyda.oz>
Reply-To: merrill@bucasb (John Merrill)
Followup-To: comp.sys.mac
Organization: Boston University Center for Adaptive Systems
Lines: 72
In-reply-to: vincent@ditsyda.oz (David A. Vincent)

In article <1029@ditsyda.oz>, vincent@ditsyda (David A. Vincent) writes:
>
>
>in article <2972@uhccux.uhcc.hawaii.edu>, pam@uhccux.uhcc.hawaii.edu (.) says:
>> Xref: ditsyda comp.sys.mac:17740 comp.cog-eng:609 sci.lang:22
>>
>>
>> In article <6890> pardo@cs.washington.edu (David Keppel) writes:
>> | >>>Good speech recognition hardware can't be more than 5 or 10
> ****
>> | >>>years away, can it?
>> | >>>-Peter Schachte
>> -- It's here!
>
>No, it is not here.

No, indeed, it is not here. It is still a long ways away, in fact.

Fact: There is *one* (or maybe two) speaker independent, continuous
speech recognition system *in existence*. There are no commercial
systems extant. The one system, K-F Lee's SPHINX system, runs on
"several SUN-4's with floating point coprocessor boards"...and ties
them all down. Furthermore, although it is not an isolated word
system, it can only handle a finite vocabulary in a *very* limited
grammar.

I have seen the Lincoln Labs derivative of SPHINX in operation. It's
only about 50X real-time, and it isn't bad...if you're running in an
absolutely silent room. But it most certainly *isn't* continuous,
speaker-independent recognition.

(But let me say one thing. SPHINX is a major advance in the design
and construction of speech recognizers.
No, it ain't perfect...but it's orders of magnitude better than
anything that came before. I was absolutely astounded when it was
announced; it's so much better than anything else around. Since I've
seen it (or, rather, something very much like it), I'm even more
impressed. I just can't convey how much of an advance it was over the
older systems.)

>Also, I doubt that the so-called 'speaker independent' systems
>mentioned above will really recognize *anybody's* voice. What about
>people speaking with strong accents? Or in perfect 'english' but over
>background noise?

I haven't seen any of the new generation of recognizers working with
accented English, but the one I have seen can deal with a variety of
speaking tempi and conditions (yelling, noise-in-ears, deafened,
etc.). As I said before, it didn't deal well with noisy environments.
On the other hand, there is an accumulating body of evidence that
problems with background noise can be ameliorated by the use of
non-standard representations of the input stream, some of which appear
to be better able to extract signal from background.

>> *** So where are the Voice Recognition systems for the Mac??? ***
>
>Yes, where? But, by the way, what is voice (as opposed to speech)
>recognition? (Or is there no difference? In normal discussion,
>'voice' is rarely interchangable with 'speech'.)

There is a difference. Voice recognition is talker identification (at
least, in my jargon). It's much easier than speech recognition. (You
can replace speaker dependence with text dependence, and then identify
the speaker that spoke your fixed text, as opposed to identifying the
text spoken by your fixed speaker.)

--
John Merrill                | ARPA: merrill@bucasb.bu.edu
Center for Adaptive Systems |
111 Cummington Street       |
Boston, Mass. 02215         | Phone: (617) 353-5765