Xref: utzoo sci.lang:4529 comp.cog-eng:1119 sci.psychology:1881
Path: utzoo!attcan!uunet!cs.utexas.edu!husc6!ogccse!blake!uw-beaver!ssc-vax!bcsaic!rwojcik
From: rwojcik@bcsaic.UUCP (Rick Wojcik)
Newsgroups: sci.lang,comp.cog-eng,sci.psychology
Subject: Re: Regional accents (was: Spelling and Perceptual Mode)
Keywords: GB Shaw, orthography
Message-ID: <11725@bcsaic.UUCP>
Date: 15 May 89 19:25:08 GMT
References: <2763@puff.cs.wisc.edu> <60340@yale-celray.yale.UUCP> <3193@tank.uchicago.edu> <5832@cs.Buffalo.EDU>
Reply-To: rwojcik@bcsaic.UUCP (Rick Wojcik)
Distribution: na
Organization: Boeing Computer Services AI Center, Seattle
Lines: 33

In article <5832@cs.Buffalo.EDU> lammens@sunybcs.UUCP (Jo Lammens) writes:

>In fact there is no such thing as THE B.E. or A.E.  pronunciation, or
>even THE pronunciation for any one dialect. If one acoustically
>analyses people's pronunciation of the same words, the differences
>tend to be very large even among speakers of the same dialect. Yet no
>one really notices. That is one of the reasons that automatic speech
>recognition is so difficult: the identification problem for phonemes
>is a complex one, but you don't realize it until you start measuring
>things or try to build a system that does it. We humans can easily
>shift our perceptual category boundaries around without even being
>conscious about it.

If you make a technical distinction between speech recognition (i.e.
recognition without NLP*) and speech understanding (i.e. recognition with
NLP), then it is a bit easier to understand why speech recognition is not very
feasible across dialect boundaries.  Different dialects have different
phonemic representations for the same morphemes, and you can usually establish
phonemic correspondences only after you have done a lot of higher level
processing.  The problem is that it is very difficult to know whether you have
a phonetically motivated distortion of a single underlying sound or simply a
different phoneme.  Even within the same dialect, phonetic distortion
associated with varying styles and tempos can obscure the recognition of the
same underlying phonemic string.  It can also cause radically different
phonemic strings to have similar or identical surface phonetics.  For
example, the word 'cigar' can be pronounced casually in such a way that it is
identical with 'scar'.  This makes it virtually impossible to base speech
recognition on acoustic input alone.

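To make the 'cigar'/'scar' point concrete, here is a rough sketch in Python.
Everything in it is invented for illustration: the ASCII phoneme symbols, the
three-word lexicon, and the two casual-speech rules (unstressed vowel deletion,
and merger of a voiced stop with its voiceless partner after /s/) are my own
simplifications, not a serious phonological analysis.  It only shows that once
such rules apply, one surface string can map back to more than one word.

```python
from collections import defaultdict

# Toy phonemic lexicon (ASCII symbols invented for this sketch).
# Lowercase "i" marks an unstressed vowel; "A" is the stressed vowel.
LEXICON = {
    "cigar": ("s", "i", "g", "A", "r"),
    "scar": ("s", "k", "A", "r"),
    "star": ("s", "t", "A", "r"),
}

UNSTRESSED = {"i"}                       # vowels that casual speech may drop
DEVOICE = {"b": "p", "d": "t", "g": "k"}  # voiced stop -> voiceless partner


def casual(phonemes):
    """Apply two hypothetical casual-speech rules to a phonemic string."""
    # Rule 1 (assumed): delete unstressed vowels.
    out = [p for p in phonemes if p not in UNSTRESSED]
    # Rule 2 (assumed): a voiced stop directly after /s/ merges with
    # its voiceless partner.
    for i in range(1, len(out)):
        if out[i - 1] == "s" and out[i] in DEVOICE:
            out[i] = DEVOICE[out[i]]
    return tuple(out)


def surface_index(lexicon):
    """Map each possible surface form to the set of words that can yield it."""
    index = defaultdict(set)
    for word, phonemes in lexicon.items():
        index[tuple(phonemes)].add(word)   # careful pronunciation
        index[casual(phonemes)].add(word)  # casual pronunciation
    return index


def candidates(surface, lexicon=LEXICON):
    """Return the words compatible with a surface string, sorted by name."""
    return sorted(surface_index(lexicon).get(tuple(surface), set()))


if __name__ == "__main__":
    # The casual surface string is compatible with two different words.
    print(candidates(("s", "k", "A", "r")))
    # The careful pronunciation of 'cigar' is still unambiguous.
    print(candidates(("s", "i", "g", "A", "r")))
```

A recognizer that sees only the surface string has nothing to choose between
the two candidates; syntax, semantics, or context has to break the tie.
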
*NLP = natural language processing (syntactic, semantic, pragmatic)
-- 
Rick Wojcik   csnet:  rwojcik@atc.boeing.com	   
              uucp:   uw-beaver!bcsaic!rwojcik