Xref: utzoo sci.lang:4529 comp.cog-eng:1119 sci.psychology:1881
Path: utzoo!attcan!uunet!cs.utexas.edu!husc6!ogccse!blake!uw-beaver!ssc-vax!bcsaic!rwojcik
From: rwojcik@bcsaic.UUCP (Rick Wojcik)
Newsgroups: sci.lang,comp.cog-eng,sci.psychology
Subject: Re: Regional accents (was: Spelling and Perceptual Mode)
Keywords: GB Shaw, orthography
Message-ID: <11725@bcsaic.UUCP>
Date: 15 May 89 19:25:08 GMT
References: <2763@puff.cs.wisc.edu> <60340@yale-celray.yale.UUCP> <3193@tank.uchicago.edu> <5832@cs.Buffalo.EDU>
Reply-To: rwojcik@bcsaic.UUCP (Rick Wojcik)
Distribution: na
Organization: Boeing Computer Services AI Center, Seattle
Lines: 33

In article <5832@cs.Buffalo.EDU> lammens@sunybcs.UUCP (Jo Lammens) writes:
>In fact there is no such thing as THE B.E. or A.E. pronunciation, or
>even THE pronunciation for any one dialect. If one acoustically
>analyses people's pronunciation of the same words, the differences
>tend to be very large even among speakers of the same dialect. Yet no
>one really notices. That is one of the reasons that automatic speech
>recognition is so difficult: the identification problem for phonemes
>is a complex one, but you don't realize it until you start measuring
>things or try to build a system that does it. We humans can easily
>shift our perceptual category boundaries around without even being
>conscious about it.

If you make a technical distinction between speech recognition (i.e.
recognition without NLP*) and speech understanding (i.e. recognition
with NLP), then it is a bit easier to understand why speech recognition
is not very feasible across dialect boundaries. Different dialects have
different phonemic representations for the same morphemes, and you can
usually establish phonemic correspondences only after you have done a
lot of higher level processing. The problem is that it is very
difficult to know when you have a phonetically motivated distortion of
a single underlying sound, or simply a different phoneme.
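To make the dialect point concrete, here is a toy sketch and nothing more. The three-word lexicon, the dialect labels "RP" and "GA_merged", the rough SAMPA-like ASCII transcriptions, and the helper name candidates() are all assumptions made up for illustration; they are not taken from any real recognizer. The sketch only shows that a phonemic string picks out different words depending on which dialect's lexicon you look it up in.

```python
# Toy lexicon: dialect -> {word: rough phonemic transcription}.
# Transcriptions are simplified, SAMPA-like ASCII and are illustrative only.
LEXICON = {
    # Non-rhotic British: 'caught' and 'court' fall together, 'cot' is distinct.
    "RP": {"cot": "kQt", "caught": "kO:t", "court": "kO:t"},
    # American with the cot/caught merger: 'cot' and 'caught' fall together.
    "GA_merged": {"cot": "kAt", "caught": "kAt", "court": "kOrt"},
}


def candidates(transcription, dialect=None):
    """Return sorted (dialect, word) pairs whose form matches the string.

    If no dialect is given, every dialect in the toy lexicon is searched.
    """
    dialects = [dialect] if dialect else list(LEXICON)
    found = set()
    for d in dialects:
        for word, form in LEXICON[d].items():
            if form == transcription:
                found.add((d, word))
    return sorted(found)


if __name__ == "__main__":
    print(candidates("kO:t", "RP"))
    print(candidates("kAt", "GA_merged"))
    print(candidates("kAt", "RP"))
```

Even in this tiny case the lookup cannot choose between 'caught' and 'court', or between 'cot' and 'caught', from the phonemic string alone; something beyond the segment string has to decide.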

Even within the same dialect, phonetic distortion associated with
varying styles and tempos can obscure the recognition of the same
underlying phonemic string. It can also cause radically different
phonemic strings to have similar or identical surface phonetics. For
example, the word 'cigar' can be pronounced casually in such a way that
it is identical with 'scar'. This makes it virtually impossible to base
speech recognition on acoustic input alone.

*NLP = natural language processing (syntactic, semantic, pragmatic)
-- 
Rick Wojcik
csnet: rwojcik@atc.boeing.com
uucp: uw-beaver!bcsaic!rwojcik