Path: utzoo!attcan!uunet!datapg!ems!mpp
From: mpp@ems.Ems.MN.ORG (Michael Palmquist)
Newsgroups: comp.cog-eng
Subject: Re: Voice in Interface Design
Message-ID: <6993@ems.Ems.MN.ORG>
Date: 21 Dec 88 16:09:02 GMT
References: <6986@ems.Ems.MN.ORG> <2206@daisy.UUCP>
Organization: EMS/McGraw-Hill, Eden Prairie, MN
Lines: 44

In article <2206@daisy.UUCP>, klee@daisy.UUCP (Ken Lee) writes:
> In article <6986@ems.Ems.MN.ORG> mpp@ems.Ems.MN.ORG (Michael Palmquist) writes:
> >I am looking for sources/examples (products, research, design metaphors) of 
> >voice-activated interface
> 
> A year or so ago, I looked into many of the best commercial voice
> products.  Input products are mainly used as a replacement for menus.

Yes. I did find a few research articles in the 1987 Conference Proceedings of
ACM's CHI + GI:  Kane & Yuschik (Wang Lab), "A Case Example of Human
Factors in Product Definition: Needs Finding for a Voice Output Workstation
for the Blind" and Aucella et al. "Voice: Technology Searching for
Needs".

My thought is that voice input leaves you with a tough recognition job,
particularly in an educational setting -- many users, wide variation in
accents and speech abilities. There is also the problem of integrating and
configuring the hardware system. I like Kurzweil's approach. I don't like the
price.

 
> Voice (and other sound) output is now common.  Even cars talk to you
> these days.  It is especially valuable when other forms of output are
> not available (e.g., no screen) or confusing (e.g., the user is busy
> focusing on some other display).

That's true. There is the issue of digitized (captured) vs. synthesized
(generated) voice. And subissues: available memory, available storage,
comprehensible output, and modifiability -- should there be a voice "control
panel" for pitch, tone, speed, male/female, etc.? How easy an interface
would that be to learn?

If you have synthesized voice, you have great flexibility in the
data dictionary, but you've basically got (to quote Rob Swigart) "alcoholic
robots with speech impediments".

If you digitize, you have zero flexibility. And a huge data dictionary of
sound packets. But you have recognizable, warm speech.
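
To put rough numbers on the "huge data dictionary" point, here is a
back-of-envelope sketch. Every figure in it (sampling rate, sample size,
average word length) is an assumption picked for illustration, not a
measurement from any product, and the function names are made up for this
example.

```python
# Rough storage comparison: digitized (recorded) speech vs. text kept for a
# synthesizer. All constants are illustrative assumptions, not measurements.

SAMPLE_RATE_HZ = 8000     # assumed telephone-quality sampling rate
BYTES_PER_SAMPLE = 1      # assumed 8-bit samples, no compression
SECONDS_PER_WORD = 0.5    # assumed average duration of one spoken word
BYTES_PER_TEXT_WORD = 6   # assumed average word plus a space, stored as text


def digitized_bytes(word_count):
    """Bytes needed to keep one uncompressed recording per vocabulary word."""
    return int(word_count * SECONDS_PER_WORD * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def synthesized_bytes(word_count):
    """Bytes needed to keep the same vocabulary as plain text."""
    return word_count * BYTES_PER_TEXT_WORD


if __name__ == "__main__":
    # Print vocabulary size, digitized bytes, and text bytes side by side.
    for words in (100, 1000, 5000):
        print(words, digitized_bytes(words), synthesized_bytes(words))
```

Under these assumptions a 1000-word vocabulary costs about 4 MB digitized
against about 6 KB as text, which is the flexibility-vs-warmth trade in
storage terms. Different sampling or compression choices would move the
numbers a lot.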

Well, in a nutshell anyway. Thoughts?

Michael Palmquist: software designer, rogue.

@mecc.mn.org or @ems.mn.org