Path: utzoo!utgpu!water!watmath!clyde!att!alberta!firat
From: firat@alberta.UUCP (Firat Uludamar)
Newsgroups: sci.electronics
Subject: Re: SP1000
Keywords: speech recognition
Message-ID: <459@cavell.UUCP>
Date: 26 Jul 88 00:09:48 GMT
References: <6308@aw.sei.cmu.edu> <744@io.ATT.COM> <2352@pt.cs.cmu.edu>
Reply-To: firat@cavell.UUCP (Firat Uludamar)
Organization: U. of Alberta, Edmonton, Alberta, Canada
Lines: 46

In article <2352@pt.cs.cmu.edu> phd@speech1.cs.cmu.edu (Paul Dietz) writes:
>In article <744@io.ATT.COM> tmk@io.UUCP (59481[rjb]-t.m.ko) writes:
>>There is a speech recognition chip SP1000 from General Instrument (?).
>>It uses linear predictive coding and does both recognition and systhesis.
>
>Could you tell us more about this chip? Sounds mighty interesting...
>
>--- phd

The SP1000 is a 5V, 28-pin chip which can be configured for both speech
recognition and synthesis. It can be used as a memory-mapped I/O device. It
has to be used with external analog signal conditioning circuitry (lowpass
and preemphasis filter, variable gain amplifier, anti-aliasing filter, A/D
converter for recognition).

Recognition mode:
The sampling frequency can be set by software in the range [5.0, 15.9 kHz].
The chip has an 8-stage all-zero lattice filter for Linear Predictive Coding
(LPC) analysis, which is performed on frames of digitized speech samples.
The number of samples in a frame can also be set by software. The result of
LPC analysis is 8 reflection coefficients (8 bits each), which can be read
back from the SP1000. Reflection coefficients relate to the lossless
acoustic tube model of speech production. "Linear Prediction of Speech" by
Markel and Gray (Springer-Verlag 1976) is an excellent reference on speech
models etc. The SP1000 also computes the average magnitude of the samples
in each analysis frame.
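
For anyone who wants to see what such an analysis frame produces, here is a
rough software sketch in Python. It is NOT the SP1000's internal algorithm
(the chip uses a lattice filter and I have not described its arithmetic);
it is the textbook autocorrelation method with the Levinson-Durbin
recursion, which also yields reflection coefficients. The function names,
the sign convention, and the 8-bit scaling in quantize_8bit() are my own
assumptions for illustration, not taken from the data sheet.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(frame, order=8):
    """Levinson-Durbin recursion; returns `order` reflection coefficients.

    Convention (an assumption): predictor A(z) = 1 + a1*z^-1 + ...,
    and k_m is the last coefficient of the order-m predictor.
    Other texts use the opposite sign.
    """
    r = autocorrelation(frame, order)
    if r[0] == 0:
        return [0.0] * order          # silent frame
    a = [1.0] + [0.0] * order
    err = r[0]                        # prediction error energy
    ks = []
    for m in range(1, order + 1):
        if err <= 1e-12 * r[0]:       # frame already perfectly predicted
            ks.append(0.0)
            continue
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= (1.0 - k * k)
    return ks


def quantize_8bit(k):
    """Hypothetical 8-bit signed scaling (k * 127), for illustration only."""
    return max(-128, min(127, round(k * 127)))


def average_magnitude(frame):
    """Mean absolute sample value of the frame."""
    return sum(abs(x) for x in frame) / len(frame)


if __name__ == "__main__":
    frame = [0.5 ** n for n in range(64)]   # decaying exponential test frame
    ks = reflection_coefficients(frame)
    print([quantize_8bit(k) for k in ks])
    print(average_magnitude(frame))
```

For a stable model every coefficient has magnitude below 1, which is what
makes a fixed 8-bit representation practical.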

Synthesis mode:
The sampling frequency is software controlled in the range [4.0, 12.7 kHz].
The SP1000 uses a ten-stage all-pole lattice filter for synthesis. Each stage
requires a 9-bit reflection coefficient, which is written to the chip over
the 8-bit bidirectional address bus using two different write commands. One
of 8 excitation sequences (2 constant, pseudo-random seq for unvoiced
speech, voiced excitation, male, female, falsetto, and creaky) can be
selected by software. The output is a pulse-width modulated signal, which
is passed through a low-pass filter to a speaker.
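
To make the synthesis side concrete, here is a small Python sketch of a
generic all-pole lattice filter driven by an excitation sequence. It shows
the standard textbook structure only; the SP1000's fixed-point arithmetic,
its actual excitation sequences, and its PWM output stage are not modelled.
The names lattice_synthesize() and pulse_train(), and the sign convention
for the coefficients, are assumptions of mine.

```python
def lattice_synthesize(excitation, ks):
    """Run an excitation through an all-pole lattice filter.

    ks[m-1] is the reflection coefficient of stage m. Per sample:
        f_{m-1}[n] = f_m[n] - k_m * b_{m-1}[n-1]
        b_m[n]     = b_{m-1}[n-1] + k_m * f_{m-1}[n]
    with f_M[n] the excitation and f_0[n] the output sample.
    """
    order = len(ks)
    b = [0.0] * (order + 1)           # backward errors from previous sample
    out = []
    for e in excitation:
        f = e
        new_b = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            f = f - ks[m - 1] * b[m - 1]
            new_b[m] = b[m - 1] + ks[m - 1] * f
        new_b[0] = f                  # b_0[n] is the output itself
        b = new_b
        out.append(f)
    return out


def pulse_train(n, period):
    """Illustrative voiced excitation: one unit pulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


if __name__ == "__main__":
    # Ten stages, as on the chip; only the first coefficient is non-zero here.
    ks = [-0.5] + [0.0] * 9
    print(lattice_synthesize(pulse_train(8, 4), ks))
```

A lattice is attractive for this job because the filter stays stable as
long as every coefficient has magnitude below 1, which is easy to guarantee
with quantized values.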

More info on the SP1000 can be obtained from the General Instrument branch
in Chandler, Arizona.

As for references on speech chip technology, "Electronic Speech Synthesis:
Techniques, Technology, and Applications" by Geoff Bristow (ed.) and
"Computer Speech Processing" by Fallside and Woods contain a few interesting
chapters. Also see Montlick et al.'s paper in IEEE ASSP 1983 Conf
(pp. 499-501).

Firat Uludamar