Path: utzoo!utgpu!water!watmath!clyde!att!alberta!firat
From: firat@alberta.UUCP (Firat Uludamar)
Newsgroups: sci.electronics
Subject: Re: SP1000
Keywords: speech recognition
Message-ID: <459@cavell.UUCP>
Date: 26 Jul 88 00:09:48 GMT
References: <6308@aw.sei.cmu.edu> <744@io.ATT.COM> <2352@pt.cs.cmu.edu>
Reply-To: firat@cavell.UUCP (Firat Uludamar)
Organization: U. of Alberta, Edmonton, Alberta, Canada
Lines: 46

In article <2352@pt.cs.cmu.edu> phd@speech1.cs.cmu.edu (Paul Dietz) writes:
>In article <744@io.ATT.COM> tmk@io.UUCP (59481[rjb]-t.m.ko) writes:
>>There is a speech recognition chip SP1000 from General Instrument (?).
>>It uses linear predictive coding and does both recognition and synthesis.
>
>Could you tell us more about this chip? Sounds mighty interesting...
>
>--- phd

The SP1000 is a 5V, 28-pin chip which can be configured for both speech
recognition and synthesis. It can be used as a memory-mapped I/O device.
It has to be used with analog signal conditioning circuitry (lowpass and
preemphasis filter, variable gain amplifier, anti-aliasing filter, A/D
converter for recognition).

Recognition mode:
The sampling frequency is software-controlled in the range of
[5.0, 15.9 kHz]. The chip has an 8-stage all-zero lattice filter for
Linear Predictive Coding (LPC) analysis. LPC analysis is performed on
frames of digitized speech samples, and the number of samples in a frame
can be set by software. The result of the LPC analysis is 8 reflection
coefficients (8 bits each), which can be read back from the SP1000.
Reflection coefficients relate to the lossless acoustic tube model of
speech. "Linear Prediction of Speech" by Markel and Gray (Springer-Verlag
1976) is an excellent reference on speech models etc. The SP1000 also
computes the average magnitude of the samples in an analysis frame.

Synthesis mode:
The sampling frequency is software-controlled in the range of
[4.0, 12.7 kHz]. The SP1000 uses a ten-stage, all-pole lattice filter
for synthesis.
Each stage requires a 9-bit reflection coefficient, which is written to
the chip over the 8-bit bidirectional address bus using two different
write commands. 8 different excitation sequences (2 constant,
pseudo-random seq for unvoiced speech, voiced excitation, male, female,
falsetto, and creaky) can be chosen through software. The output is a
pulse-width modulated signal, which is run through a low-pass filter to
a speaker.

More info on the SP1000 can be obtained from the General Instrument
branch in Chandler, Arizona.

As for references on speech chip technology, "Electronic Speech
Synthesis: Techniques, Technology, and Applications" by Geoff Bristow
(ed.) and "Computer Speech Processing" by Fallside and Woods contain a
few interesting chapters. Also see Montlick et al.'s paper in IEEE ASSP
1983 Conf (pp. 499-501).

Firat Uludamar
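
P.S. For anyone who wants to experiment with the lattice structures
described above, here is a minimal Python sketch of textbook LPC
analysis and lattice filtering. It is not the SP1000's documented
algorithm. The autocorrelation (Levinson-Durbin) method, the sign
convention for the reflection coefficients, the floating-point
arithmetic, and all function names (reflection_coeffs,
average_magnitude, lattice_analysis, lattice_synthesis) are assumptions
made for illustration. Coefficient quantization (8-bit on analysis,
9-bit on synthesis) is not modelled.

```python
def reflection_coeffs(frame, order=8):
    """Estimate reflection coefficients k_1..k_order for one frame.

    Assumption: autocorrelation method with Levinson-Durbin recursion,
    a textbook approach, not necessarily what the chip does internally.
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    if r[0] == 0.0:
        return [0.0] * order              # silent frame
    a = [1.0] + [0.0] * order             # prediction-error filter A(z)
    err = r[0]                            # prediction error energy
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err if err > 0.0 else 0.0
        ks.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return ks


def average_magnitude(frame):
    """Mean absolute value of the samples in a frame."""
    return sum(abs(s) for s in frame) / len(frame)


def lattice_analysis(x, ks):
    """All-zero lattice: speech samples in, prediction residual out."""
    order = len(ks)
    b_prev = [0.0] * (order + 1)          # backward errors at time n-1
    out = []
    for sample in x:
        f = sample                        # forward error, stage 0
        b = [0.0] * (order + 1)
        b[0] = sample
        for m in range(1, order + 1):
            f_next = f + ks[m - 1] * b_prev[m - 1]
            b[m] = ks[m - 1] * f + b_prev[m - 1]
            f = f_next
        b_prev = b
        out.append(f)
    return out


def lattice_synthesis(e, ks):
    """All-pole lattice: excitation in, speech samples out."""
    order = len(ks)
    b_prev = [0.0] * (order + 1)
    out = []
    for sample in e:
        f = sample                        # forward signal at the top stage
        b = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            f = f - ks[m - 1] * b_prev[m - 1]
            b[m] = ks[m - 1] * f + b_prev[m - 1]
        b[0] = f
        b_prev = b
        out.append(f)
    return out


if __name__ == "__main__":
    frame = [0.9 ** n for n in range(200)]    # toy decaying signal
    ks = reflection_coeffs(frame, order=8)
    residual = lattice_analysis(frame, ks)
    rebuilt = lattice_synthesis(residual, ks)
    print("k:", [round(k, 4) for k in ks])
    print("avg magnitude:", average_magnitude(frame))
    print("max round-trip error:",
          max(abs(a - b) for a, b in zip(frame, rebuilt)))
```

With the same coefficients, the all-pole lattice undoes the all-zero
lattice, so feeding the residual back in rebuilds the original frame.
A real synthesizer would replace the residual with one of the chip's
excitation sequences.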