Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!uunet!stanford.edu!csli!kornai
From: kornai@csli.Stanford.EDU (Andras Kornai)
Newsgroups: comp.compression
Subject: Re: Voice Standards
Message-ID: <19645@csli.Stanford.EDU>
Date: 4 Jun 91 02:30:37 GMT
References: <15060@hacgate.UUCP> <1223@ocsmd.com> <1991Jun3.203049.7349@sol.cs.wmich.edu>
Distribution: na
Organization: Center for the Study of Language and Information, Stanford U.
Lines: 20

In <1991Jun3.203049.7349@sol.cs.wmich.edu> campbell@sol.cs.wmich.edu (Paul Campbell) writes:
>In LPC coding, the best numbers I have seen were from Dr.
>Markhoul (did I get the name right? I don't have my notes pile nearby), in
>which he got the bit rate down to a more reasonable 150 bps or less.

Dream on! Your average language has some 40 to 60 phonemes, so you need
5 to 6 bits/phoneme. In moderately fast speech there are 15 phonemes/sec
-- this gives an absolute lower limit around 80 bps. If you want to code
intonation/prosody at all, you will need another 4 bits/phoneme, bringing
it up to 100-200 bps, depending on speech rate.

John Makhoul's work at BBN is at the cutting edge of research -- it is
very far from a product, even farther from a standard. I think he is at
300 bps (using some very sophisticated vector quantization (VQ) and linear
predictive coding (LPC) techniques) which is pretty impressive compared
to the 9.6 kbps now more or less standard for voice compression, but
makes decompressed speech sound pretty synthetic.

Andras Kornai (kornai@csli.stanford.edu)
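
The back-of-the-envelope bound in the post can be checked with a few lines of Python. This is only a sketch of that arithmetic, assuming a fixed-length code of log2(inventory size) bits per phoneme with no entropy coding. The function name phoneme_bitrate and the 10 and 20 phonemes/sec figures used for "slow" and "fast" speech are illustrative assumptions, not numbers from the post; the 40 to 60 phoneme inventory, the 15 phonemes/sec rate and the 4 extra prosody bits are the post's own.

```python
import math


def phoneme_bitrate(inventory_size, phonemes_per_sec, prosody_bits=0.0):
    """Bits/sec for a phoneme stream under a fixed-length code.

    Assumes every phoneme costs log2(inventory_size) bits, plus an
    optional flat per-phoneme allowance for intonation/prosody.
    """
    bits_per_phoneme = math.log2(inventory_size) + prosody_bits
    return bits_per_phoneme * phonemes_per_sec


# Segmental content only, at the post's 15 phonemes/sec.
bare_low = phoneme_bitrate(40, 15)
bare_high = phoneme_bitrate(60, 15)

# Adding the post's 4 bits/phoneme for prosody.
prosody_low = phoneme_bitrate(40, 15, prosody_bits=4)
prosody_high = phoneme_bitrate(60, 15, prosody_bits=4)

# Assumed slow (10/sec) and fast (20/sec) speech rates, to see the spread.
slow = phoneme_bitrate(40, 10, prosody_bits=4)
fast = phoneme_bitrate(60, 20, prosody_bits=4)

print(f"phonemes only: {bare_low:.0f}-{bare_high:.0f} bps")
print(f"with prosody:  {prosody_low:.0f}-{prosody_high:.0f} bps")
print(f"rate spread:   {slow:.0f}-{fast:.0f} bps")
```

Under these assumptions the phoneme-only figure lands between roughly 80 and 90 bps, and the prosody figure between roughly 90 and 200 bps across the assumed speech rates, which is consistent with the ranges quoted in the post.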