Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!lll-winken!unixhub!shelby!csli!poser
From: poser@csli.Stanford.EDU (Bill Poser)
Newsgroups: comp.dsp
Subject: Re: A simple, practical sound board
Message-ID: <16113@csli.Stanford.EDU>
Date: 31 Oct 90 08:33:36 GMT
References: <15912@netcom.UUCP>
Reply-To: poser@csli.stanford.edu (Bill Poser)
Organization: Center for the Study of Language and Information, Stanford U.
Lines: 25


There is no way you can get an hour of speech into a megabyte
with reasonable quality if you just digitize waveforms. Suppose
you sample at a resolution of 8 bits, which is the resolution of
the cheapo ADDACs you can buy for PCs and Macs. (For research
and hi-fi work, people use higher resolution.) At one byte per
sample, your
megabyte gets you 1048576 samples (taking a megabyte as 2^20
bytes). At 3600 seconds in an hour, that means 291.27 samples per
second is the maximum sampling rate you can use. The corresponding
Nyquist frequency is 145.64 Hz, meaning that you can only
represent frequencies below this level.
This is WAY too low. For music, people use sampling rates around
44K samples/sec in order to get frequencies up to about 20KHz. For
speech you don't need anything that high. For speech research we
typically sample at 20K samples/sec and low-pass filter at 8KHz.
That covers everything significant for speech.  For some purposes
we sample at 10K samples/sec and low-pass filter at 4KHz.  Engineers
often sample at 8K samples/sec because they expect to be working
with telephone speech, which is limited to the region below about
3200 Hz. Already we're talking about degraded, though
intelligible, speech. So you can see that if you just want to
record and edit waveforms, there is no way you can cram an hour
of speech into a megabyte.  Storing speech at around 2000
bits/second, as you wish to do, is possible but requires
non-trivial coding, and I'm not sure you will like the quality that
results. I think you're going to need a bigger disk.
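
If you want to redo the arithmetic yourself, here is a rough sketch
in Python. The function names are mine, made up for illustration,
and budget_bytes is whatever you choose to count as a megabyte. It
assumes plain uncompressed samples, one channel, no file header.

```python
SECONDS_PER_HOUR = 3600


def max_sampling_rate(budget_bytes, bits_per_sample=8, seconds=SECONDS_PER_HOUR):
    """Highest sampling rate (samples/sec) at which `seconds` of raw
    samples still fits in `budget_bytes`."""
    total_samples = budget_bytes * 8 / bits_per_sample
    return total_samples / seconds


def nyquist(sampling_rate):
    """Frequencies must lie below half the sampling rate to be represented."""
    return sampling_rate / 2.0


def bytes_per_hour(sampling_rate, bits_per_sample=8):
    """Storage needed for one hour of raw samples at the given rate."""
    return sampling_rate * bits_per_sample / 8 * SECONDS_PER_HOUR


def bit_rate_for_budget(budget_bytes, seconds=SECONDS_PER_HOUR):
    """Bits per second available if `seconds` of speech must fit in the budget."""
    return budget_bytes * 8 / seconds


if __name__ == "__main__":
    budget = 2 ** 20  # assumption: one megabyte taken as 2^20 bytes
    rate = max_sampling_rate(budget)
    print("max rate: %.2f samples/sec" % rate)
    print("Nyquist:  %.2f Hz" % nyquist(rate))
    print("bit rate: %.0f bits/sec" % bit_rate_for_budget(budget))
    # Sampling rates mentioned in the text, at 8 bits per sample.
    for r in (8000, 10000, 20000):
        print("%5d samples/sec -> %d bytes per hour" % (r, bytes_per_hour(r)))
```

Even at the telephone-style rate of 8K samples/sec and 8 bits per
sample, an hour comes to 28,800,000 bytes, which is why the one
megabyte budget only works with coding at a couple of thousand
bits per second.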