Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!necntc!linus!alliant!steckel
From: steckel@Alliant.COM (Geoff Steckel)
Newsgroups: comp.arch
Subject: Re: RISC != real-time control
Message-ID: <1704@alliant.Alliant.COM>
Date: 3 May 88 23:27:16 GMT
References: <1521@pt.cs.cmu.edu> <28200135@urbsdc> <4921@bloom-beacon.MIT.EDU>
Reply-To: steckel@alliant.UUCP (Geoff Steckel)
Distribution: na
Organization: Omnivore Technology, Newton, MA
Lines: 56
Keywords: DSP RISC multiprocessing
Summary: I'd like to multiprocess/multitask my DSP

In article <4921@bloom-beacon.MIT.EDU> peter@athena.mit.edu (Peter J Desnoyers) writes:
>Things have changed. It is now possible to get at least one of these
>chips (I think it's the 32020) to do wait states on memory, and
>someone (I don't remember who) has now put their MNP implementation
>and a few other things on this processor, in slow ROM, while their
>signal processing code runs in fast (20ns?) RAM.

The scheme mentioned is very close to one I am currently working with.
I recently surveyed all the DSP chips for which I could get documentation.
Only the TI 320xxx series has a 'memory access done' pin.  All the other
chips (Moto, AD, NEC, OKI, ...) either have a programmable number of wait
states or assume external program or data memory is fast enough to work
synchronously.
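
A toy C model of the difference, as a sketch only.  It assumes one
programmed wait-state count applies to every external access and that a
bus access costs one base cycle plus stalls; the function names are made
up for illustration and no real chip's timing is implied.

```c
#include <assert.h>
#include <stddef.h>

/* needed[i] = stall cycles that access i really requires (hypothetical). */

/* Programmable wait states: the count must be set for the slowest
   device, so every access pays the worst case. */
unsigned long total_fixed_wait(const unsigned *needed, size_t n)
{
    unsigned worst = 0;
    assert(needed != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (needed[i] > worst)
            worst = needed[i];
    return (unsigned long)n * (1ul + worst);
}

/* 'Memory access done' pin: each access stalls only as long as the
   device it addresses actually needs. */
unsigned long total_ready_pin(const unsigned *needed, size_t n)
{
    unsigned long total = 0;
    assert(needed != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += 1ul + needed[i];
    return total;
}
```

For a trace of three fast accesses and one slow one (0, 0, 0 and 3 stall
cycles), the fixed setting costs 16 cycles and the ready pin costs 7.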

The lack of such a pin makes ganging DSP chips around shared (peer-to-peer)
global memory difficult, and makes mixing slow and fast program memory
impossible.
The designers seem to assume:
  1) All parts of the application must run equally fast.
  2) Programs will be small.
  3) Data will be small or only accessed a little at a time.
  4) The DSP chip will own all resources to which it is connected.
  5) Any resource the DSP chip does not own is:
     a) connected via a serial port (a la Transputer, etc), or
     b) sufficiently unimportant that polling a ready line is good enough, or
     c) very fast, or
     d) nonexistent

Can any of the DSP mavens comment on DSP architectures which
  1) Can be connected to large (> 64K) shared memories, which the DSP may
     use, but does not own (i.e. must request and be granted access)
     and whose access time has an upper bound but is not deterministic
     below that bound.
  2) Can run 'background' tasks (servicing panels, SCSI, etc., etc.)
     which require serious processing but much less than the 'foreground'
     task does, preferably with the code in slow (> 70 ns, cheap!) memory
while doing 'foreground' classic DSP?
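
The two requirements above amount to a cycle budget per sample period.
A rough C sketch of that arithmetic follows.  It assumes single-word
instructions that take one cycle each from fast memory, and a known
upper bound on shared-memory stalls; all names and numbers are
hypothetical.

```c
#include <assert.h>

/* Cycles available in one sample period. */
unsigned long cycles_per_sample(unsigned long cpu_hz,
                                unsigned long sample_hz)
{
    assert(sample_hz > 0);
    return cpu_hz / sample_hz;
}

/* Worst-case foreground cost: pure compute cycles, plus every
   shared-memory access stalled to the stated upper bound. */
unsigned long foreground_worst_case(unsigned long compute_cycles,
                                    unsigned long shared_accesses,
                                    unsigned long max_stall_cycles)
{
    return compute_cycles + shared_accesses * (1ul + max_stall_cycles);
}

/* Background instructions that fit in the leftover cycles when each
   fetch from slow memory costs 1 + wait_states cycles.  Returns 0 if
   the foreground alone can miss its deadline. */
unsigned long background_instructions(unsigned long budget,
                                      unsigned long fg_worst,
                                      unsigned long wait_states)
{
    if (fg_worst >= budget)
        return 0;
    return (budget - fg_worst) / (1ul + wait_states);
}
```

With made-up figures of a 10 MHz instruction rate and 8 kHz sampling,
the budget is 1250 cycles per sample.  A foreground of 800 compute
cycles plus 20 shared accesses stalled at most 4 cycles each takes 900
worst case, leaving room for 116 background instructions at 2 wait
states.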

Right now only TI's 320xx chips seem to have some of the hardware support, with
the large advantage of an extremely narrow program memory path (16 bits!).
The corresponding disadvantage is an extremely baroque and asymmetrical
instruction set.

The chip I am describing is very close to a general purpose RISC chip, but with
the following differences:
  1) Onboard multiply must be very very fast (for convolutions, etc).
  2) Sub-word-size (byte, etc.) performance is not very important.
     DSP almost (ha) never does divides, but does millions of multiplies.
  3) A barrel shifter is somewhere between very useful and required.
  4) An extended precision adder for multiply and accumulate is vital
     (e.g. if a * b yields 32 bits, at least 34 bits in the sum, preferably
     more like 40!).  You don't have time to check for overflow.
  5) Floating point is **really** nice, but many applications can be
     bludgeoned into fixed point.  Painfully.
     If you do put in floating point, make it FAST.  Like 2-3 cycles.
  6) Cheaper than the RISC chips are running.  $100/ea in moderate quantity.
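
To put numbers on point 4: each accumulator bit beyond the 32-bit product
doubles the number of worst-case products that can be summed safely.
The C sketch below uses a conservative bound (it treats every product as
possibly full-scale) and emulates the wide accumulator in a 64-bit
integer; the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Conservative count of full-scale 32-bit products that a signed
   accumulator of acc_bits can sum with no overflow check. */
uint64_t safe_terms(unsigned acc_bits)
{
    assert(acc_bits >= 32 && acc_bits <= 63);
    return (uint64_t)1 << (acc_bits - 32);
}

/* Does v fit in a signed two's-complement field of acc_bits? */
int fits_signed(int64_t v, unsigned acc_bits)
{
    assert(acc_bits >= 2 && acc_bits <= 63);
    int64_t limit = (int64_t)1 << (acc_bits - 1);
    return v >= -limit && v < limit;
}

/* Multiply-accumulate of 16-bit samples and coefficients; the 64-bit
   accumulator stands in for a wide hardware one. */
int64_t mac(const int16_t *x, const int16_t *h, unsigned n)
{
    int64_t acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)h[i];
    return acc;
}
```

By this bound a 34-bit accumulator is guaranteed for only 4 terms and a
40-bit one for 256.  Eight products of -32768 * -32768 sum to 2^33,
which overflows 34 bits but fits easily in 40.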

     geoff steckel (steckel@alliant.COM)