Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!oliveb!sun!martyi
From: martyi@sun.Eng.Sun.COM (Marty Itzkowitz)
Newsgroups: comp.arch
Subject: Re: DMA on RISC-based systems
Summary: CDC 6600 and 7600 characteristics
Message-ID: <108123@sun.Eng.Sun.COM>
Date: 5 Jun 89 22:24:03 GMT
References: <46500067@uxe.cso.uiuc.edu> <28200325@mcdurb> <2819@scolex.sco.COM> <1443@dell.dell.com>
Organization: Sun Microsystems, Inc. - Mtn View, CA
Lines: 59


The 6600 and 7600 differed in a number of respects, one of which was
the PPU architecture.  On the 6600 all of the PPs were equivalent,
although the OS (at least the version developed at LBL) treated
them differently.  Every PPU could access every channel, and could
read and write anywhere in central memory, and could exchange-jump
(context switch) the CPU.  On later 20-PPU versions, each
set of 10 PPUs shared a set of channels, and I don't believe
either set could talk to the other set's channels.  The CPU on the 6600
could NOT do its own context switches, and system calls were handled
by placing the request in a known location relative
to the process (job, task) address space (word 1, actually).
The monitor PPU, so designated by software, checked these words,
and then assigned one of the other PPUs to process a request.
Later versions of the machine did have a central processor exchange
jump instruction.  A two CRT display, with refresh done entirely
in SW, was managed by one of the PPUs.
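
As a rough illustration of the 6600 system-call scheme described above, here
is a minimal Python sketch of one monitor-PPU polling pass.  Everything in it
is hypothetical: the names (monitor_scan, REQUEST_WORD), the use of zero to
mean "no request", and clearing the word to mark acceptance are assumptions
made for the example, not details of the real operating system.

```python
# Hypothetical model of 6600-style system calls; all names are illustrative only.
REQUEST_WORD = 1  # word 1 of each job's address space holds the request


def monitor_scan(jobs, idle_ppus):
    """One polling pass by the monitor PPU.

    jobs      -- dict mapping job id to a list modelling that job's memory
    idle_ppus -- list of PPU numbers currently free to take work
    Returns a list of (ppu, job_id, request) assignments.
    """
    assignments = []
    for job_id, memory in jobs.items():
        request = memory[REQUEST_WORD]
        # Assumption: a zero word means the job has nothing pending.
        if request != 0 and idle_ppus:
            ppu = idle_ppus.pop(0)
            assignments.append((ppu, job_id, request))
            # Assumption: clearing the word tells the job the request was taken.
            memory[REQUEST_WORD] = 0
    return assignments
```

The point of the sketch is only that the CPU never traps into the system
itself: it writes a word and keeps going (or waits), and a PPU notices.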

On the 7600, there were several types of PPUs.  PPU zero,
also known as the MCU, or maintenance and control unit, could
read and write anywhere in central (small core) memory, and could
send stuff on channels to the other PPUs.  It could also do
(force) an exchange jump in the CPU.  The other PPUs came in
either high or low-speed versions.  The high-speed ones worked in pairs,
and shared a common external channel to a disk, for example,
and a single channel to central memory.  Each pair's channel went to
a specific hard-wired buffer in central memory, and generated a CPU
interrupt (exchange jump) whenever the buffer was half full/empty,
or when it executed a specific instruction to do so.  The CPU
managed copying the data out of the hard-wired buffer, typically
into large core memory, since that was the fastest path, and telling
the PPUs when it was OK to send more data.  The CPU could reset its
buffer to a pair of PPUs and generate an interrupt to them.
A high speed buffer was 400 (octal) words, and a disk sector was
1000 (octal) words.  The CPU got 4 interrupts for each disk sector.
High-speed PPUs also had channels connecting the pair, so that, with
much cleverness, one could actually stream data at close to 40 Mb/s
from disk, with one PPU reading the disk, and the other dumping a
previously read sector to CM.  On the 819 disks, one had about
8 microseconds between sectors to avoid missing revs, requiring
a hand-off between the PPUs of the pair, and the CPU.  On LBL's
system, we could do it in time.
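
The figure of 4 interrupts per sector follows from the octal sizes given
above.  A small Python sketch of the arithmetic, assuming (as stated) one
interrupt per half buffer; the function and constant names are made up for
the example:

```python
BUFFER_WORDS = 0o400   # high-speed hard-wired buffer: 256 words
SECTOR_WORDS = 0o1000  # disk sector: 512 words


def interrupts_per_sector(buffer_words=BUFFER_WORDS, sector_words=SECTOR_WORDS):
    """Count CPU interrupts per sector, one for each half buffer transferred."""
    half_buffer = buffer_words // 2  # interrupt fires at the half-way mark
    return sector_words // half_buffer
```

With the numbers above, a half buffer is 128 words, so a 512-word sector
takes four of them.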

Slow PPUs, such as needed for a hyperchannel, worked as individuals,
and had a half-size buffer, again with handshaking between PPU and
CPU at the half-way mark.  The CPU on the 7600 did have an exchange
jump instruction.  Non-privileged tasks could only exchange to an
address given in their XJ package (some 16 60-bit words);  on error,
the exchange went to a second address.  (Normal Exch. Addr and
Error Exch. Addr, respectively).  Privileged tasks, i.e., the OS,
could exchange to anywhere.  A context switch took 28 clocks
of 27.5 nanosec. each, counting from the time the XJ instruction
was the next to issue to the time the first instruction from the
new context could issue.  For scalar arithmetic, the 7600 was
the fastest machine in the world until the Cray-2.
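
For scale, the context-switch time quoted above works out as follows.  This
is just the multiplication written out in Python; the function name is
invented for the example.

```python
def exchange_jump_ns(clocks=28, clock_ns=27.5):
    """Exchange-jump latency in nanoseconds: clock count times clock period."""
    return clocks * clock_ns
```

That is 28 x 27.5 = 770 nanoseconds, or a bit under a microsecond per
context switch.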


	Marty Itzkowitz,
		Sun Microsystems