Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!uw-june!robertb
From: robertb@june.cs.washington.edu (Robert Bedichek)
Newsgroups: comp.arch
Subject: Re: RISC & context switches
Message-ID: <7274@june.cs.washington.edu>
Date: 14 Feb 89 22:09:22 GMT
References: <784@atanasoff.cs.iastate.edu> <7239@june.cs.washington.edu> <4274@pt.cs.cmu.edu>
Reply-To: robertb@uw-june.UUCP (Robert Bedichek)
Organization: U of Washington, Computer Science, Seattle
Lines: 64

In article <4274@pt.cs.cmu.edu> schmitz@fas.ri.cmu.edu (Donald Schmitz) writes:
>
>A similar thread went around a year ago, and I came up with the idea of CPUs
>with externally addressable register/state files, plus a "scheduling CPU".
>The "scheduler" would make the CPUs context switch by externally halting
>them, dumping/updating their register file via a DMA or block xfer operation
>to fast memory used as a PCB cache (via the hardware interface to the
>register file), and then restarting them.
>
>The real win is not so much the reduced context switch time, but the ability
>to run the scheduling process on a dedicated CPU in parallel with the "real"
>processes.  The extra cycles available for scheduling can (hopefully) be
>used for more sophisticated scheduling algorithms.  This would be a real win
>in a multi CPU system, as "real" processes could be scheduled to avoid
>conflicts for system resources, such as main memory bandwidth and disk
>accesses.  The hardware cost of this is an extra data/address path to the
>register file, plus some additional multiplexing of the chip pins - not
>insignificant in a really high perf CPU but much less costly than multiple
>register files.  

If the processor is halted while its registers are being dumped, then
you don't need any extra data paths to the registers.  The CDC 6600
did what you describe: its PPs (Peripheral Processors) made the
processor do an "exchange jump", in which the registers were swapped
with an image in memory.  I don't know where the scheduling algorithm
ran, though.  The 6600 itself was considerably easier to program than
the PPs, so I suspect that it ran on the 6600.  (The relative
difficulty of programming is generally a problem with dedicated
special-purpose attached processors, such as I/O processors.  It can
be done, of course, but faced with the decision of where to implement
some new feature, system programmers tend to put it on the main CPU.)
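
To make the "exchange jump" idea concrete, here is a rough sketch in C
of swapping a live register set with an image in memory.  The names
(context_t, exchange_jump, NREGS) and the register layout are made up
for illustration; this is not the real 6600 exchange package format,
and on real hardware the swap is done by the machine, not by a C
routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register count; chosen only for illustration. */
#define NREGS 8

/* Hypothetical context: a program counter plus general registers. */
typedef struct {
    unsigned long pc;
    unsigned long regs[NREGS];
} context_t;

/*
 * Swap the live context with the image held in memory.
 * Afterwards the processor holds the incoming process's state and
 * the memory image holds the state of the process that was running.
 */
void exchange_jump(context_t *live, context_t *image)
{
    assert(live != NULL && image != NULL);

    context_t tmp = *live;   /* save the outgoing state          */
    *live  = *image;         /* load the incoming state          */
    *image = tmp;            /* leave the old state in the image */
}
```

Note that a single swap both saves and restores, so whoever triggers
it only needs to point at the next image; deciding which image is
next is the scheduling problem, and that can live anywhere.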

And if the processor is going to be waiting while its registers are
dumped, why not just have the processor do the dumping ... and now
the scheme has degenerated to the software solution.

I don't see any advantage to your scheme in current general-purpose
systems.  If you want to run the scheduling algorithm in parallel, then
why not just run it on another "real" processor?  Why statically
allocate a machine to one activity unless doing so is a big win?

>If you don't want to build mutant chips, you can do a
>similar thing with conventional processors, shared memory, interrupts and
>software, without quite the savings in the raw context switch time (but
>still a win in scheduling time and hopefully a big win in overall
>utilization).

Right, but what's the difference between this (which degenerates to
doing everything in software) and what is done "conventionally" on
shared-memory multiprocessors (e.g., Sequent)?

>
>Anyway, I got 2 or 3 responses from places working on such systems, although
>I still haven't seen one released.
>
>Don Schmitz	(schmitz@fas.ri.cmu.edu)
>-- 

	Rob

    "Live to code
     Code to live"

beg, plea to all: run spell on your text before posting