Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!uw-june!robertb
From: robertb@june.cs.washington.edu (Robert Bedichek)
Newsgroups: comp.arch
Subject: Re: RISC & context switches
Message-ID: <7274@june.cs.washington.edu>
Date: 14 Feb 89 22:09:22 GMT
References: <784@atanasoff.cs.iastate.edu> <7239@june.cs.washington.edu> <4274@pt.cs.cmu.edu>
Reply-To: robertb@uw-june.UUCP (Robert Bedichek)
Organization: U of Washington, Computer Science, Seattle
Lines: 64

In article <4274@pt.cs.cmu.edu> schmitz@fas.ri.cmu.edu (Donald Schmitz) writes:
>
>A similar thread went around a year ago, and I came up with the idea of CPUs
>with externally addressable register/state files, plus a "scheduling CPU".
>The "scheduler" would make the CPUs context switch by exterally halting
>them, dumping/updating their register file via a DMA or block xfer operation
>to fast memory used as a PCB cache (via the hardware interface to the
>register file), and then restarting them.
>
>The real win is not so much the reduced context switch time, but the ability
>to run the scheduling process on a dedicated CPU in parallel with the "real"
>processes. The extra cycles available for scheduling can (hopefully) be
>used for more sophisticated scheduling algorithms. This would be a real win
>in a multi CPU system, as "real" processes could be scheduled to avoid
>conflicts for system resources, such as main memory bandwidth and disk
>accesses. The hardware cost of this is an extra data/address path to the
>register file, plus some additional multiplexing of the chip pins - not
>insignificant in a really high perf CPU but much less costly than multiple
>register files.

If the processor is halted while its registers are being dumped, then
you don't need any extra data paths to the registers. The CDC 6600 did
what you describe: its PPs (Peripheral Processors) made the processor
do an "exchange jump", where the registers were swapped with an image
in memory.
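
For readers who have not met an exchange jump, the swap described above
can be sketched in a few lines of Python. This is only an illustration
of the idea, not the 6600's actual mechanism: the Context class, the
register count, the dict standing in for memory, and the address used
are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NUM_REGS = 8  # illustrative size only, not the 6600's real register set


@dataclass
class Context:
    """Hypothetical saved state: a program counter plus a register file."""
    pc: int = 0
    regs: List[int] = field(default_factory=lambda: [0] * NUM_REGS)


def exchange_jump(cpu: Context, memory: Dict[int, Context], addr: int) -> None:
    """Swap the running CPU state with the image stored at addr.

    The outgoing state is written where the incoming one was, so a
    second exchange jump on the same address resumes the first process.
    """
    incoming = memory[addr]
    memory[addr] = Context(cpu.pc, list(cpu.regs))  # save outgoing state
    cpu.pc = incoming.pc                            # load incoming state
    cpu.regs = list(incoming.regs)


# Usage: process A is running, process B's image sits in memory.
cpu = Context(pc=100, regs=[1] * NUM_REGS)
memory = {0o1000: Context(pc=200, regs=[2] * NUM_REGS)}
exchange_jump(cpu, memory, 0o1000)  # CPU now runs B; A's image is in memory
```

Note that whoever calls exchange_jump only has to name an address. It
does not need its own path into the register file, because the halted
processor's state moves through the ordinary memory interface.
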
I don't know where the scheduling algorithm was done, though. The 6600
was considerably easier to program than the PPs, so I suspect that it
was done on the 6600. (The relative difficulty of programming is
generally a problem with dedicated special-purpose attached processors,
such as I/O processors. It can be done, of course, but faced with the
decision of where to implement some new feature, system programmers
tend to put it on the main CPU.)

And if the processor is going to be waiting while its registers are
dumped, why not just have the processor do the dumping ... and now the
scheme has degenerated to the software solution.

I don't see any advantage to your scheme in current general-purpose
systems. If you want to run the scheduling algorithm in parallel, then
why not just run it on another "real processor"? Why statically
allocate a machine to an activity unless there is a big win in doing so?

>If you don't want to build mutant chips, you can do a
>similar thing with conventional processors, shared memory, interrupts and
>software, without quite the savings in the raw context switch time (but
>still a win in scheduling time and hopefully a big win in overall
>utilization).

Right, but what's the difference between this (degenerating to having
everything done in software) and what is done "conventionally" on
shared-memory multiprocessors (e.g., Sequent)?

>
>Anyway, I got 2 or 3 responses from places working on such systems, although
>I still haven't seen one released.
>
>Don Schmitz (schmitz@fas.ri.cmu.edu)
>--

	Rob

"Live to code
 Code to live"

beg, plea to all: run spell on your text before posting