Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!killer!texbell!bellcore!faline!thumper!gamma!pyuxp!pyuxe!pyuxf!asg
From: asg@pyuxf.UUCP (alan geller)
Newsgroups: comp.arch
Subject: Re: Context switch tasks
Summary: Picking next process can be expensive
Message-ID: <497@pyuxf.UUCP>
Date: 17 Feb 89 23:47:47 GMT
References: <788@atanasoff.cs.iastate.edu>
Organization: Bell Communications Research
Lines: 78

In article <788@atanasoff.cs.iastate.edu>, hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
> A couple of days ago I posted a question about saving/restoring
> registers during context switch.
>
> A couple of people sent me mail saying that register/save restore
> was a very small part of the time taken by a context switch. One
> person claimed about 1%!
>
> What all are you people doing during context switch???
> And are they things which are required by the architecture of the
> machine or by a particular OS?
>
> Tasks done at in response to reschedule interrupt (as I see it):
>
>     1) Block interrupts
>     2) Save current processes registers (GP, pagetable-base etc)
>     3) Select new process
>     4) Invalidate per-process TLB entries
>     5) Restore registers
>     6) Return from interrupt
>     ?) assorted bit and register twiddling
>
> A check of the VMS scheduler interrupt handler shows it to be similar
> (and the longest path through the code looks to be 28 instructions,
> of course SVPCTX and LDPCTX do the copying of the PCB registers
> to/from memory (24 longwords) -- and most assuredly they take a majority
> of the time spent).
>
> ...

Well, I don't know what V.2 looks like, but I was working at Bell Labs
when we first got System V (this was early 1983) for our VAXen. My
office-mate and I were both curious about these things, so we took a
walk through the scheduler code.
After recoiling in disgust, we rewrote the scheduler (from scratch,
mostly in assembler, rather than the 90% C it was delivered in), and
got an enormous (5x to 20x measured) performance improvement.

Why? Well, first off, the assembler code that implemented the actual
context switch, once the new process had been selected, explicitly
saved the registers to a register save area, and then did a SVPCTX;
similarly, after the LDPCTX, the registers were restored from the save
area. Not that the register values in the save area were any different
from those saved by SVPCTX, mind you; the duplication was just to make
sure, I guess. We cleaned this up by eliminating the duplication of
effort.

Secondly, the scheduler at that time always did a context switch into
process 0 first, and then process 0 would switch into whatever the next
process should be. This was done so that the system could idle in
proc 0 if no other process was able to run; this would keep some poor
user's process from getting charged for all of that idle time. We added
an idle process that always ran at the lowest possible priority, so
that it would get charged for the idle time, and eliminated the
extraneous context switch.

Finally, the scheduler always made a full pass through the run list to
find the highest-priority runnable process. The run list was a
singly-linked, unordered list of process structures. If there were many
runnable processes, the time it took to scan the process list could
stall the system entirely (about 15 processes doing 1 millisecond naps
would stall a VAX 11/780). We replaced the run list with an array of
doubly-linked lists (using remque and all those other neat
instructions), and used a bitmap that flagged which lists had procs on
them; then we could use find-first-set to pick a list, and just dequeue
the head of that list to get our next process (we ran over 250 nappers,
without problems).

We made some other modifications to the priority algorithm, but that's
not relevant here.
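
The array-of-queues-plus-bitmap run list described above can be sketched
in portable C. This is only an illustration under assumptions, not the
original assembler: the names (NPRIO, struct runq, runq_insert,
runq_pick, lowest_set) are hypothetical, 32 priority levels and
"lower number is more urgent" are assumed conventions, and a plain loop
stands in for the hardware find-first-set instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPRIO 32              /* hypothetical: one queue per priority level */

struct proc {
    struct proc *next, *prev; /* links in a circular doubly-linked queue */
    int pid;
    unsigned prio;            /* 0 = most urgent (assumed convention) */
};

struct runq {
    struct proc head[NPRIO];  /* one sentinel node per priority level */
    uint32_t bits;            /* bit p set <=> queue head[p] is non-empty */
};

void runq_init(struct runq *rq)
{
    for (unsigned p = 0; p < NPRIO; p++)
        rq->head[p].next = rq->head[p].prev = &rq->head[p];
    rq->bits = 0;
}

/* Append at the tail of the process's priority queue: constant time. */
void runq_insert(struct runq *rq, struct proc *p)
{
    assert(p->prio < NPRIO);
    struct proc *h = &rq->head[p->prio];
    p->prev = h->prev;
    p->next = h;
    h->prev->next = p;
    h->prev = p;
    rq->bits |= (uint32_t)1 << p->prio;
}

/* Index of the lowest set bit; w must be non-zero.
   Portable stand-in for a find-first-set instruction. */
unsigned lowest_set(uint32_t w)
{
    unsigned i = 0;
    while (!(w & 1u)) {
        w >>= 1;
        i++;
    }
    return i;
}

/* Remove and return the head of the most urgent non-empty queue,
   or NULL if nothing is runnable. No scan over processes is needed. */
struct proc *runq_pick(struct runq *rq)
{
    if (rq->bits == 0)
        return NULL;
    unsigned prio = lowest_set(rq->bits);
    struct proc *h = &rq->head[prio];
    struct proc *p = h->next;
    p->prev->next = p->next;  /* unlink the head entry from its queue */
    p->next->prev = p->prev;
    if (h->next == h)         /* queue just became empty: clear its bit */
        rq->bits &= ~((uint32_t)1 << prio);
    return p;
}
```

The cost of picking the next process here depends on the number of
priority levels, not on the number of runnable processes, which is the
property that let the nappers scale.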

Anyway, there are LOTS of things that operating systems do, beyond
saving and restoring register sets. Many of them may be pretty silly,
but many OSs are not as clean as VMS in this regard.

Alan Geller
Bellcore
...!{princeton,rutgers}!bcr!asg