Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: ~8-job "knee" in response curves on Suns (was Re: IBM RS6000)
Message-ID:
Date: 22 Jan 91 21:55:25 GMT
References: <5257@auspex.auspex.com> <3956@skye.ed.ac.uk> <1991Jan16.231017.2530@csn.org!datran2> <6123@exodus.Eng.Sun.COM> <2653@krafla.rhi.hi.is>
Sender: aro@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 89
Nntp-Posting-Host: teachk
In-reply-to: magnus@rhi.hi.is's message of 21 Jan 91 09:06:18 GMT
Posting-Front-End: GNU Emacs 18.55.4 of Thu Nov 23 1989 on athene (berkeley-unix)

On 21 Jan 91 09:06:18 GMT, magnus@rhi.hi.is (Magnus Gislason) said:

In article <2653@krafla.rhi.hi.is> magnus@rhi.hi.is (Magnus Gislason) writes:

magnus> In <6123@exodus.Eng.Sun.COM> tomw@binkley.Eng.Sun.COM (Tom
magnus> Westberg) writes:

tomw> And I just tried it on my SPARCstation2 (4/75) running a few
tomw> window systems, but not much else:

tomw>  2: 4243 switches / second
tomw> [ ... ]
tomw> .
tomw> 15: 3381 switches / second
tomw> 16: 1966 switches / second
tomw> [ ... ]

magnus> And for the IBM RS 6000/320.

magnus>  2: 1722 switches / second
magnus> [ ... ]
magnus> 25: 1192 switches / second

magnus> The IBM RS 6000 doesn't seem to be very fast at context
magnus> switching in comparison with other machines.

Frankly, this benchmark is about the *shape* of the context switch
curve (given some specific sequences of schedules), which is largely a
function of the MMU architecture. Context switch *performance*, by
contrast, usually depends critically on how badly botched the scheduler
is; the MMU architecture has minimal influence on it.
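
As a rough check on what the quoted switch rates imply, here is a minimal sketch that turns switches per second into an instruction budget per switch. The 20 to 30 MIPS range is an assumption about machines of this class, the function names are made up for illustration, and the result is an upper bound, since the benchmark processes also do a little work of their own between switches.

```python
# Rough instruction budget per context switch, from the rates quoted above.
# ASSUMPTION: the CPU sustains 20 to 30 million native instructions per
# second. The row index (2, 15, 16, 25) is copied as quoted.

QUOTED_RATES = {
    ("SPARCstation2", 2): 4243,
    ("SPARCstation2", 15): 3381,
    ("SPARCstation2", 16): 1966,
    ("RS 6000/320", 2): 1722,
    ("RS 6000/320", 25): 1192,
}


def instructions_per_switch(switches_per_second, mips):
    """Instructions executed per switch if the CPU did nothing else."""
    return mips * 1_000_000 / switches_per_second


def budget_table(rates=QUOTED_RATES, mips_range=(20, 30)):
    """Map each quoted row to a (low, high) instruction budget."""
    low, high = mips_range
    return {
        key: (round(instructions_per_switch(rate, low)),
              round(instructions_per_switch(rate, high)))
        for key, rate in rates.items()
    }


if __name__ == "__main__":
    for (machine, n), (lo, hi) in budget_table().items():
        print(f"{machine} @ {n}: {lo} to {hi} instructions per switch")
```

Even the fastest row leaves several thousand instructions per switch, far more than a plain save and reload of register and MMU state should need.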

Above we see that context switches cost (on implementations that can do
20-30 million native instructions per second) from a few thousand to
over a dozen thousand instructions, which is well over the few dozen or
few hundred it takes to save and reload CPU, FPU and MMU contexts. So
be warned: if you look at switch *rates* in the above, you are largely
comparing the schedulers, not the MMUs.

Under most Unix kernel implementations the scheduler is invoked on
every context switch, to determine which process should run next. This
in itself is incredibly inane. Even worse, the traditional
implementation of Unix kernel signaling means that, unless special care
has been taken (and very few take care), the entire process table has
to be scanned linearly by the scheduler as soon as it is invoked, just
to discover which processes are runnable. The combined effect is that
the scheduler overhead is of the order of *twenty* times the cost of
the context switch.

In the original PDP-11 Unix kernel implementation both design choices
made some good sense: very few processes around, most strongly IO bound
(shells, editors, ...), a few strongly CPU bound (nroff, chess,
compiles, ...), and a process table no larger than a few dozen entries.
Also, the Unix designers said they had gone for simplicity over
efficiency. On contemporary machines the same tradeoffs no longer make
sense at all, just like expansion swaps or caching entire page tables.

Now let's context switch back to AIX/6000: its kernel implementation
supposedly has no relationship whatever with that of traditional Unix.
I would be very surprised if it had to do a process table scan on every
reschedule or on every wakeup(). It must have some other bogosity.
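
To illustrate the linear-scan cost described above, here is a minimal sketch contrasting a scheduler that walks the whole process table on every invocation with one whose wakeup() maintains a queue of runnable processes. It is not code from any real kernel: `Proc`, `pick_by_scan` and `RunQueue` are hypothetical names, and "lower number means more urgent" is an assumed priority convention.

```python
import heapq
from itertools import count

RUNNABLE, SLEEPING = "runnable", "sleeping"


class Proc:
    """Hypothetical process table entry."""

    def __init__(self, pid, priority, state=SLEEPING):
        self.pid = pid
        self.priority = priority  # assumed: lower value is more urgent
        self.state = state


def pick_by_scan(table):
    """Walk every slot; return (most urgent runnable proc, slots examined)."""
    best, examined = None, 0
    for proc in table:
        examined += 1
        if proc.state == RUNNABLE and (
                best is None or proc.priority < best.priority):
            best = proc
    return best, examined


class RunQueue:
    """wakeup() does the bookkeeping, so pick() never sees sleepers."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: FIFO among equal priorities

    def wakeup(self, proc):
        if proc.state != RUNNABLE:
            proc.state = RUNNABLE
            heapq.heappush(self._heap,
                           (proc.priority, next(self._seq), proc))

    def pick(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With a 64-entry table and two runnable processes, `pick_by_scan` still examines all 64 slots, while `RunQueue.pick` only touches the two queued entries. The cost of the scan grows with the table size, not with the amount of runnable work.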

It is amusing to note that it is not difficult to reduce rescheduling
overheads to insignificance: as in MUSS, one can split the scheduler
into a short term scheduler that manages a runnable set of a fixed
number of processes (say 16 or 32), using some simple rule like strict
priority or FIFO, and a medium term scheduler that determines which
processes should be in the runnable set. This means that policy
calculations have to be made only by the medium term scheduler, and
invoking the short term scheduler will cost little more than the
hardware switching overhead.

Once you are on this path, a multithreaded, multiprocess kernel
suddenly becomes a distinct possibility.

Now back to our regularly scheduled :-) register/optimizer/pager/MMU
debates.

--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
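
The short term / medium term split described in the post can be sketched roughly as follows. This is an illustration only, not the MUSS design: the class and method names are invented, FIFO rotation is picked as the simple short term rule, and "lower number means more urgent" is an assumed priority convention.

```python
from collections import deque


class TwoLevelScheduler:
    """Hypothetical two-level scheduler: cheap dispatch, rare policy."""

    def __init__(self, set_size=16):
        self.set_size = set_size   # fixed size of the runnable set
        self.run_set = deque()     # short term: bounded FIFO of pids
        self.candidates = {}       # medium term pool: pid -> priority

    def admit(self, pid, priority):
        """Register a process with the medium term scheduler."""
        self.candidates[pid] = priority

    def medium_term(self):
        """Policy step, run rarely: choose who is in the runnable set."""
        ranked = sorted(self.candidates,
                        key=lambda pid: (self.candidates[pid], pid))
        self.run_set = deque(ranked[:self.set_size])

    def short_term(self):
        """Dispatch step, run on every switch: constant-time rotation."""
        if not self.run_set:
            return None
        pid = self.run_set.popleft()
        self.run_set.append(pid)
        return pid
```

All sorting and policy work lives in `medium_term()`, which can run at a much lower frequency. `short_term()` is a pop and an append on a bounded queue, so its cost does not depend on how many processes exist in total.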