Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!decvax!decwrl!glacier!mips!mash
From: mash@mips.UUCP
Newsgroups: net.arch,net.micro.att
Subject: Re: AT&T MIPS claim [really task-switching]
Message-ID: <506@mips.UUCP>
Date: Sat, 14-Jun-86 19:57:58 EDT
Article-I.D.: mips.506
Posted: Sat Jun 14 19:57:58 1986
Date-Received: Tue, 17-Jun-86 10:26:29 EDT
References: <577@scirtp.UUCP> <124@bakerst.UUCP> <583@scirtp.UUCP> <585@scirtp.UUCP> <206@njitcccc.UUCP> <4138@sun.uucp>
Reply-To: mash@mips.UUCP (John Mashey)
Distribution: net
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 50
Xref: watmath net.arch:3449 net.micro.att:1291

In article <4138@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>...
>The 68000 does, indeed, not have a single "switch task" instruction, but who
>cares?  The fact that operation X is performed by a single instruction in no
>way implies that operation X is exceptionally fast.  Furthermore, I have no
>idea how much of the task-switch time on VMS or UNIX is spent doing what the
>"load process context" instruction does; it has to figure out which task to
>run, for instance, which adds a few more instructions.

Guy is right on; furthermore:

1) Register save/restore speed is almost entirely dominated by memory
   system time anyway.

2) When measured by the "2 processes writing 1 byte circularly thru pipes"
   benchmark, each complete UNIX context switch takes on the order of
   700 microseconds on a VAX 11/780.  Actual register save/restore time is
   dominated by write-stalls and data cache misses, which are a function
   of the memory system, not of the instruction set.  The only real
   difference is in the extra instruction-cache misses one may hit by
   having to do a sequence of loads/stores instead of single micro-coded
   instructions.  Having looked at the code, I guarantee that most of it
   is doing things other than saving/restoring registers.

3) Let's try some back-of-the-envelope numbers:
   a) At 60 cs/second (typical) and 700 usec/cs, the VAX would spend
      60*700 = 42,000 usecs out of every second, or about 4.2% of the
      time, doing context switches.
   b) Supposing that 10% of this time is actually in save/restore,
      about .4% of the machine might be spent in save/restore
      (SVPCTX/LDPCTX).  Of course, they might be used for other things
      also.

4) Now, let's try published data: Clark & Levy, "Measurement and Analysis
   of Instruction Use in the VAX 11/780", 9th Ann. Symp. on Comp. Arch,
   April 1982.
   a) LDPCTX and SVPCTX aren't in the top 25 in usage of CPU time, even
      in VMS Kernel mode.  The top 25 instructions use 62% of the total
      kernel time, and the smallest shown is REMQUE with 1.31%.  This was
      for multi-user workloads.
   b) MTPR (Move to Processor Register) used 5.27% of the kernel time,
      and 1.15% of the total CPU time for all processor modes.  From
      this, I infer that the kernel was using 21% of the CPU (1.15/5.27).
      Hence, since neither made the top 25, the more time-consuming of
      LDPCTX/SVPCTX could be consuming no more than 1.31% of the kernel
      time, or .27% of the total CPU.  Even both together could account
      for no more than .54% of the total CPU.

5) All of this is consistent in bounding the problem: for time-sharing
   systems like VAXen, the special context save/restore instructions
   account for at most about half a percent of total CPU time.
   [Reminder: this says nothing about whether such instructions are
   important for real-time systems or other environments.  Also, some
   forms of these instructions have important structural properties or
   other rationales, but NOT SPEED IN THIS DOMAIN.]
-- 
-john mashey  DISCLAIMER:
UUCP:  {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  408-720-1700, x253
USPS:  MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
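
For anyone who wants to recheck the arithmetic in 3) and 4), here is a
minimal Python sketch.  The input figures are the ones quoted in the post;
the function and constant names are made up for illustration, and the 10%
save/restore share is the post's own supposition, not a measurement.  Note
that the unrounded ratio 1.15/5.27 is about 21.8%, so the bounds come out
near .29% per instruction and .57% for both together, slightly above the
rounded .27% and .54% but still roughly half a percent.

```python
# Inputs quoted in the post.
CS_PER_SEC = 60            # context switches per second ("typical")
USEC_PER_CS = 700          # microseconds per complete UNIX context switch
SAVE_RESTORE_SHARE = 0.10  # supposition from the post, not a measurement

# Clark & Levy figures as quoted in the post (percentages).
MTPR_KERNEL_PCT = 5.27     # MTPR share of kernel-mode time
MTPR_TOTAL_PCT = 1.15      # MTPR share of total CPU time
REMQUE_KERNEL_PCT = 1.31   # smallest entry in the kernel-mode top 25


def context_switch_pct(cs_per_sec, usec_per_cs):
    """Percent of each second spent in context switches."""
    return cs_per_sec * usec_per_cs / 1_000_000 * 100


def kernel_fraction(total_pct, kernel_pct):
    """Fraction of the CPU spent in kernel mode, inferred from a single
    instruction's share of total time versus its share of kernel time."""
    return total_pct / kernel_pct


def total_cpu_bound_pct(kernel_pct_bound, kernel_frac):
    """Convert an upper bound on kernel-time share into total-CPU share."""
    return kernel_pct_bound * kernel_frac


if __name__ == "__main__":
    cs_pct = context_switch_pct(CS_PER_SEC, USEC_PER_CS)
    print(f"context switching: {cs_pct:.1f}% of the machine")
    print(f"save/restore (supposed): {cs_pct * SAVE_RESTORE_SHARE:.2f}%")

    kfrac = kernel_fraction(MTPR_TOTAL_PCT, MTPR_KERNEL_PCT)
    one = total_cpu_bound_pct(REMQUE_KERNEL_PCT, kfrac)
    print(f"kernel share of CPU: {kfrac * 100:.1f}%")
    print(f"bound for one of LDPCTX/SVPCTX: {one:.2f}% of total CPU")
    print(f"bound for both together: {2 * one:.2f}% of total CPU")
```

Plugging in a different switch rate or per-switch cost shows how quickly
the estimate moves; the published-data bound depends only on the three
Clark & Levy percentages.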