Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site frog.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!cybvax0!frog!rfm
From: rfm@frog.UUCP (Bob Mabee)
Newsgroups: net.micro.68k
Subject: Re: Re: PDP11s vs the micros
Message-ID: <290@frog.UUCP>
Date: Fri, 30-Aug-85 14:11:12 EDT
Article-I.D.: frog.290
Posted: Fri Aug 30 14:11:12 1985
Date-Received: Sun, 1-Sep-85 05:46:37 EDT
References: <1617@hao.UUCP> <847@mako.UUCP> <2422@sun.uucp> <2607@sun.uucp> <5874@utzoo.UUCP> <492@oakhill.UUCP>
Organization: Charles River Data Systems, Framingham MA
Lines: 54

Dave Trissel of Motorola explains why the 68020 stores a large state on faults:
>		MOVE   something to memory
>		SHIFT  Reg by immediate
>		MUL    Reg to Reg
>		etc.
> The MC68020 executes the MOVE and the bus unit schedules a write cycle.  Then
> the execution unit/pipeline happily continues executing the instruction
> stream without regard to the final status of the write.  Even if the write
> fails (bus errors) there could be several more instructions executed (in fact
> any amount until one is hit which requires the bus again.)
> 
> Contrast this to chips which redo instructions.  They must soon stop dead in
> their tracks until the write cycle has been verified as properly done.
> Otherwise they would alter the programmers model and invalidate retry.

Quite a few responses seemed to miss the point that this makes the 68020
run a lot faster all the time, not just when the reference causes a fault.
The alternative requires that the CPU store just a PC value from which the
program can be restarted; that means there can be no visible effects
from instructions executed after the one that started the write that got the
error.  In the example, either the shift can't happen until the write is
acknowledged, or the processor has to keep multiple register sets so it
can back up far enough to recreate the state that goes with the bus cycle.

However, there is a big problem with the 68020 fault state on UNIX-like
systems:  the state is (potentially) writeable by malicious users, but Motorola
has not provided enough information for us to detect bad states.  We need either
	1) Motorola's assurance that no combination of bits fed to RTE can
	   damage or hang up the chip, allow users to enter supervisor mode,
	   or set a booby-trap that will harm the OS or another process.
or	2) a (small) set of checks that will reject all combinations that
	   might do any of those things, while allowing all combinations
	   actually stored by the CPU.
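
For what it is worth, here is roughly what a kernel can check today.  This is
only a sketch in C: the struct, the function name, and the user address bounds
are made up for illustration, and the list of format codes is my reading of
the 68020 manual, so check it before relying on it.  It covers only the
documented header words (SR, PC, format/vector).  It says nothing about the
internal state words that follow, which are exactly what #2 asks about.

```c
#include <stdint.h>

/* Status register bits on the 68020. */
#define SR_SUPERVISOR 0x2000u   /* S bit */
#define SR_MASTER     0x1000u   /* M bit */
#define SR_INT_MASK   0x0700u   /* interrupt priority mask */

/* Hypothetical view of the documented words at the top of a fault frame. */
typedef struct {
    uint16_t sr;             /* saved status register */
    uint32_t pc;             /* saved program counter */
    uint16_t format_vector;  /* format code in bits 15-12, vector offset below */
} frame_header;

/* Returns 1 if the header could belong to a user-mode fault, 0 otherwise.
 * user_lo/user_hi are assumed bounds of the process's text addresses. */
int frame_header_ok(const frame_header *f, uint32_t user_lo, uint32_t user_hi)
{
    /* RTE must not hand the user supervisor state or a raised mask. */
    if (f->sr & (SR_SUPERVISOR | SR_MASTER | SR_INT_MASK))
        return 0;
    /* Instructions are word aligned; an odd PC would fault immediately. */
    if (f->pc & 1u)
        return 0;
    if (f->pc < user_lo || f->pc >= user_hi)
        return 0;
    /* Accept only frame formats a user-mode fault is assumed to produce. */
    switch (f->format_vector >> 12) {
    case 0x0: case 0x2: case 0x9: case 0xA: case 0xB:
        return 1;
    default:
        return 0;
    }
}
```

Everything past the header still goes to RTE unexamined, so this is not a
substitute for #1 or #2.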

If the OS boils the state down to a PS and PC, which can be easily validated,
then it is lying to the user, because it will allow restarting but the program
will misbehave (in the example, shift a register twice).  If the OS prevents
restarting such cases, it will kill programs that merely happen to get signals
in the middle of 68881 instructions.
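
To make the double shift concrete, here is a toy model in C.  It is not 68020
code: the machine, its fields, and the function names are all invented for
illustration.  The program is a MOVE of a register to memory followed by a
one-bit left shift, and the write fault is reported only after the shift has
already run.

```c
#include <stdint.h>

/* Hypothetical toy machine: one register, one memory cell, two instructions.
 * pc 0 = MOVE reg to mem, pc 1 = SHIFT reg left by 1, pc 2 = done. */
typedef struct {
    uint32_t reg;
    uint32_t mem;
    int      pc;
    int      fault_pc;       /* PC of the instruction that queued the write */
    int      write_pending;  /* bus unit still owes us a write cycle */
    uint32_t pending_data;   /* data captured when the write was queued */
} toy_cpu;

/* Run to the end of the program without waiting for the bus, then resolve
 * the queued write.  Returns 1 if the write faulted, 0 otherwise. */
int toy_run(toy_cpu *c, int write_faults)
{
    while (c->pc < 2) {
        if (c->pc == 0) {            /* MOVE: queue the write and keep going */
            c->pending_data  = c->reg;
            c->write_pending = 1;
            c->fault_pc      = c->pc;
        } else {                     /* SHIFT: runs while the write is in flight */
            c->reg <<= 1;
        }
        c->pc++;
    }
    if (c->write_pending) {
        if (write_faults)
            return 1;                /* fault reported late, after the shift */
        c->mem = c->pending_data;
        c->write_pending = 0;
    }
    return 0;
}

/* OS kept only a PC: drop the bus state and rerun from the MOVE. */
void restart_pc_only(toy_cpu *c)
{
    c->write_pending = 0;
    c->pc = c->fault_pc;
    toy_run(c, 0);
}

/* OS kept the whole fault state: just rerun the saved bus cycle. */
void restart_full_state(toy_cpu *c)
{
    toy_run(c, 0);
}
```

Starting with reg = 3 and a faulting write, the PC-only restart ends with
reg = 12 and the already-shifted value 6 in memory; the full-state restart
ends with reg = 6 and mem = 3, which is what the program asked for.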

The state gets to be writeable when a user instruction faults or (with the
68881) takes a mid-instruction interrupt, and the kernel then decides to
signal the process.  The signal handler runs like a user-level version of
a trap handler, and can return, which should make the stopped instruction
resume.  Signal handlers can themselves be interrupted by other signals, so
several sets of fault data can be outstanding at once.  The easiest way for the
kernel to handle this is to put the data on the user stack as part of calling
the signal handler.  (Implementing a parallel, growable stack accessible
only by the kernel to hold the fault data is going to be a big pain.)
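
A rough picture of the user-stack approach, as a C simulation rather than
kernel code.  The names, the fixed-size stack array, and the 92-byte bound on
the saved state are assumptions made for the sketch.  Each signal delivery
pushes one record, nested signals simply push more, and each handler return
pops one.

```c
#include <stddef.h>
#include <string.h>

#define USTACK_SIZE 1024   /* toy user stack, grows downward */
#define FAULT_MAX   92     /* assumed upper bound on saved state, in bytes */

typedef struct {
    unsigned char mem[USTACK_SIZE];
    size_t        sp;      /* index of the current top of stack */
} user_stack;

typedef struct {
    size_t        len;             /* bytes of data actually used */
    unsigned char data[FAULT_MAX]; /* opaque CPU fault state */
} fault_record;

/* Kernel side of calling a handler: copy the fault state onto the user
 * stack.  Returns 0 on success, -1 if the stack has no room. */
int deliver_signal(user_stack *us, const fault_record *fr)
{
    if (us->sp < sizeof *fr)
        return -1;
    us->sp -= sizeof *fr;
    memcpy(&us->mem[us->sp], fr, sizeof *fr);
    return 0;
}

/* Kernel side of handler return: read the record back.  It comes from
 * user-writable memory, so nothing in it can be trusted without checks. */
int signal_return(user_stack *us, fault_record *out)
{
    if (us->sp + sizeof *out > USTACK_SIZE)
        return -1;
    memcpy(out, &us->mem[us->sp], sizeof *out);
    us->sp += sizeof *out;
    if (out->len > FAULT_MAX)      /* the only check this sketch can make */
        return -1;
    return 0;
}
```

The handler can scribble on any byte of the record between the push and the
pop, and signal_return has no way to tell, which is the whole problem.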

So, how about it, Dave?  Can you give us #2 above?

--
				Bob Mabee @ Charles River Data Systems
				decvax!frog!rfm