Xref: utzoo comp.arch:10661 comp.misc:6566
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!ginosko!uunet!zephyr!tektronix!sequent!jjb
From: jjb@sequent.UUCP (Jeff Berkowitz)
Newsgroups: comp.arch,comp.misc
Subject: long instructions & page faults
Keywords: TRON page fault page boundary
Message-ID: <18766@sequent.UUCP>
Date: 15 Jul 89 19:49:48 GMT
References: <32424@apple.Apple.COM> <226@arnor.UUCP> <33015@apple.Apple.COM> <29418@ism780c.isc.com> <1449@mdbs.UUCP>
Reply-To: jjb@sequent.UUCP (Jeff Berkowitz)
Organization: Sequent Computer Systems, Inc
Lines: 46

In article <1449@mdbs.UUCP> wsmith@mdbs.UUCP (Bill Smith) writes:
>
>What will happen when [a long] instruction crosses a page boundary and
>a page fault occurs?   Is there some trickery that must be written into
>the operating system to avoid thrashing when the instruction is restarted.
>
[Finally a legitimate architecture topic!]

I've seen the paging code for several "well known" CISCs (VAX, 68010/020,
ns32000, i386).  I'm not aware of any "trickery" THAT THE CPU DESIGNERS
PLANNED FOR being required on any of these machines.  The machine can
indeed fault on a piece of an instruction during a return from interrupt
or page fault.  If memory is sufficiently tight, the machine might thrash
here.  This is no more or less serious than thrashing because e.g. the two
data pages required for a block copy can't be kept in the working set.
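
To make the comparison concrete, here is a minimal sketch of the page
arithmetic involved.  It models no particular CPU: the 4096-byte page
size and the names pages_touched and can_complete are assumptions made
for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size; real machines differ


def pages_touched(addr, length, page_size=PAGE_SIZE):
    """Return the page numbers covered by `length` (>= 1) bytes at `addr`."""
    first = addr // page_size
    last = (addr + length - 1) // page_size
    return list(range(first, last + 1))


def can_complete(instr_pages, data_pages, frames_available):
    """The instruction can finish only if every page it needs fits at once."""
    needed = set(instr_pages) | set(data_pages)
    return len(needed) <= frames_available


# A 6-byte instruction starting 2 bytes before a page boundary spans two pages.
instr = pages_touched(0x1FFE, 6)
# A block copy whose source and destination sit on two further pages.
data = [0x10, 0x20]
print(instr, can_complete(instr, data, 3), can_complete(instr, data, 4))
```

With only three frames available, restarting the instruction always
finds one of its four pages missing.  Whether the missing page holds
instruction bytes or copy data makes no difference to the thrashing.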

Page boundaries seem to be a fertile source of CPU hardware bugs, however.
On more than one of the processors in the list, achieving *reliable* page
fault response requires the operating system to contain bug workarounds
that can be described as "trickery" - dealing with registers that don't
contain the expected information if the fault occurred during the execution
of a certain opcode which happened to cross a page boundary, etc.

I imagine the page-fault-time (hardware) state save operations on heavily
pipelined CISCs like the 486 and 68040 must be extraordinarily complex.
Perhaps a designer out there can comment about this.  Current generation
RISCs achieve comparable (or better) performance with much lower
complexity in this area, yes?  Complexity always ends up costing money -
silicon that could have been used for functionality rather than sequencing
the state save operations, etc.

The now-defunct Culler 7 had a three stage instruction pipeline with
prebranching.  Return from interrupt (including page fault) involved
having the CPU refetch all the program memory containing all the
instructions that were in the pipeline at the time of the interrupt.
The machine was not interruptible during this time, so the kernel had
to guarantee that all the pages containing all this code were present
and wired down during the interrupt return.  Since the pipeline could
contain branches to branches, etc, this worked out to possibly four
separate pages.  As might be imagined, ensuring that all four pages
were wired down at interrupt return time caused significant
complexity in the OS.  I believe this translated into a large time cost
in restarting faults, but have no measurements to prove it.
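
The bookkeeping described above can be sketched roughly as follows.
This is not the Culler 7 kernel, which I can't reproduce here.  The
page size, the function names, and the use of plain sets for the
resident and wired pages are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size, not the Culler 7's actual value


def pages_to_wire(pipeline_addrs, page_size=PAGE_SIZE):
    """Distinct pages holding the instructions that must be refetched."""
    return sorted({addr // page_size for addr in pipeline_addrs})


def prepare_interrupt_return(pipeline_addrs, resident, wired):
    """Bring in and wire every needed page; return the number of page-ins."""
    page_ins = 0
    for page in pages_to_wire(pipeline_addrs):
        if page not in resident:
            resident.add(page)  # stands in for a real (slow) page-in
            page_ins += 1
        wired.add(page)         # pinned until the refetch has completed
    return page_ins


# Hypothetical addresses of the instructions in flight, including branch
# targets that land on other pages.
addrs = [0x0FFC, 0x1000, 0x5000, 0x9000]
resident, wired = {0}, set()
print(prepare_interrupt_return(addrs, resident, wired), sorted(wired))
```

Every page-in counted here has to finish before the return from
interrupt can even begin, which is where I would expect the time cost
to come from.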
-- 
Jeff Berkowitz N6QOM			uunet!sequent!jjb
Sequent Computer Systems		Custom Systems Group