Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!ncar!oddjob!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.arch
Subject: Re: Press Release: Intel announces 80960 architecture
Message-ID: <11026@mimsy.UUCP>
Date: 12 Apr 88 02:21:12 GMT
References: <3358@omepd> <10320@steinmetz.ge.com> <40@radix>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 55
Keywords: 80960, RISC, embedded control

I took a (very) quick peek (~30 min) through an 80960 architecture
manual that showed up in our department today.  It looks nice!  There
are 16 global registers, but one of them (g15 as I recall) is the frame
pointer, so you really get 15.  The KB stores four sets of the 16 local
registers, but you can only talk directly to the current 16, and three
of these are tied up (r0 = prev FP, r1 = prev IP?, r2 = ? forgot), so
you really have 13.  The other three sets of local registers cache the
last three stack frames; you can reach into an outer frame's registers
by executing a `flushreg' instruction to push them back out and then
diddling with the frame, but then you might as well use memory.  (Still
need flushreg sometimes.)
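
To make the caching behaviour concrete, here is a toy model of it in
Python.  It is only a sketch: the class and method names are mine, and
the spill policy (oldest cached frame goes to memory first) is an
assumption, not something I checked against the manual.

```python
from collections import deque

NUM_SETS = 4          # local register sets held on chip (as described above)
REGS_PER_SET = 16     # local registers per set


class LocalRegisterCache:
    """Toy model: the on-chip sets cache the most recent stack frames."""

    def __init__(self):
        self.on_chip = deque([[0] * REGS_PER_SET])  # rightmost = current frame
        self.memory = []                            # frames spilled to the stack

    def call(self):
        # A call needs a fresh set.  If every set is in use, the oldest
        # cached frame is written out to memory first (assumed policy).
        if len(self.on_chip) == NUM_SETS:
            self.memory.append(self.on_chip.popleft())
        self.on_chip.append([0] * REGS_PER_SET)

    def ret(self):
        # A return drops the current set.  If the caller's frame is no
        # longer on chip, it is reloaded from memory.
        self.on_chip.pop()
        if not self.on_chip:
            self.on_chip.append(self.memory.pop())

    def flushreg(self):
        # Push every cached frame except the current one out to memory,
        # so an outer frame can be examined or patched there.
        while len(self.on_chip) > 1:
            self.memory.append(self.on_chip.popleft())
```

In this model the first three nested calls cost no memory traffic at
all; only the fourth and later ones spill, which is the point of
keeping the extra sets around.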

There are no goofy special registers beyond the usual PSL-type-thing.
IO space access is a bit muddy to me (but I skipped the section on
it).  Standard User/Supervisor separation.  256 interrupt vectors, but
8 are useless (ipl 0 vectors interrupt when you are below ipl 0, i.e.,
never) and hence suppressed, and a bunch of ipl 31 vectors are
`reserved', so you really have about 240 vectors.
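
The vector arithmetic can be sketched as below.  I am assuming 32
priority levels (ipl 0 through ipl 31) with the 256 vectors split
evenly and contiguously among them, which gives 8 per level and matches
the 8 dead vectors at ipl 0.  The function names are made up, and the
exact count of reserved ipl 31 vectors is left out because I do not
have it.

```python
NUM_VECTORS = 256
NUM_LEVELS = 32                                 # ipl 0 .. ipl 31 (assumed)
VECTORS_PER_LEVEL = NUM_VECTORS // NUM_LEVELS   # 8 vectors per level


def ipl_of(vector):
    """Priority level a vector belongs to (assumed contiguous grouping)."""
    return vector // VECTORS_PER_LEVEL


def can_interrupt(vector, current_ipl):
    """A vector is delivered only if its level is above the current one."""
    return ipl_of(vector) > current_ipl


# Vectors that can never fire: nothing ever runs below ipl 0.
dead = [v for v in range(NUM_VECTORS) if not can_interrupt(v, 0)]

# What is left before subtracting the reserved ipl 31 vectors.
usable_before_reserved = NUM_VECTORS - len(dead)
```

That leaves 248 vectors before the reserved ones come off the top.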

There is hardware `scoreboarding' (interlocking) on the registers, so
you can ignore the pipelining, although naturally code goes faster
if you reorder instructions to avoid the stalls.
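
A small simulation shows why reordering pays off even though the
interlocks keep naive code correct.  This is a sketch of a generic
in-order, single-issue pipeline with a register scoreboard, not of the
80960's actual timing: the 3 cycle load and 1 cycle add latencies are
invented for illustration.

```python
def run(program):
    """Return the cycle at which the last result becomes available.

    program is a list of (dest, sources, latency) tuples, issued in order.
    """
    ready_at = {}   # scoreboard: register -> cycle its value is available
    cycle = 0       # earliest cycle in which the next instruction can issue
    finish = 0
    for dest, sources, latency in program:
        # Interlock: stall until no register we touch is still pending.
        for reg in list(sources) + [dest]:
            cycle = max(cycle, ready_at.get(reg, 0))
        ready_at[dest] = cycle + latency
        finish = max(finish, ready_at[dest])
        cycle += 1  # at most one instruction issues per cycle
    return finish


LOAD, ADD = 3, 1    # hypothetical latencies in cycles

# Each load is followed immediately by the add that needs its result.
naive = [
    ("r4", (), LOAD),
    ("r5", ("r4",), ADD),
    ("r6", (), LOAD),
    ("r7", ("r6",), ADD),
]

# Same work, but both loads are started before either result is used.
reordered = [
    ("r4", (), LOAD),
    ("r6", (), LOAD),
    ("r5", ("r4",), ADD),
    ("r7", ("r6",), ADD),
]
```

With these made-up latencies, run(naive) takes 8 cycles and
run(reordered) takes 5; both compute the same thing.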

Address space is 32 bits, but branch space is smaller.  (There is an
`anywhere' branch but most are 24 bit offsets.)  All instructions are
32 bits, so if the offset counts instructions rather than bytes it
really can cover 2^26 bytes (I forget whether it does, but it would
seem silly not to).  Instruction data types are byte,
short (word=16 bit), long (32 bit), `tripleword' (96 bit; it holds the
80 bit extended real), and `quadword' (128 bit), with signed and
unsigned (`ordinal') variants for everything <=32 bits.  Signed store
will trap if the value does not fit, e.g.,

	ldsb	addr,r3		# fetch signed byte & extend to long -128..127
	stsb	addr,r3		# (r3,addr?) store it back, no trap
	addo	r3,$256,r3	# add ordinal: now it is in 128..383
	stsb	addr,r3		# trap
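
The same sequence can be modelled in Python to check the ranges.  This
is a sketch only: the function names just reuse the mnemonics from the
example, the exception class is a made-up stand-in for whatever fault
the hardware raises, and registers are treated as plain 32 bit
patterns.

```python
MASK32 = 0xFFFFFFFF


class IntegerOverflowTrap(Exception):
    """Stand-in for the fault taken on an out-of-range signed store."""


def as_signed32(reg):
    """Interpret a 32 bit register pattern as a signed integer."""
    reg &= MASK32
    return reg - (1 << 32) if reg & 0x80000000 else reg


def ldsb(byte):
    """Load signed byte: sign-extend 8 bits to a 32 bit register pattern."""
    byte &= 0xFF
    value = byte - 256 if byte & 0x80 else byte
    return value & MASK32


def stsb(reg):
    """Store signed byte: return the 8 bit pattern, or trap if it won't fit."""
    value = as_signed32(reg)
    if not -128 <= value <= 127:
        raise IntegerOverflowTrap(value)
    return value & 0xFF


def addo(a, b):
    """Ordinal (unsigned) add, wrapping at 32 bits."""
    return (a + b) & MASK32
```

Whatever byte you start with, stsb(ldsb(byte)) hands it back unchanged,
while adding 256 moves the register into 128..383, which no signed byte
can hold, so the second store always traps.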

As for faults, some are `indeterminate': they leave an inconsistent,
and hence not restartable, state behind.  Sequencing and restartability
can be forced case by case (there is a `wait for pending results'
instruction) or overall (set the No Ind. Fault flag in the PSL).  The
usual set of faults turns up, although integer divide by zero is
separate from F.P. divide by zero (perhaps because FP is
architecturally optional).

FP is IEEE of course, with `plain' 32 bit real, 64 bit double, and 80
bit `extended' precisions; there are instructions galore for (e.g.)
exp, sin, cos, tan.

Best of all :-) the assembler syntax in the examples in the manual
is Vax Unix style.  .word, .align, .space directives.  No more silly
ALL CAPS STUFF!  Hooray!  :-)

[there, perhaps this will persuade mcg to elaborate :-) ]
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris