Path: utzoo!attcan!uunet!husc6!rutgers!apple!amdcad!weitek!dms!albaugh
From: albaugh@dms.UUCP (Mike Albaugh)
Newsgroups: comp.arch
Subject: OISC was Re: ZISC computers
Message-ID: <579@dms.UUCP>
Date: 17 Nov 88 21:13:06 GMT
References: <3562@hubcap.UUCP>
Organization: Atari Games Inc., Milpitas, CA
Lines: 187


	I wanted to point out that a machine with no opcode has not
zero, but one instruction(s). (2**0 = 1)

	As several people have noted, the idea of using addresses to
"cheat" (refer to function units) is well represented in the literature.
A recent article in ACM Computer Architecture News even has the audacity
to call this approach "The Ultimate RISC". I might be persuaded that it
is the "ultimate _useful_ RISC", but machines such as Van der Poel's
(article <3562@hubcap.UUCP>, by mark@hubcap.UUCP (Mark Smotherman) )
are arguably much more RISCy.

	A few years ago (Before I was personally on UseNet), I had a
friend email (Maybe post?) an earlier version of the following, In
which I present my own "ultimate" RISC. Since mine also has one (or two,
but not several) "magic" location, it may not be truly "ultimate". I
feel that it is closer than the various two and three address variations.
I originally intended to actually wire-wrap the first version, and may
yet build the current version, but kids and a house have changed my
priorities. I do have a C simulator, and some useful macros.

By the way, the GRI-??? mentioned in <20743@apple.Apple.COM> by 
baum@apple.com was manufactured by George Risk Industries, so it may
have been ahead of its time in nomenclature too :-)

And now, a re-run:

Random thoughts on the Ultimate Reduced Instruction Set Computer

	One possible implementation is a machine which has the instruction:
Reverse Subtract and skip on Borrow. That is, every instruction has no opcode
but does have an address. The accumulator is subtracted from that address and
the result left both there and in the accumulator. If the subtraction caused
a borrow (i.e. no carry out of the ALU), the subsequent instruction is skipped.
The only handwaving required is for the PC to live in a fixed location
(say 0, ala PDP-5). A very naive control unit which does one instruction per
6 clock cycles can be implemented, a slightly less naive one can do it in
5 (sometimes 4), and shadowing the PC can get it to 3 (sometimes 2).

	Some macros for the machine, to implement more common instructions are:

	macro CLR p1
	p1			; p1,acc <- p1-acc (may skip)
	p1			; p1,acc <- 0 (cannot skip)
	p1			; p1,acc <- 0 (cannot skip)
	ENDM
;	The third instruction is not redundant, since we have no way of knowing
; whether the first will skip.

	macro LOAD p1 p2
	CLR p2			; We need to provide a sacrificial location
	p1			; acc,p1 = p1-0 (cannot skip)
	ENDM

	macro STORE p1 z	; z must be a pre-zeroed location
	z			; acc,z = -acc (will skip unless acc=0)
	z			; acc,z = 0 (a nop in this case)
	CLR p1
	z			; acc = z (-acc from entry, cannot skip)
	p1			; acc,p1 = entry acc (will skip unless 0)
	z			; another NOP to fill the skip safely.
	ENDM

;	The necessity of providing a sacrificial location for LOAD and
; a guaranteed zero (which is destroyed) for STORE imply that the
; "one-address" machine is an inappropriate paradigm. The "two-address"
; machine yields a substantial improvement:

	macro MOV p1 p2 z
	CLR p2
	z
	z			; Also clears z, but shorter 'cuz we can't skip
	p1			; acc,p1 = p1
	z			; acc,z = -p1		(will skip unless 0)
	z			; in case p1 was 0
	p2			; acc,p2 = p1		(will skip unless 0)
	z			; in case p1 was 0
	ENDM

	macro SUB p1 p2 p3	; (p3=p1-p2), (p2=p2-p1)
	CLR p3
	p1			; acc,p1 = p1 (cannot skip)
	p2			; acc,p2 = p2-p1 (may skip)
	p3			; acc,p3 = p1-p2 (will skip unless p1=p2)
	p3			; acc,p3 = p1-p2 (cannot skip)
	ENDM

The analysis of that one is a bit tricky. If p2 is less than p1, the p2
instruction skips to the second p3, which cannot skip again, since 
(p2<p1) IFF (p1>p2). If p2 is greater than p1, the p2 will not skip, but the
first p3 will, by similar reasoning. If p2=p1, all instructions are executed
and p2 = p3 = 0.

	macro JMP l
	CLR	Z
	lit	next-l		;acc=(next-l)
next	0			;pc (which is in 0) = next-(next-l) = l
	ENDM

Actually, the value of the literal needs to be adjusted for the skip which
will occur if l is at a lower address than next, but I didn't want to clutter
the psuedo-macro with conditionals.

	macro ZAJMP l z		; if acc=0		if acc<>0
	z			; 0 (noskip)		-acc (must skip)
	lit	l-next		; l-next (noskip)	----
	z			; next-l (mustskip)	0 (noskip)
	z			; -----			0 (nop)
next	0			; JMP l			nop
	ENDM

	macro ZJMP t l z	;Jump to l if t=0
	LOAD	t z
	ZAJMP	l z
	ENDM

    The "lit" function is similar to the # (immediate) operator on such
machines as have an immediate mode. If you don't have such a mode (many
early machines didn't), you typically create a "literal pool", for all
the constants you are going to use. Such constants are shared by all the
instructions that need to refer to that particular bit pattern. Since a
careless use could alter the pattern, this can be a source of endless
"amusement" during debuging. The creation of a literal pool on a macro
assembler that doesn't have such a feature (such as DEC MACRO-11, because it
wouldn't use it in its "native tongue") is left as an exercise to the reader,
for now. I know how _I_ do it, but will leave it as an exercise to the
reader for any particular machine.

	Robert Murdoch (of Intel, Hi Doch :-)) suggested a refinement of 
Lyle Rains (of Atari Games, my boss's boss) idea that byte addressability
of memory would make character operations easier. Doch's idea was to go all
the way and make the memory bit-addressable. A whole word is still
referenced and modified in each cycle, but that word can start on any bit
boundary. Since the extra 4 bits would take a real bite out of the address
space of a 16-bit machine, this makes most sense with a 32 bit machine. The
complexities of the hardware get intense (if we want to go very fast), but
right shifts and field extraction become trivial.

	Interrupts can be handled by forcing references to 0 to go to "1"
instead. The quotes around "1" are because a bit addressable machine would
have its second word at address 16 (or 32). Anyway, just oring the interrupt
control flag with the appropriate address bit on a reference to 0 will do the
trick. The following software is needed to save and restore the "processor
status". accs and accs2 are two pre-zeroed locations, to recieve the contents
of the acc, whether or not the instruction before the interrupt caused a skip.
accr and accr2 pre-set by the IRQ routine, and the return is before the entry
so a simple delay on the reset of the IRQ flag accomplishes return
There are three possible input states to re-construct:
			 acc==0, !skip;	acc<>0, !skip;	acc<>0, skip

IRQRTN:	clr	IFRESET	; Flag hardware, action delayed 2 cycles past last ref.
	accr			0	0		-inacc
	accr2			0 	inacc 		0->inacc skip

; These get set up by:
INTEN:	accs		;	0	-inacc s	0
	accs2		;	0	---		-inacc s	
	accr		;	???	???		---
	accr		;	0	--?		???
	accr		;	0	0		--?
	accr		;	0	0		0
	accr2		;	???	???		???
	accr2		;	0	0		0
	accs		;	0	-inacc		0
	accr2		;	0	inacc s		0
	accs2		;	0	---		-innac
	accs		;	0	0		inacc s
	accr		;	0	0		---
	accr		;	0	0		-inacc s
	accs		;	0	0		---
	accs		;	0	0		0
	accs2		;	0	0		innac
	accs2		;	0	0		0
; At this point, accs and accs2 are pre-zeroed for next IRQ, accr and accr2
; are set up for return. Do body of IRQ routine...

	jmp	IRQRTN	; Jump up to return point

; for 28 cycles interrupt overhead...
; The clear of accs2 could be merged with the jump to IRQRTN to save two cycles

Isn't Science wonderful...

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!turtlevax!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035		voice: (408)434-1709
| The opinions expressed are my own (Boy, are they ever)