Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site redwood.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!godot!harvard!seismo!hao!hplabs!hpda!fortune!redwood!rpw3
From: rpw3@redwood.UUCP (Rob Warnock)
Newsgroups: net.arch,net.micro.16k,net.micro.68k
Subject: Re: Using top eight bits of pc on 68000 (LONG reply)
Message-ID: <191@redwood.UUCP>
Date: Mon, 18-Mar-85 08:53:44 EST
Article-I.D.: redwood.191
Posted: Mon Mar 18 08:53:44 1985
Date-Received: Thu, 21-Mar-85 03:29:39 EST
References: <306@calgary.UUCP>
Organization: [Consultant], Foster City, CA
Lines: 156
Xref: watmath net.arch:1009 net.micro.16k:292 net.micro.68k:684

Radford Neal proposed a specific limited use of the high bits of a 68000's
address in interrupt vectors to identify trap vector numbers (without using
the otherwise necessary table of interrupt "stub" routines, one per trap).

He then was flamed for this use, and replied:
+---------------
| ... I pointed out that ... [this technique] ...  produces code that
| is MORE portable to a 68020 than avoiding this does...
| I still haven't received any answer from those who treat this as a religious
| rather than engineering issue.
+---------------

O.k., I wouldn't say I treat this as a "religious" issue, but here goes...


I am one of those who reacted QUITE strongly against (general) use of the
high bits early in the discussion. I think Neal's particular technique
might be a valid use of the high bits of a 68000, under certain circumstances.
(I am somewhat embarrassed that I didn't think of it myself, as I already
knew of a similar technique used on the PDP-11, in which device "unit numbers"
were encoded in the condition-codes of an interrupt PSW.) However, I think
it loses (VERY slightly) on the strict technical merits when compared to a
commonly-used alternative.

There are really two separate levels to be examined, however, and we shouldn't
confuse them. "PART ONE" is the "safety" of low-level machine-dependent code of
this general type; "PART TWO" is the engineering merit of this specific proposal
as compared to other possible ways of accomplishing the same task.

PART ONE:

In the first area, Neal's proposal fares well:

1. It is NOT spread through user applications programs, but is isolated
   in one place -- the interrupt "head end" -- and can be changed without
   invalidating user code. (In particular, it does NOT cause user files
   to be generated with these bits on, which would cause trouble later
   when those programs or data files are moved to another machine.)

2. It can help to emulate the 68010 interrupt stack handling. Thus the actual
   high-level device interrupt routines can be made MORE machine independent.

3. Interrupt "head end" code is NOT generally portable across various
   similar machines, thus this code probably has to be re-written anyway
   for each new environment.

Parnas has defined a "module" to be that portion of a system which hides a
design decision from the rest of the system, such that that decision can be
changed without affecting anything outside the module. The "interrupt head"
code on most systems is principally concerned with hiding the ugly low-level
details of interrupts from the rest of the system, and thus can (if designed
properly) qualify as a "module" in the Parnas sense. The use of a specific
machine-dependent optimization inside such a module is an engineering decision,
which must include development cost, performance, maintenance costs, etc.

In environments in which there is little low-level supervision of the actual
development process and where short-term concern for "get it done" is more
important than long-range quality, the above hack is UNSAFE, for its use
could not be encapsulated in the "interrupt head" -- neither technically nor
organizationally. Other portions of the system would begin using the high bits,
and there would be no checks-and-balances to protect against a later "disaster"
(of the "S/360" sort noted in several postings on this topic).

In an environment in which such engineering policy decisions are well
managed so as to encapsulate "permission to hack" within the organization
(so that "organizational modules" do not compromise "design modules"),
I think the above optimization (use of the high bits) could be "safe".

PART TWO:

However, moving to the second level of the discussion, the engineering
merit of this specific low-level design, I have to say I don't agree
with Mr. Neal (despite the fact that it IS a neat hack). The commonly
used alternative method is to have each vector point to a separate
"jsr" instruction, which in turn jumps to the actual interrupt code.
(I have used this method before in a UNIX port to a 68000.) While it
appears ugly to those who wish Motorola had given us 68010-style
interrupts in the first place, using the "jsr" method on all 256
vectors requires only 2048 bytes (even if each "jsr" uses long
addressing and is padded to 8 bytes to speed the computation of the
vector number), an amount which doesn't cost very much (less than
$0.50 worth of 64K dynamic RAMs, at today's prices). Thus there is
no overwhelming cost savings.
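
A rough sketch of the arithmetic behind the "jsr" stub table, written in
Python rather than 68000 assembler so it can be run anywhere. The table
base address and all function names are assumptions made up for this
illustration; only the 256 vectors, the 8-byte padded stubs, and the
2048-byte total come from the text above.

```python
# Illustrative model of a "jsr" stub table; all names here are hypothetical.
NUM_VECTORS = 256        # one stub per 68000 exception vector
STUB_SIZE = 8            # long-absolute "jsr" padded out to 8 bytes
JSR_LONG_LENGTH = 6      # 2-byte opcode word + 4-byte absolute address
STUB_BASE = 0x001000     # ASSUMED load address of the stub table


def stub_table_bytes(stub_size=STUB_SIZE):
    """Total memory consumed by one stub per vector."""
    return NUM_VECTORS * stub_size


def stub_address(vector):
    """Address that vector table entry `vector` would point at."""
    return STUB_BASE + vector * STUB_SIZE


def return_address(vector):
    """Return address pushed by the "jsr" in stub number `vector`."""
    return stub_address(vector) + JSR_LONG_LENGTH


def vector_from_return_address(ret):
    """Recover the vector number; dividing by 8 discards the +6 offset."""
    return (ret - STUB_BASE) // STUB_SIZE


print(stub_table_bytes())                               # 2048
print(vector_from_return_address(return_address(64)))   # 64
```

The padding to 8 bytes is what turns the recovery step into a single
subtract and shift, rather than a divide by 6.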

Neal's scheme offers a speed advantage if the vector numbers are used
only to identify "wild" interrupts and not for "normal" interrupts
(that is, each device vectors to a separate location, not to a common
interrupt routine).

However, in the more usual case where heavy use is made of the actual
vector number (such as a common interrupt routine which computes which
unit of a multi-unit controller interrupted), I claim that Neal's
method is slower. Extracting the vector number from the high bits is
(slightly) slower than extracting it from the low bits (where it is found
in the "jsr" case).
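
The two extractions being compared can be sketched as follows. This
shows only the bit arithmetic, not 68000 cycle counts. The handler
address, the table address, and the assumption that the stub table sits
on a 2048-byte boundary are all invented for the example.

```python
# Hypothetical addresses; only the bit positions matter here.
ADDRESS_MASK = 0x00FFFFFF   # the 68000 drives only 24 address lines
HANDLER = 0x002000          # ASSUMED address of the common handler
STUB_BASE = 0x000800        # ASSUMED stub table address, 2048-byte aligned
STUB_SIZE = 8
JSR_LONG_LENGTH = 6


def high_pc_vector_entry(vector):
    """Vector table entry for the "high-PC" scheme: number in bits 31-24."""
    return (vector << 24) | HANDLER


def physical_address(addr):
    """What the 24-bit bus actually sees: the high byte is ignored."""
    return addr & ADDRESS_MASK


def vector_from_high_pc(pc):
    """High-bits extraction: bring the top byte down 24 places."""
    return (pc >> 24) & 0xFF


def jsr_return_address(vector):
    """Return address pushed by the "jsr" in stub number `vector`."""
    return STUB_BASE + vector * STUB_SIZE + JSR_LONG_LENGTH


def vector_from_jsr_return(ret):
    """Low-bits extraction: a 3-place shift and a byte mask."""
    return (ret >> 3) & 0xFF


print(vector_from_high_pc(high_pc_vector_entry(64)))     # 64
print(vector_from_jsr_return(jsr_return_address(64)))    # 64
```

Both routes yield the same number; the difference is a 24-place shift
against a 3-place one.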

Further, the "jsr" method leaves the vector number on the stack, where
it is safe from any accidental alteration. Neal's method leaves it in
the high bits of the PC, so the PC value must be saved first thing
before any jumps or branches are executed (which would clear the high
bits and lose the vector information). This could require extra code at
the beginning of each interrupt routine. (While this could be done with
an immediate "jsr" to a common routine which extracts and returns the
vector number, the call of the common routine wipes out any speed
advantage the "high-PC" method might have had over the "jsr" method.)

I would want to make sure that the high-PC bits WERE cleared before
any "outside" code got called, to avoid propagating those bits outside
the interrupt head. This might cost an extra instruction (especially
if the interrupt head "module" was in fact a macro call placed at the
beginning of each interrupt routine).
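
The ordering constraint (save the vector number first, then strip the
high bits before anything else sees the address) might look like this
in outline. The function name is hypothetical, and a real interrupt
head would of course be a few assembler instructions, not Python.

```python
ADDRESS_MASK = 0x00FFFFFF   # low 24 bits: the part the 68000 bus uses


def split_high_pc(entry_pc):
    """Return (vector, clean_address) for a "high-PC" entry address.

    The vector number is captured BEFORE masking; once the high byte
    is cleared the information is gone for good.
    """
    vector = (entry_pc >> 24) & 0xFF
    clean_address = entry_pc & ADDRESS_MASK
    return vector, clean_address


vector, clean = split_high_pc(0x2A002000)
print(vector, hex(clean))   # 42 0x2000
```

Only `clean_address` should ever be handed to code outside the
interrupt head.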

(Just so I am not misunderstood -- in a tightly coded "interrupt head"
design, the differences between the two methods will be very small,
a few microseconds at most.)

CONCLUSION:

My opinion is that in the particular case of the "interrupt head",
the use of the high address bits may be justified on extremely
small cost-sensitive systems with tight constraints on memory size or
extremely tight interrupt latency requirements (such as a ROM-based
embedded controller), especially since the encapsulation can be made
quite clean.

	NOTE: This opinion does NOT necessarily extend to other uses of
	the high address bits of a 68000. In particular, I feel that using
	those bits to encode the types of address pointers in user code
	is a BIG mistake. (Time will tell...)

But for general-purpose systems the alternative "jsr" method is so close to
the "high-PC" method in cost and performance (and may actually be BETTER
in performance) that the use of this "neat hack" is not worth the risk that
the use of the high bits will spread to other projects and cause problems.
Therefore, I don't think I would ever use the "high-PC" method.

p.s. With "jsr" stubs aligned on 8-byte boundaries you can use the
"wasted" two bytes (left after a 6-byte long "jsr") for
other things as well: (1) each "jsr" can be followed by an "rte", so that
no "rte" need appear in the "C" code; or (2) each "jsr" can carry with it
two bytes of "parameters", such as a table index of the device-data-block
associated with each interrupt; or (3) instead of doing address arithmetic
on the return address from the "jsr", you can put the vector number following
the "jsr", and pick it up by indirecting through the "return address" which
points to it. (Etc., &c. Lots of neat hacks... ;-} )
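
Option (3) can be modelled with a byte array standing in for memory.
This is a sketch under assumptions: the handler address and function
names are invented, and the stub layout (long "jsr" followed by a
16-bit vector number) is one plausible reading of the idea, not a
listing from any real system.

```python
import struct

STUB_SIZE = 8
JSR_LONG_LENGTH = 6
JSR_LONG_OPCODE = 0x4EB9    # 68000 "jsr" with absolute long addressing
HANDLER = 0x002000          # ASSUMED address of the common handler


def build_stub_table(handler=HANDLER, count=256):
    """Lay out stubs: opcode word, 32-bit target, then the vector number."""
    table = bytearray()
    for vector in range(count):
        # Big-endian, as on the 68000: 2 + 4 + 2 = 8 bytes per stub.
        table += struct.pack(">HIH", JSR_LONG_OPCODE, handler, vector)
    return table


def return_offset(vector):
    """Offset (from the table base) of the return address for a stub."""
    return vector * STUB_SIZE + JSR_LONG_LENGTH


def vector_at_return_address(table, ret_offset):
    """Read the 16-bit word that the pushed return address points to."""
    (vector,) = struct.unpack_from(">H", table, ret_offset)
    return vector


table = build_stub_table()
print(len(table))                                          # 2048
print(vector_at_return_address(table, return_offset(64)))  # 64
```

With this layout no address arithmetic is needed at all; the handler
just fetches one word through the return address.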

p.p.s. On "small" systems, the "jsr" table need take only 1024 bytes, as
you can use short (16-bit) absolute addressing. Also, the short "jsr" is
slightly faster than when full 32-bit addresses are used.
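
A small sketch of the short-addressing variant, with hypothetical
names. It assumes unpadded 4-byte stubs, in which case the pushed
return address is the start of the NEXT stub, so the recovered index
has to be adjusted by one. It also assumes the usual 68000 rule that a
16-bit absolute short address is sign-extended to 32 bits.

```python
NUM_VECTORS = 256
SHORT_STUB_SIZE = 4   # 2-byte opcode word + 2-byte absolute short address


def short_table_bytes():
    """Size of a table of unpadded short-absolute "jsr" stubs."""
    return NUM_VECTORS * SHORT_STUB_SIZE


def short_return_address(vector, base):
    """The "jsr" fills the whole stub, so it returns to the next stub."""
    return base + (vector + 1) * SHORT_STUB_SIZE


def vector_from_short_return(ret, base):
    """Undo the one-stub overshoot when recovering the vector number."""
    return (ret - base) // SHORT_STUB_SIZE - 1


def reachable_by_short(addr):
    """True if a sign-extended 16-bit address can name the 32-bit `addr`."""
    return addr <= 0x7FFF or addr >= 0xFFFF8000


print(short_table_bytes())                                            # 1024
print(vector_from_short_return(short_return_address(64, 0x400), 0x400))  # 64
```

The reachability check is what makes this a "small" system option: the
common handler has to live in the bottom or top 32K of the address
space.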


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404