Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site redwood.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!godot!harvard!seismo!hao!hplabs!hpda!fortune!redwood!rpw3
From: rpw3@redwood.UUCP (Rob Warnock)
Newsgroups: net.arch,net.micro.16k,net.micro.68k
Subject: Re: Using top eight bits of pc on 68000 (LONG reply)
Message-ID: <191@redwood.UUCP>
Date: Mon, 18-Mar-85 08:53:44 EST
Article-I.D.: redwood.191
Posted: Mon Mar 18 08:53:44 1985
Date-Received: Thu, 21-Mar-85 03:29:39 EST
References: <306@calgary.UUCP>
Organization: [Consultant], Foster City, CA
Lines: 156
Xref: watmath net.arch:1009 net.micro.16k:292 net.micro.68k:684

Radford Neal proposed a specific limited use of the high bits of a 68000's address in interrupt vectors to identify trap vector numbers (without using the otherwise necessary table of interrupt "stub" routines, one per trap). He then was flamed for this use, and replied:

+---------------
| ... I pointed out that ... [this technique] ... produces code that
| is MORE portable to a 68020 than avoiding this does...
| I still haven't received any answer from those who treat this as a religious
| rather than engineering issue.
+---------------

O.k., I wouldn't say I treat this as a "religious" issue, but here goes...

I am one of those who reacted QUITE strongly against (general) use of the high bits early in the discussion. I consider Neal's particular technique might be a valid use of the high bits of a 68000, under certain circumstances. (I am somewhat embarrassed that I didn't think of it myself, as I already knew of a similar technique used on the PDP-11, in which device "unit numbers" were encoded in the condition-codes of an interrupt PSW.) However, I think it loses (VERY slightly) on the strict technical merits when compared to a commonly-used alternative.

There are really two separate levels to be examined, however, and we shouldn't confuse them.
"PART ONE" is the "safety" of low-level machine-dependent code of this general type; "PART TWO" is the engineering merit of this specific proposal as compared to other possible ways of accomplishing the same task. PART ONE: In the first area, Neal's proposal fares well: 1. It is NOT spread through user applications programs, but is isolated in one place -- the interrupt "head end" -- and can be changed without invalidating user code. (In particular, it does NOT cause user files to be generated with these bits on which cause trouble later on when those programs or data files are moved to another machine). 2. It can help to emulate the 68010 interrupt stack handling. Thus the actual high-level device interrupt routines can be made MORE machine independent. 3. Interrupt "head end" code is NOT generally portable across various similar machines, thus this code probably has to be re-written anyway for each new environment. Parnas has defined a "module" to be that portion of a system which hides a design decision from the rest of the system, such that that decision can be changed without affecting anything outside the module. The "interrupt head" code on most systems is principly concerned with hiding the ugly low-level details of interrupts from the rest of the system, and thus can (if designed properly) qualify as a "module" in the Parnas sense. The use of a specific machine-dependent optimization inside such a module is an engineering decision, which must include development cost, performance, maintenance costs, etc. In environments in which there is little low-level supervision of the actual development process and where short-term concern for "get it done" is more important than long-range quality, the above hack is UNSAFE, for its use could not be encapsulated in the "interrupt head" -- neither technically nor organizationally. 
Other portions of the system would begin using the high bits, and there would be no checks-and-balances to protect against a later "disaster" (of the "S/360" sort noted in several postings on this topic).

In an environment in which such engineering policy decisions are well managed so as to encapsulate "permission to hack" within the organization (so that "organizational modules" do not compromise "design modules"), I consider the above optimization (use of the high bits) could be "safe".

PART TWO:

However, moving to the second level of the discussion, the engineering merit of this specific low-level design, I have to say I don't agree with Mr. Neal (despite the fact that it IS a neat hack).

The commonly used alternative method is to have each vector point to a separate "jsr" instruction, which in turn jumps to the actual interrupt code. (I have used this method before in a UNIX port to a 68000.) While it appears ugly to those who wish Motorola had given us 68010-style interrupts in the first place, using the "jsr" method on all 256 vectors requires only 2048 bytes (even if each "jsr" uses long addressing and is padded to 8 bytes to speed the computation of the vector number), an amount which doesn't cost very much (less than $0.50 worth of 64K dynamic RAMs, at today's prices). Thus there is no overwhelming cost savings.

Neal's scheme offers a speed advantage if the only use of the vector numbers is to identify "wild" interrupts, and they are not used for "normal" interrupts (each device vectors to a separate location, not to a common interrupt routine). However, in the more usual case where heavy use is made of the actual vector number (such as a common interrupt routine which computes which unit of a multi-unit controller interrupted), I claim that Neal's method is slower. Extracting the vector number from the high bits is (slightly) slower than extracting it from the low bits (where it is found in the "jsr" case).
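To make the two extractions concrete, here is a minimal C sketch of the address arithmetic only. It is an illustration under stated assumptions, not code from either scheme: a real interrupt head would be a few 68000 instructions, and the table base, function names, and handler addresses below are all hypothetical. It assumes the stub table is aligned on a 2048-byte boundary, so that the vector number sits in bits 3..10 of the "jsr" return address.

```c
#include <assert.h>
#include <stdint.h>

/* All constants and names here are illustrative assumptions. */
#define STUB_TABLE_BASE 0x00001000u  /* hypothetical base, a multiple of 2048 */
#define STUB_SIZE       8u           /* long "jsr" (6 bytes) padded to 8 bytes */
#define JSR_LONG_SIZE   6u           /* opcode word + 32-bit absolute address */
#define ADDR_MASK       0x00FFFFFFu  /* the 24 address bits a 68000 drives */

/* Return address that the "jsr" in stub number vec would push. */
uint32_t stub_return_address(unsigned vec)
{
    assert(vec < 256u);
    return STUB_TABLE_BASE + vec * STUB_SIZE + JSR_LONG_SIZE;
}

/* "jsr" method: the vector number is in the low bits of the return
 * address; a shift by 3 (stub size 8) and a byte mask recover it. */
unsigned vector_from_return_address(uint32_t ret)
{
    return (unsigned)(ret >> 3) & 0xFFu;
}

/* "high-PC" method: the vector table entry carries the vector number
 * in the top eight bits, which the 68000 ignores when fetching. */
uint32_t tagged_vector_entry(uint32_t handler, unsigned vec)
{
    assert(vec < 256u);
    assert((handler & ~ADDR_MASK) == 0u);
    return ((uint32_t)vec << 24) | handler;
}

/* Recover the vector number from the top byte of the saved PC. */
unsigned vector_from_pc(uint32_t pc)
{
    return (unsigned)(pc >> 24);
}

/* Strip the tag so the high bits never leave the interrupt head. */
uint32_t clean_pc(uint32_t pc)
{
    return pc & ADDR_MASK;
}
```

For example, `vector_from_return_address(stub_return_address(64))` and `vector_from_pc(tagged_vector_entry(0x00002468u, 64))` both give 64. The sketch says nothing about which is faster on real hardware; it only shows that both encodings carry the same information.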
Further, the "jsr" method leaves the vector number on the stack, where it is safe from any accidental alteration. Neal's method leaves it in the high bits of the PC, so the PC value must be saved first thing before any jumps or branches are executed (which would clear the high bits and lose the vector information). This could require extra code at the beginning of each interrupt routine. (While this could be done with an immediate "jsr" to a common routine which extracts and returns the vector number, the call of the common routine wipes out any speed advantage the "high-PC" method might have had over the "jsr" method.)

I would want to make sure that the high-PC bits WERE cleared before any "outside" code got called, to avoid propagating those bits outside the interrupt head. This might cost an extra instruction (especially if the interrupt head "module" was in fact a macro call placed at the beginning of each interrupt routine).

(Just so I am not misunderstood -- in a tightly coded "interrupt head" design, the differences between the two methods will be very small, a few microseconds at most.)

CONCLUSION:

My opinion is that in the particular case of the "interrupt head", the use of the high address bits may be justified on extremely small cost-sensitive systems with tight constraints on memory size or extremely tight interrupt latency requirements (such as a ROM-based embedded controller), especially since the encapsulation can be made quite clean.

NOTE: This opinion does NOT necessarily extend to other uses of the high address bits of a 68000. In particular, I feel that using those bits to encode the types of address pointers in user code is a BIG mistake. (Time will tell...)

But for general-purpose systems the alternative "jsr" method is so close to the "high-PC" method in cost and performance (and may actually be BETTER in performance) that the use of this "neat hack" is not worth the risk that the use of the high bits will spread to other projects and cause problems. Therefore, I don't think I would ever use the "high-PC" method.

p.s. With octade-aligned "jsr" stubs you can use the "wasted" two bytes for other things as well:

(1) each "jsr" can be followed by an "rte", so that no "rte" need appear in the "C" code; or

(2) each "jsr" can carry with it two bytes of "parameters", such as a table index of the device-data-block associated with each interrupt; or

(3) instead of doing address arithmetic on the return address from the "jsr", you can put the vector number following the "jsr", and pick it up by indirecting through the "return address" which points to it.

(Etc., &c. Lots of neat hacks... ;-} )

p.p.s. On "small" systems, the "jsr" table need take only 1024 bytes, as you can use short (16-bit) absolute addressing. Also, the short "jsr" is slightly faster than when full 32-bit addresses are used.

Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA 94404