Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!amdcad!rpw3
From: rpw3@amdcad.AMD.COM (Rob Warnock)
Newsgroups: comp.arch
Subject: Re: Intel/MIPS Dhrystone ratio
Message-ID: <24929@amdcad.AMD.COM>
Date: 21 Mar 89 05:18:34 GMT
References: <1552@vicom.COM> <28200290@mcdurb>
Reply-To: rpw3@amdcad.UUCP (Rob Warnock)
Organization: [Consultant] San Mateo, CA
Lines: 128

In article <28200290@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
+---------------
| Bravo! Who needs vectored interrupts?
| How often does your device know better where to interrupt to than you do?
+---------------

When I first began designing with the Am29000, all my old habits felt
cramped at "only" 4 levels of external interrupt, which don't even read
a vector from the interrupting device. But I quickly realized that since
the 29k has a "count-leading-zeroes" (CLZ) instruction, all you need is
a magic external location you can read (can you spell 74F374?) which
gives you one bit per interrupting device, and an inclusive-OR of those
bits driving your single interrupt line. (Who needs 4 of them, anyway?)
So you load the bits, CLZ, add a table base, and jump... Given slow
8-bit I/O chips, that takes a lot less time than a vector fetch.

+---------------
| But... how can interrupt (not exception) handling be made better/worse?
| As an erstwhile systems programmer in a real-time OS, I know that we often
| wished that interrupts could be treated exactly like processes,
| going through the same priority or deadline driven scheduler.
| Yet applying RISC principles to the hardware that would be needed to do
| something like this, I often arrive at the conclusion that a
| simple single entry point first level handler is all that is appropriate.
| Everything else seems to need sequencing.
+---------------

I agree.

[Tutorial alert. Many of you know this already. But it's worth saying
once or twice a decade, and I haven't heard it lately, so here goes...]

As has been done by many of us on a variety of machines, a useful
interrupt software "style" (good on many CISCs as well as RISCs) seems
to be to split interrupt handlers into a "first-level"/hardware-oriented/
assembly-language section, and a "second-level"/software-oriented/
C-language part, with the following characteristics:

- You leave the "real" hardware interrupts always enabled (especially
  during 2nd-level handlers, system calls, etc.).

- When an interrupt occurs, all you do is clear the interrupting
  hardware, grab whatever really volatile data there might be, and queue
  up the 2nd-level handler to run -- if it's really needed ("soft"-DMA
  can often just stash the data in a buffer and dismiss). If there's
  already a 2nd-level handler running at the same or higher *2nd-level*
  priority [see below], you just queue up a task block, and IRET. The
  trick is that the *hardware* interrupt is disabled only for the brief
  moment when a 1st-level handler is running.

- The Unix "spl??()" [Set Priority Level] routines are modified to
  manipulate a *software* notion of priority, which is respected by the
  2nd-level routines and system-call level code (but not the hardware),
  and never turn off the hardware enables. Necessary exclusion with
  1st-level handlers is done with *very* short interrupt disable
  periods, or none at all. (Treating the 1st-level handlers like "DMA
  devices", you can usually find a way to eliminate the IOFFs.)

- The interface between 1st- & 2nd-level sections is a little "task
  queue", sort of a light-weight "real-time scheduler". You can have
  one, or any number of interrupt task queues, not necessarily related
  at all to whatever hardware priorities you are stuck with.

- Once you start running a 2nd-level routine, you continue taking tasks
  off the 2nd-level queue(s) until they are empty, before restoring the
  CPU state and dismissing. (Since *hardware* interrupts are still on,
  it is quite possible that more than one 2nd-level routine gets run per
  CPU state save.)

- If you *can* get by with just one 2nd-level priority, do so. It avoids
  the extra state saving that comes with preempting multi-level
  priorities. (I know, sometimes you can't avoid it. But sometimes you
  can. On one system we just used the Unix "callout" queue, just setting
  a zero delay time if the task was for an interrupt.)

The advantages of this style are these:

1. Since hardware interrupts are never turned off for long, input data
   overruns are easy to avoid. (...unlike some Unixes which turn off the
   world whenever they are searching the buffer cache!!! No wonder so
   many people think Unix can't do 19200 baud input. At the same time,
   you save some hardware cost, since the need for real DMA hardware is
   lessened.)

2. The 1st-level tasks can usually be done in a few assembly
   instructions without saving very much CPU state; the 2nd-level tasks
   need a full C context, reentrant and "interruptible" -- a lot more
   state. Since interrupts are often "bursty", the two-level structure
   saves state *once* for several interrupts, a significant efficiency
   gain. In fact, interrupt handling gets more efficient the higher the
   interrupt rate.

3. Most interrupts from "character" devices can be handled entirely in
   the 1st-level handlers as "soft-DMA", or "pseudo-DMA", thus lessening
   further the number of full CPU state saves done.

4. Since hardware and software priorities now have nothing to do with
   each other, you can allocate priorities more rationally. For example,
   you may have a multi-line serial card which has one interrupt level
   for all the transmitters and receivers on the card; also in the
   system is a disk. In this case, the 1st-level serial-I/O handler will
   probably want to queue input (received) data to be processed at a
   *higher* 2nd-level priority than the disk, but queue output (transmit
   done) interrupts at a *lower* priority than the disk.

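To make the style above a little more concrete, here is a minimal C
sketch of the task queue, the software "spl", and the drain loop. It is
an illustration under stated assumptions, not code from any real kernel:
every name in it (post_task, run_second_level, spl_soft, splx_soft,
serial_rx_first_level, the PRIO_* levels, the ring sizes) is made up for
the example, the "interrupt" is just a function call, and 2nd-level tasks
are ordered by priority but do not preempt each other.

```c
#include <stddef.h>

#define QLEN  16                 /* slots per ring (assumed size) */
#define NPRIO 3                  /* 2nd-level software priorities (assumed) */
enum { PRIO_SERIAL_IN, PRIO_DISK, PRIO_SERIAL_OUT };   /* 0 = most urgent */

typedef void (*task_fn)(int arg);
struct task { task_fn fn; int arg; };
struct ring { struct task slot[QLEN]; volatile unsigned head, tail; };

static struct ring queue[NPRIO];
static volatile int soft_limit = NPRIO;  /* tasks with prio < soft_limit may run */
static volatile int draining;            /* a 2nd-level drain loop is active */

/* Called from 1st-level handlers.  Only the producer writes tail and only
 * the consumer writes head, so no interrupt-off window is needed as long
 * as each ring has a single producer (an assumption of this sketch). */
static int post_task(int prio, task_fn fn, int arg)
{
    struct ring *r = &queue[prio];
    unsigned next = (r->tail + 1) % QLEN;
    if (next == r->head)
        return 0;                        /* ring full: caller must cope */
    r->slot[r->tail].fn = fn;
    r->slot[r->tail].arg = arg;
    r->tail = next;
    return 1;
}

static int take_task(int prio, struct task *out)
{
    struct ring *r = &queue[prio];
    if (r->head == r->tail)
        return 0;                        /* ring empty */
    *out = r->slot[r->head];
    r->head = (r->head + 1) % QLEN;
    return 1;
}

/* 2nd level: keep taking the most urgent runnable task until none is left.
 * In a real system this runs with hardware interrupts enabled, after one
 * full state save; here it is an ordinary function. */
static void run_second_level(void)
{
    if (draining)
        return;                          /* the active loop will pick it up */
    draining = 1;
    for (;;) {
        struct task t;
        int p;
        for (p = 0; p < soft_limit; p++)
            if (take_task(p, &t))
                break;
        if (p >= soft_limit)
            break;                       /* nothing runnable right now */
        t.fn(t.arg);
    }
    draining = 0;
}

/* Software "spl": changes only soft_limit, never the hardware enables. */
static int spl_soft(int limit) { int old = soft_limit; soft_limit = limit; return old; }
static void splx_soft(int old) { soft_limit = old; run_second_level(); }

/* Demo 2nd-level tasks. */
static int run_log[8], run_count, lines_seen, last_len;
static void log_task(int id)   { if (run_count < 8) run_log[run_count++] = id; }
static void line_task(int len) { lines_seen++; last_len = len; }

/* Demo 1st-level "soft-DMA" receiver: stash the byte, and queue a 2nd-level
 * task only when a whole line has arrived.  A real driver would hand off
 * the buffer itself rather than just a length. */
static char rx_buf[64];
static unsigned rx_len;
static void serial_rx_first_level(char c)
{
    if (rx_len < sizeof rx_buf)
        rx_buf[rx_len++] = c;
    if (c == '\n') {
        post_task(PRIO_SERIAL_IN, line_task, (int)rx_len);
        rx_len = 0;
    }
    run_second_level();                  /* stands in for the interrupt-exit path */
}
```

With this sketch, calling spl_soft(0) defers all 2nd-level work while
1st-level handlers keep posting; splx_soft() with the saved value then
runs the backlog in priority order (serial input, then disk, then serial
output), whatever order the tasks were posted in. Feeding characters to
serial_rx_first_level() queues nothing until a newline shows up, which is
the "soft-DMA" case from advantage 3.
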
Applying the above to a Version 7 Unix port to a 5.5 MHz 68000 (years
ago), we were able to take a system which could hardly do a single
2400-baud UUCP and get it to cheerfully handle three simultaneous
9600-baud UUCPs! ...and with no change to the hardware:
interrupt-per-character SIO chips.

[Note: When the 29000 takes an interrupt, volatile state (PC, PS) is
"frozen" in backup or shadow registers in the CPU, and execution
continues (with some slight restrictions). An "IRET" restores the
running process's state from the shadow registers. Instructions exist to
read/write the shadow registers if a full save/restore is to be done.
The very-light-weight "freeze mode" interrupt matches very nicely with
the above interrupt software style. You dedicate a few protected global
registers to freeze-mode processing, and *no* state has to be explicitly
saved/restored unless a 2nd-level handler needs to be started in a full
"C" context.]

Rob Warnock
Systems Architecture Consultant

UUCP:     {amdcad,fortune,sun}!redwood!rpw3
ATTmail:  !rpw3
DDD:      (415)572-2607
USPS:     627 26th Ave, San Mateo, CA 94403
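
P.S. The CLZ dispatch described near the top (load the pending bits,
CLZ, add a table base, jump) might look roughly like this in C. This is
only a sketch under assumptions: pending_latch is a plain variable
standing in for the external latch, the 32-bit width and the names
(clz32, handler_table, dispatch_one, count_handler) are invented for the
example, and the loop in clz32 stands in for what the 29k does in a
single instruction.

```c
#include <stdint.h>

#define NDEV 32                          /* one pending bit per device (assumed) */

typedef void (*handler_fn)(unsigned dev);

/* Stand-in for the external latch; on real hardware this would be a
 * volatile load from an I/O address. */
static uint32_t pending_latch;

/* Indexed by bit position counted from the most significant bit, so the
 * device wired to the top bit is serviced first. */
static handler_fn handler_table[NDEV];

/* Portable count-leading-zeroes; returns 32 for an input of 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Service the most urgent pending device.  Returns its index, or -1 if
 * nothing is pending. */
static int dispatch_one(void)
{
    uint32_t bits = pending_latch;
    unsigned dev;

    if (bits == 0)
        return -1;
    dev = clz32(bits);
    /* Simulation only: real hardware drops the bit when the device is
     * serviced, not because software wrote the latch. */
    pending_latch &= ~(0x80000000u >> dev);
    if (handler_table[dev])
        handler_table[dev](dev);
    return (int)dev;
}

/* Demo handler: count how often each device was serviced. */
static int hits[NDEV];
static void count_handler(unsigned dev) { hits[dev]++; }
```

Setting the top bit and bit 5 (counting from the top) in pending_latch
and calling dispatch_one() repeatedly services device 0, then device 5,
then reports -1. Priority among devices is simply the bit order in the
latch.
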