Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!ucbarpa.Berkeley.EDU!melvin
From: melvin@ucbarpa.Berkeley.EDU (Steve Melvin)
Newsgroups: comp.arch
Subject: Re: Instruction (dis)continuation (
Summary: Yes, but consider the problems for a pipelined implementation.
Message-ID: <31316@ucbvax.BERKELEY.EDU>
Date: 15 Sep 89 09:10:26 GMT
References: <2353@oakhill.UUCP> <261500010@S34.Prime.COM> <34701@apple.Apple.COM> <642@unicads.UUCP> <1516@atanasoff.cs.iastate.edu>
Sender: usenet@ucbvax.BERKELEY.EDU
Reply-To: melvin@ucbarpa.Berkeley.EDU.UUCP (Steve Melvin)
Organization: University of California, Berkeley
Lines: 58

In article <1516@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu.UUCP (John Hascall) writes:
>   I guess I fail to see the problem.  I agree that for many I/O devices
>   re-reading a device-register is a bad thing.  What I don't see is how
>   this can happen except when:
>
>      a) you have an instruction (or instr. set) which is restarted
>         *and*
>      b) you have an instruction that reads from two (or more)
>         operands (and the I/O location is not the last one read?).
>
>   Are there machines which can get a page-fault acessing a memory-mapped
>   I/O device register location??  (surely not!)
>
>   Examples using the VAX instruction set (write operands are rightmost):
>
>      MOVW  IO_DEV_CSR,R0  ; no problem: no page faults in I/O space
>                           ; (even if MOVW was a restarted instr)

The reason this works and seems not to be a problem is that the hardware
designers have gone to some trouble to make it work.  Consider what
happens at the microarchitecture level and I think you'll agree that it
really is a problem.

Let's stick with this instruction and talk about the VAX 8600
implementation.  When the instruction unit sees the opcode for the MOVW
instruction, the execution unit could still be two instructions behind.
What happens is the following: the instruction unit decodes the first
operand and issues a virtual-address memory read request to the memory
unit.  The memory unit (assuming it's not busy with another request)
then translates this virtual address into a physical address using the
translation buffer.  Assuming a TB hit, the memory unit recognizes at
this point that it is an I/O address (I/O space is recognizable from the
physical address; in this case, bit 29, the MSB, is high).  Since the
execution of all previous instructions has not yet completed, the memory
unit disregards the request and waits for it to be re-issued when the
execution unit catches up.  No further prefetching can occur and the
pipeline is drained.

The point is that if a previous instruction faults, let's say with a
page fault on a destination write, which will not be detected until the
very end of that instruction, the read for the MOVW must not have taken
place.  If the I/O instruction had been recognizable from the opcode (as
in my opinion it should be), the microarchitects could have designed a
simpler memory unit that assumed any prefetch read from a non-I/O
instruction is OK.

Also consider that this is a simple example.  In a more heavily
pipelined machine, with perhaps even out-of-order prefetching of
operands, it gets even harder to guarantee that these reads don't occur.
It basically means that address translation for all reads must occur in
order, with a microtrap mechanism to back out when an I/O address is
encountered.

Since the person writing the device driver or other code that touches
I/O registers generally knows which variables map to I/O space, why not
just have them use a different instruction?  Then the microarchitecture
can much more cleanly enter and exit this synchronization point.

Steve Melvin
University of California, Berkeley
melvin@arpa.Berkeley.EDU    ...!ucbvax!melvin
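
A rough sketch, in C, of the memory-unit rule described above: a
prefetch read whose translated physical address has bit 29 set is
disregarded until the execution unit has caught up.  This is only an
illustration, not the 8600's actual logic; the names pipeline_state,
pending_instructions, and prefetch_read are hypothetical, and a TB hit
is assumed so that the physical address is already in hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 29 (the MSB of the physical address) marks I/O space. */
#define IO_SPACE_BIT (UINT32_C(1) << 29)

typedef enum { READ_DONE, READ_DEFERRED } read_status;

/* Hypothetical pipeline state: the number of earlier instructions
   that the execution unit has not yet completed. */
typedef struct {
    unsigned pending_instructions;
} pipeline_state;

/* True if the physical address falls in I/O space. */
bool is_io_address(uint32_t phys_addr)
{
    return (phys_addr & IO_SPACE_BIT) != 0;
}

/* Decide what to do with a prefetch read after translation.
   An I/O read is disregarded while any earlier instruction could
   still fault; the request must be re-issued once the execution
   unit has caught up.  Ordinary memory reads go ahead at once. */
read_status prefetch_read(const pipeline_state *p, uint32_t phys_addr)
{
    if (is_io_address(phys_addr) && p->pending_instructions > 0) {
        return READ_DEFERRED;
    }
    return READ_DONE;
}

/* Usage: the operand read for MOVW IO_DEV_CSR,R0 with the execution
   unit two instructions behind, then again after it has caught up.
   The CSR address used here is made up for the example. */
void example_movw_csr(void)
{
    pipeline_state p = { 2 };
    uint32_t io_dev_csr = IO_SPACE_BIT | UINT32_C(0x1000);

    assert(prefetch_read(&p, io_dev_csr) == READ_DEFERRED);
    p.pending_instructions = 0;   /* execution unit has caught up */
    assert(prefetch_read(&p, io_dev_csr) == READ_DONE);
}
```

Note that the check can only be made after translation, which is the
cost being argued about: with an opcode that marked I/O accesses, the
decision could be made at decode time instead.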