Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!ucbarpa.Berkeley.EDU!melvin
From: melvin@ucbarpa.Berkeley.EDU (Steve Melvin)
Newsgroups: comp.arch
Subject: Re: Instruction (dis)continuation (
Summary: For high performance, specify EXPLICITLY when reads have side-effects
Message-ID: <31361@ucbvax.BERKELEY.EDU>
Date: 18 Sep 89 07:00:45 GMT
References: <2353@oakhill.UUCP> <261500010@S34.Prime.COM> <34701@apple.Apple.COM> <642@unicads.UUCP> <1516@atanasoff.cs.iastate.edu> <31316@ucbvax.BERKELEY.EDU> <27633@winchester.mips.COM>
Sender: usenet@ucbvax.BERKELEY.EDU
Reply-To: melvin@ucbarpa.Berkeley.EDU.UUCP (Steve Melvin)
Organization: University of California, Berkeley
Lines: 64

In article <27633@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>Note that this whole issue is not (just) a hardware issue, it's a:
>    hardware instruction-level
>    hardware micro-architecture
>    language definition
>    compiler technology
>and operating system
>issue; and it's IMPORTANT to understand how these all fit together.
>...
>Attributes that make life simpler in machines that use memory-mapped I/O:
>1) Load/store architecture,
>...

Your point is well taken; there are many sides to this issue, but I
don't think it's fair to say that load/store *architectures* make life
simpler for systems programmers. Using simple loads and stores is pretty
much a requirement, as has been pointed out, regardless of whether
memory-to-memory instructions exist. But which instructions are used and
what restrictions need to be placed on them is secondary to the real
issue here.

The bottom line for an I/O instruction is that it represents a
synchronization point from the perspective of the hardware. That is,
all unconfirmed operations have to be verified before the I/O operation
can take place.
All predicted branches have to be confirmed, all pending
memory reads and writes have to at least be translated to verify that
they can be completed, and all operations that can generate exceptions
have to be executed. Generally this means that the entire pipeline has
to be drained. This is a simple fact; there is no way around it (at
least not as long as reads have side-effects and there is no "undo"
function).

However, if the mere fact that you have to handle these synchronization
points correctly (which are few and far between) slows down the other
99.9% of your code, something is wrong.

In low concurrency machines, memory mapped I/O isn't a big deal in this
regard because it doesn't slow down non-I/O code. Just go ahead and use
ordinary instructions with ordinary virtual addresses (with appropriate
restrictions on number and type of operands, as has been discussed) and
let the hardware figure out that it has an I/O instruction when it sees
the address.

However, in high concurrency microarchitectures which execute multiple
operations per cycle and in an order determined at run-time (these
processors are coming, BTW) there has to be a more explicit way to let
the hardware know about an I/O instruction. Simply using ordinary
instructions with ordinary addresses, and expecting the hardware to do
the right thing, won't work. You can't expect to get maximal speedup on
the code that doesn't know or care about I/O if the hardware has to
guarantee that it doesn't trip across a synchronization point in the
middle of some basic block that it is executing out-of-order.

There are many possible ways to do this, and memory mapped I/O could
still be incorporated into such a solution. My only point is that
presenting a memory model to the hardware in which reads don't have
side-effects is of critical importance in high concurrency designs.
(Of lesser importance but also of value is the property of multiple
writes (i.e.
a write of an incorrect value can take place and the correct value can
be later written.))

I think that the increase in performance will win out over whatever
reduction in convenience this implies, and people will figure out
whatever has to be figured out at the higher levels in order to allow
the hardware to make these assumptions unless explicitly told otherwise
(i.e. BEFORE address translation: part of the opcode, surrounded by
special instructions, etc.).

----
Steve Melvin
University of California, Berkeley
----
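
A toy illustration of the point about reads with side-effects, as a
minimal Python sketch rather than a model of any real machine. The
names PlainMemory, FifoRegister and run are hypothetical, invented for
this example, and the "speculative" read is simply an extra call whose
result is discarded. It is meant to show that an early read that later
gets thrown away is harmless against ordinary memory but loses data
against a device register that pops a FIFO on every read.

```python
from collections import deque


class PlainMemory:
    """Ordinary memory: a read returns the stored value and changes nothing."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value


class FifoRegister:
    """Hypothetical memory-mapped device register: each read pops one item."""

    def __init__(self, items):
        self.queue = deque(items)

    def read(self):
        # The side effect: device state changes on every read.
        return self.queue.popleft()


def run(location, speculate):
    """Return the value the program finally observes.

    If speculate is True, the (assumed) hardware issues the read early on
    a path that turns out to be wrong, discards the result, and then
    re-issues the read the program actually asked for.
    """
    if speculate:
        location.read()  # speculative read, result thrown away
    return location.read()


if __name__ == "__main__":
    print(run(PlainMemory(7), speculate=False))
    print(run(PlainMemory(7), speculate=True))
    print(run(FifoRegister([1, 2, 3]), speculate=False))
    print(run(FifoRegister([1, 2, 3]), speculate=True))
```

With ordinary memory the program sees 7 whether or not the extra read
was issued. With the FIFO register holding 1, 2, 3, the program sees 1
without speculation and 2 with it: the first item is gone and there is
no undo, which is why such a read has to be treated as a
synchronization point.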