Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!ukc!cam-cl!news
From: cet1@cl.cam.ac.uk (C.E. Thompson)
Newsgroups: comp.arch
Subject: Re: Can old architectures run fast?
Message-ID: <1991May9.221145.24087@cl.cam.ac.uk>
Date: 9 May 91 22:11:45 GMT
References: <8283@uceng.UC.EDU> <7628@auspex.auspex.com> <8324@uceng.UC.EDU> <1991May05.174756.9026@iecc.cambridge.ma.us> <9105070005.AA24446@iecc.cambridge.ma.us>
Reply-To: cet1@cl.cam.ac.uk (C.E. Thompson)
Organization: U of Cambridge Comp Lab, UK
Lines: 39

In article <9105070005.AA24446@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
> {in re IBM 360 architecture}
>
>No question. it's not cheap. Some of the stuff they have to do is extremely
>gross. The worst example is an execute instruction which points to a
>translate-and-test (TRT) instruction. The TRT has two memory operands and
>looks up each byte of the first using the second as the lookup table until
>it finds a table entry that's non-zero. This means that the length of the
>first operand depends on its contents. 360 instructions are not
>continuable, so since the execute, the TRT, and both operands can each
>potentially span a page boundary, the CPU can need to touch as many as 8
>pages. To tell whether it needs the 8th page it does a "trial execution" of
>the instruction that doesn't store any results before actually doing the
>instruction.

There is something seriously wrong with this example. TRT doesn't *modify*
storage, so rolling back the state of the CPU on a paging exception is
almost trivial. You can't predict how early the TRT will stop, and so which
pages will be touched, but you don't *need* to. It is no worse in this
respect than a CLC instruction.

A straight TR instruction is actually somewhat worse, because it does
modify storage, and one can't tell without trial execution whether the
whole of the 256-byte translation table needs to be translatable.
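
To make the TRT/TR distinction concrete, here is a toy model in Python. It
is only a sketch under stated assumptions, not a description of how any
real IBM CPU or its microcode works: the names Memory, PageFault, tr_naive
and tr_with_trial, the 4096-byte page size and the addresses are all
invented for illustration, and the operand is assumed not to overlap the
table. It shows that a faulting TRT leaves storage untouched (so it can
simply be restarted), while a TR that faults half-way leaves the operand
partly translated unless a non-storing trial pass is made first.

```python
PAGE = 4096  # page size assumed for this sketch


class PageFault(Exception):
    """Raised when an access touches a page that is not resident."""


class Memory:
    def __init__(self, size, resident):
        self.data = bytearray(size)
        self.resident = set(resident)  # page numbers currently mapped

    def read(self, addr):
        if addr // PAGE not in self.resident:
            raise PageFault(addr // PAGE)
        return self.data[addr]

    def write(self, addr, value):
        if addr // PAGE not in self.resident:
            raise PageFault(addr // PAGE)
        self.data[addr] = value


def trt(mem, arg, length, table):
    """Scan only. Returns (cc, address of stopping byte, function byte)."""
    for i in range(length):
        fb = mem.read(table + mem.read(arg + i))
        if fb:
            return (2 if i == length - 1 else 1), arg + i, fb
    return 0, None, 0


def tr_naive(mem, arg, length, table):
    """Translate in place with no precautions: a fault can strike mid-way."""
    for i in range(length):
        mem.write(arg + i, mem.read(table + mem.read(arg + i)))


def tr_with_trial(mem, arg, length, table):
    """Touch every byte the real pass needs first, storing nothing."""
    for i in range(length):
        mem.read(table + mem.read(arg + i))  # may raise PageFault
    tr_naive(mem, arg, length, table)        # in this model it cannot fault now


def demo():
    table = 2 * PAGE - 128                   # 256-byte table straddling pages 1 and 2
    mem = Memory(3 * PAGE, resident={0, 1})  # page 2 is paged out
    mem.data[table + 0x2C] = 1               # non-zero function byte for ','
    mem.data[table + 0x41] = 0x61            # TR maps 'A' to 'a'
    mem.data[100:103] = bytes([0x20, 0x2C, 0xC1])  # TRT operand
    mem.data[200:202] = bytes([0x41, 0xC1])        # operand for tr_naive
    mem.data[300:302] = bytes([0x41, 0xC1])        # operand for tr_with_trial

    before = bytes(mem.data)
    trt_result = trt(mem, 100, 3, table)     # stops at ',' and never reaches 0xC1
    trt_clean = bytes(mem.data) == before    # TRT stored nothing

    try:
        tr_naive(mem, 200, 2, table)         # 0xC1 indexes the paged-out half
    except PageFault:
        pass
    naive_after = bytes(mem.data[200:202])   # first byte already overwritten

    try:
        tr_with_trial(mem, 300, 2, table)
    except PageFault:
        pass
    trial_after = bytes(mem.data[300:302])   # untouched, so safe to restart
    return trt_result, trt_clean, naive_after, trial_after
```

Note that the TRT call succeeds even though half of its table is paged
out: which table bytes are needed depends entirely on the operand's
contents, which is why one cannot say in advance whether the whole table
must be addressable.
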
The notoriously awful instruction is EDMK, as pointed out in another
posting. Anyway, all these problems are finessed by the general rollback
mechanisms of IBM 308x and 3090 series machines.

>               There's even more internal hair than that, since the 3090 has
>lots of fault-tolerance hardware and takes microcode checkpoints several
>places in a complex instruction.

Even with a non-interruptible instruction? (Obviously this happens for
interruptible instructions like MVCL and CLCL.) Do you know this for a
fact (about 3090s, specifically)? It rather surprises me.

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk