Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!lll-crg!lll-lcc!qantel!ihnp4!inuxc!pur-ee!uiucdcs!uiucuxc!ccvaxa!aglew
From: aglew@ccvaxa.UUCP
Newsgroups: net.arch
Subject: Re: Delayed Loads
Message-ID: <5100136@ccvaxa>
Date: Sun, 21-Sep-86 19:35:00 EDT
Article-I.D.: ccvaxa.5100136
Posted: Sun Sep 21 19:35:00 1986
Date-Received: Tue, 23-Sep-86 22:24:37 EDT
References: <5100133@ccvaxa>
Lines: 56
Nf-ID: #R:ccvaxa:5100133:ccvaxa:5100136:000:2268
Nf-From: ccvaxa.UUCP!aglew Sep 21 18:35:00 1986

Thank you to those who responded, in net.arch or by e-mail, to my
question about delayed memory accesses.

All sorts of people know machines that let you execute ahead, but have
an interlock. Even Goulds have these (:-).

MIPS, some NCR microprocessors, and microcoded engines like Weitek's
have explicit delayed loads, without interlock. And, I assume, with the
usual interrupt restart problems.

Do you mind if I take another stab at expressing my curiosity about
delayed memory accesses?

Q1: What is the success rate of code reorganization to use the delayed
    slots without conflict? Is it more or less successful than code
    reorganization for delayed branches? What are the static/dynamic
    rates? I understand that they are typically 90% for the first slot,
    80% for the second, and so on, for delayed branches.

Q2: Does anybody have special knowledge about delayed memory accesses
    in vector machines, particularly machines where the vector startup
    time is high?

Q3: All of the discussion so far has been about delayed LOADS. What
    about delayed STORES, where you can't touch the data to be stored
    for a few instructions after the store instruction? Does anybody
    try to save a latch?

I wonder if the idea of an architectural family is dead? If not, how do
you reconcile it with delayed loads/branches? Start off with the longest
possible delay factor, and reduce it as machines get faster?

Do machines that have no interlocks on their delayed loads have strict
or relaxed semantics? I.e., in the code sequence

        ADD   r1 += r0
        LOAD  r0 := [memaddr]
        -- delay slot

do MIPS et cie. permit the rearrangement

        LOAD  r0 := [memaddr]
        ADD   r1 += r0

where the value added into r1 is not [memaddr], but whatever was in r0
before? There is a difference between saying "the result is not
available for N cycles" and saying "the destination value is not changed
for M cycles".

(Oh, one more question, more a matter of personal self-development; I
wouldn't bother the net, but I'm not quite sure who to ask: NCR has
built some interesting machines. Where can I get information, spec
sheets, data books on them?)

Andy "Krazy" Glew. Gould CSD-Urbana.    USEnet:  ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801    ARPAnet: aglew@gswd-vms
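
The difference between "the destination value is not changed for M
cycles" and "the result is not available for N cycles" can be made
concrete with a toy simulation. What follows is only a sketch under
stated assumptions: the class DelayedLoadMachine, its old_value_visible
flag, the one-instruction delay, and the initial register values are all
hypothetical and do not describe MIPS or any other real machine. With
old_value_visible=True the ADD in the delay slot reads the old r0, so the
rearrangement is legal; with old_value_visible=False the same read is
treated as undefined and the toy machine refuses it.

```python
class DelayedLoadMachine:
    """Toy register machine with a non-interlocked delayed load (hypothetical)."""

    def __init__(self, memory, regs, delay=1, old_value_visible=True):
        self.memory = dict(memory)
        self.regs = dict(regs)
        self.delay = delay                      # number of delay slots after a load
        self.old_value_visible = old_value_visible
        self.pending = []                       # entries: [slots_left, dest, value]

    def _tick(self):
        # Called at the start of every instruction: commit loads whose
        # delay slots are used up, and age the remaining ones.
        still_pending = []
        for slots_left, dest, value in self.pending:
            if slots_left == 0:
                self.regs[dest] = value
            else:
                still_pending.append([slots_left - 1, dest, value])
        self.pending = still_pending

    def _read(self, reg):
        in_flight = any(dest == reg for _, dest, _ in self.pending)
        if in_flight and not self.old_value_visible:
            # "Result not available": reading the destination is undefined.
            raise RuntimeError(f"{reg} read inside its load delay slot")
        return self.regs[reg]                   # "not changed yet": old value

    def load(self, dest, addr):
        self._tick()
        self.pending.append([self.delay, dest, self.memory[addr]])

    def add(self, dest, src):
        self._tick()
        self.regs[dest] += self._read(src)

    def nop(self):
        self._tick()

    def drain(self):
        # Let every outstanding load complete.
        while self.pending:
            self._tick()


def run_original(m):
    m.add("r1", "r0")
    m.load("r0", "memaddr")
    m.nop()                                     # unfilled delay slot
    m.drain()


def run_rearranged(m):
    m.load("r0", "memaddr")
    m.add("r1", "r0")                           # ADD moved into the delay slot
    m.drain()


def make_machine(old_value_visible):
    # Initial values are arbitrary, chosen only for illustration.
    return DelayedLoadMachine({"memaddr": 42}, {"r0": 5, "r1": 10},
                              delay=1, old_value_visible=old_value_visible)
```

In this model, running run_original and run_rearranged on machines built
with make_machine(True) leaves the same register contents, which is the
property a code reorganizer would rely on. On make_machine(False) the
rearranged sequence raises, so the reorganizer would have to leave the
slot empty or find some other instruction to fill it.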