Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!hplabs!amdcad!cayman!tim
From: tim@cayman.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: delayed branch
Message-ID: <26716@amdcad.AMD.COM>
Date: 11 Aug 89 15:02:44 GMT
References: <828@eutrc3.urc.tue.nl> <26667@amdcad.AMD.COM> <26676@amdcad.AMD.COM> <8266@hoptoad.uucp>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Austin, TX
Lines: 57
Summary:
Expires:
Sender:
Followup-To:

In article <8266@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
| tim@cayman.amd.com (Tim Olson) wrote:
| > Does anyone else know of other processors with such restrictions?
|
| I'm surprised that nobody mentioned the SPARC.

Well, from the previous postings and email conversations, it appears
that nearly every RISC processor besides the Am29000 has a restriction
on what can go in a branch delay slot, including SPARC, MIPS, 88000,
i860, and ROMP.  Most of the restrictions are advisory (don't do this;
the result is undefined), but the ROMP has hardware to detect and trap
this condition.

One interesting thing to think about if control transfers are allowed
in branch delay slots is how a delay-slot call should work:

	loop:
		.
		.
		jmp	loop
		call	lr0, function
	exit:
		.
		.

Calls are typically defined in RISC processors to save the return
address in a register.  Since calls themselves have delay slots, the
return address is normally the second instruction after the call.

The action that a delay-slot call takes depends upon how the return
address is calculated in the processor.  It could either be the address
of the call + 2 (words), or the address of the call's delay slot
instruction + 1.  These normally result in the same value, but if the
call is itself in a delay slot, they work differently:

	ret <- call+2			ret <- call_delay+1

	jmp	loop			jmp	loop
	call	lr0, function		call	lr0, function
	.				.

In the former case, the jmp/call pair acts as a visit to the jmp's
target, and does not execute the instruction at exit (it substitutes
the jmp's target for the call's delay slot).  In the latter case, the
jmp/call pair continues the loop, executing the first instruction of
the loop just before the call target is executed, and returns to the
second instruction in the loop.

The Am29000 exhibits the second behavior.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)
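
The two return-address rules discussed in the post can be tried out on
a toy model.  What follows is only a sketch under stated assumptions,
not a description of any real processor's pipeline: it assumes a
word-addressed machine with one delay slot, modelled by a (pc, npc)
pair, and a hypothetical "ret" instruction that jumps through lr0.
The names run, PROGRAM, LOOP, EXIT and FUNCTION and the policy strings
are invented for the illustration.

```python
# Toy delayed-branch machine: pc is the instruction executing now,
# npc is the one that executes next (so a branch takes effect one
# instruction late).  Addresses are instruction indices (words).
LOOP, EXIT, FUNCTION = 0, 4, 7

PROGRAM = [
    ("nop", None),       # 0  loop: first instruction of the loop
    ("nop", None),       # 1        second instruction of the loop
    ("jmp", LOOP),       # 2        jmp  loop
    ("call", FUNCTION),  # 3        call lr0, function (in jmp's delay slot)
    ("nop", None),       # 4  exit:
    ("nop", None),       # 5
    ("halt", None),      # 6
    ("nop", None),       # 7  function:
    ("ret", None),       # 8        hypothetical return through lr0
    ("nop", None),       # 9        delay slot of the return
]

def run(policy, start=0, max_steps=12):
    """Return (list of executed addresses, last value saved in lr0)."""
    pc, npc = start, start + 1
    lr0 = None
    trace = []
    for _ in range(max_steps):
        op, target = PROGRAM[pc]
        trace.append(pc)
        if op == "halt":
            break
        next_npc = npc + 1          # default: fall through
        if op == "jmp":
            next_npc = target
        elif op == "call":
            if policy == "call+2":
                lr0 = pc + 2        # address of the call + 2 words
            else:
                lr0 = npc + 1       # address of the delay-slot instruction + 1
            next_npc = target
        elif op == "ret":
            next_npc = lr0
        pc, npc = npc, next_npc
    return trace, lr0

if __name__ == "__main__":
    for policy in ("call+2", "call_delay+1"):
        trace, lr0 = run(policy)
        print(policy, "lr0 =", lr0, "trace =", trace)
```

Under the call+2 rule the model executes the first loop instruction
once as the call's delay slot, runs the function, and resumes at
exit+1 without ever executing the instruction at exit.  Under the
call_delay+1 rule lr0 points at the second loop instruction, so the
loop keeps going (the max_steps bound is what stops the run).  Starting
the model directly at the call (start=3), where the call is not in a
delay slot, both rules save the same address.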