Path: utzoo!censor!geac!jtsv16!uunet!tut.cis.ohio-state.edu!ucbvax!agate!bionet!ames!ll-xn!mit-eddie!uw-beaver!tektronix!sequent!jjb
From: jjb@sequent.UUCP (Jeff Berkowitz)
Newsgroups: comp.arch
Subject: Re: delayed branch
Message-ID: <19870@sequent.UUCP>
Date: 9 Aug 89 06:16:56 GMT
References: <828@eutrc3.urc.tue.nl>
Reply-To: jjb@sequent.UUCP (Jeff Berkowitz)
Organization: Sequent Computer Systems, Inc
Lines: 52

In article <828@eutrc3.urc.tue.nl> rcpieter@rc4.urc.tue.nl writes:
>Just wondering---
> - What happens on existing processors which use delayed branches when the
>instruction put in the branch instruction's shadow is also a branch?

If you haven't thought this through, it's quite amusing.  Given such a
machine, the typical CISC code (erroneous in this case, of course) -

    looptop:
            ...
            jsr     subr
            br      looptop

causes exactly *one* instruction at "subr" to be executed before going
back to "looptop".  (When executing the jsr, you fetch the branch; when
executing the branch, you fetch @subr; when executing @subr, you fetch
at looptop, etc).

On one machine in my past, the architects were disturbed enough by this
behavior to "fix" it in hardware: if a branch occurred in the shadow,
the machine automatically converted the "shadowed" branch into a noop!
If the first branch happened to be a jsr, as in the example above, the
machine would not only "noop" the shadowed branch but would remember
that it should be the return address of "subr", so it would be executed
when you got back.  This allowed code like

            jsr     syscall_enter
            bcs     error   # executed on return from syscall_enter

to work "just like a CISC".

The machine actually had two instructions in the "shadow" (we called
these "trailers").  So there were actually three possible return
addresses of any jsr instruction -

    the next instruction, if it was a branch, jump, or jsr
    else the instruction after that, if *it* was a branch, jump, or jsr
    else the instruction after *that*.

The instruction pipeline had to determine which to "push" on-the-fly.

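For anyone who wants to poke at the two behaviors above, here is a rough
sketch in Python.  Everything in it is an assumption made for
illustration: the (op, arg) instruction encoding, the opcode names, the
addresses, and the helper names run() and return_address().  The first
function models a generic machine with ONE delay slot and no hardware
fix-up; the second only encodes the three-way return address rule for
the two-trailer machine.  Neither is a description of the actual
hardware.

```python
# Toy model only: opcode names, the (op, arg) encoding and the addresses
# are hypothetical, not any real instruction set. Conditions on "bcs"
# and the jsr return-address push are deliberately not modeled in run().

BRANCH_OPS = ("br", "bcs", "jmp", "jsr")


def run(program, labels, steps):
    """Trace a machine with one delay slot and no hardware fix-up.

    Returns the list of addresses executed, in order.
    """
    pc, next_pc = 0, 1            # executing pc; next_pc is already fetched
    trace = []
    for _ in range(steps):
        op, arg = program[pc]
        trace.append(pc)
        if op in BRANCH_OPS:
            target = labels[arg]  # takes effect only after the shadow
        else:
            target = next_pc + 1  # plain fall-through
        pc, next_pc = next_pc, target
    return trace


def return_address(program, jsr_addr):
    """Pick the return address of a jsr on the two-trailer machine.

    Rule from the post: the first trailer if it is a branch, else the
    second trailer if it is a branch, else the instruction after that.
    Assumes the program extends at least two slots past jsr_addr.
    """
    for offset in (1, 2):
        if program[jsr_addr + offset][0] in BRANCH_OPS:
            return jsr_addr + offset
    return jsr_addr + 3


LABELS = {"looptop": 0, "subr": 3}
PROGRAM = [
    ("op", None),        # 0  looptop: ...
    ("jsr", "subr"),     # 1
    ("br", "looptop"),   # 2  sits in the shadow of the jsr
    ("op", None),        # 3  subr: first instruction
    ("op", None),        # 4  second instruction of subr, never reached
    ("rts", None),       # 5  never reached
]
```

Calling run(PROGRAM, LABELS, 8) gives the trace 0, 1, 2, 3, 0, 1, 2, 3:
address 3 (the first instruction of subr) is executed once per trip
around the loop and address 4 never is.  For the syscall_enter example,
return_address() on a jsr followed directly by a bcs returns the
address of the bcs itself.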
Was this complexity worth it?  Well, presumably it improved code
density.  I am not aware of any measurements.  It made breakpoint
debugging a serious pain; you needed a zillion types of breakpoint
instructions so that if you set a breakpoint on a branch that was the
second trailer of a jsr, the breakpoint instruction itself would *also*
look like a branch.

Anyway the machine was not successful, in part because of the
complexity of trying to implement a lot of little rules like this in
hardware.
-- 
Jeff Berkowitz N6QOM                    uunet!sequent!jjb
Sequent Computer Systems                Custom Systems Group