Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!pasteur!ucbvax!hplabs!hp-sdd!ncr-sd!se-sd!lord
From: lord@se-sd.NCR.COM (Dave Lord )
Newsgroups: comp.arch
Subject: Re: delayed branch
Message-ID: <1996@se-sd.NCR.COM>
Date: 2 Aug 89 18:10:14 GMT
References: <2246@taux01.UUCP>
Reply-To: lord@se-sd.UUCP (Dave Lord (SSP))
Organization: NCR Corporation, SE-San Diego
Lines: 19

In article <2246@taux01.UUCP> cdddta@tasu76.UUCP (David Deitcher) writes:
>"Delayed branch" is a technique used by RISC machines to make use of the
>extra cycle needed to calculate branch targets. The compiler will put
>an instruction after the branch to be executed by the CPU while the
>branch target is being calculated. Does anyone have information as to
>how often the compiler is able to put a useful instruction after the
>branch as opposed to filling it with a NOP?

You mean in theory or in real life? :-) I've looked at code generated
by three different compilers for the 88K (Green Hills, GNU, & LPI) and
I don't believe any of them EVER put a useful instruction in the
delayed branch slot. Admittedly the 88K is still pretty new and these
were all early compilers. I suspect that the reason the delayed branch
slots are not used is that the register allocators are not smart
enough to hold a register across a branch. Hopefully this will change.
Does anyone have any idea what percentage of typical code is branches?
It would be interesting to know how much performance could be gained
by filling those slots.
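
As a rough way to frame that last question, here is a back-of-envelope
sketch in Python. It assumes a single-issue machine at one cycle per
instruction, one delay slot per branch, and that every unfilled slot
costs exactly one NOP cycle. The function name and the 20% branch
fraction are made up for illustration; they are not measurements from
any 88K compiler.

```python
def delay_slot_speedup(branch_fraction, fill_rate):
    """Estimated speedup over never filling a delay slot (all NOPs).

    branch_fraction: branches per useful instruction (hypothetical input).
    fill_rate: fraction of delay slots holding a useful instruction.
    """
    if not 0.0 <= branch_fraction <= 1.0 or not 0.0 <= fill_rate <= 1.0:
        raise ValueError("both arguments must be between 0 and 1")
    # Cycles per useful instruction: 1 for the instruction itself,
    # plus one NOP cycle for each branch whose slot is left unfilled.
    cycles_all_nops = 1.0 + branch_fraction
    cycles_with_fill = 1.0 + branch_fraction * (1.0 - fill_rate)
    return cycles_all_nops / cycles_with_fill


if __name__ == "__main__":
    # 0.20 is an illustrative guess, not a measured branch frequency.
    for fill in (0.0, 0.5, 1.0):
        print(fill, round(delay_slot_speedup(0.20, fill), 3))
```

Under these assumptions the gain can never exceed 1 + branch_fraction,
so the answer hinges on the branch percentage asked about above.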