Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!ncar!noao!nud!tom
From: tom@nud.UUCP (Tom Armistead)
Newsgroups: comp.arch
Subject: Re: Branch Delay Annullment
Message-ID: <1081@nud.UUCP>
Date: 15 Jun 88 16:59:16 GMT
References: <22065@amdcad.AMD.COM>
Reply-To: tom@nud.UUCP (Tom Armistead)
Organization: Motorola Microcomputer Division, Tempe, Az.
Lines: 55

In article <22065@amdcad.AMD.COM> tim@amdcad.AMD.COM (Tim Olson) writes:

[Concerning 88k branch delay slot handling ]

>This is the opposite of what the SPARC annulled branch does -- it
>squashes untaken branches.  Squashing the untaken branches seems more
>effective to me.  Take, for example, a simple loop:
>	load	r0, addr
>loop:
>	add	r0, r0, 1
>	store	r0, addr
>	add	addr, addr, 4
>	add	count, count, 1
>	cpge	bool, count, MAX
>	  jmpf	bool, loop		/* squashed on fall-through */
>	  load	r0, addr
>of the jump.  Since loops are usually executed many times, the
>annul-untaken form would seem to give the best overall performance.
>Any thoughts as to the benefits of annul-taken form?

   The same loop can be written in 88k assembly as follows.  (I'm not
familiar with SPARC code, so I hope this is equivalent; it illustrates
the point anyway.)

(I presume addr, count, and bool are registers.)

loop:
	ld	r2,addr,0
	add	r2,r2,1
	st	r2,addr,0
	add	count,count,1
	cmp	bool,count,MAX
	bb0.n	eq,bool,loop		; This branch effectively takes
	add	addr,addr,4		; only 1 tick.


    The number of instructions in the loop is the same in either the
annul-taken form or the always-executed form (for this example,
anyway).  The only slight difference is that the always-executed form
needs no "cleanup" instruction.  There might be some instances where
annul-taken is better, but I don't know of any specific ones.
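
    For anyone who wants to check the comparison, here is a rough C model
of the two loops.  It is only a sketch under assumptions that are not in
either posting: MAX is 4, the elements are ints, count starts at zero,
and the function names are made up.  The first function models the quoted
loop, where the delay-slot load runs only when the branch is taken; the
second models the bb0.n loop, where the delay-slot add always runs.

```c
#define MAX 4   /* hypothetical trip count, chosen only for this sketch */

/* Model of the quoted loop.  The load in the delay slot is squashed on
   fall-through, so one copy of the load sits ahead of the loop.
   Assumes count starts at 0. */
void bump_annulled(int *addr)
{
    int count = 0;
    int r0 = *addr;              /* load ahead of the loop */
    for (;;) {
        r0 = r0 + 1;             /* add   r0, r0, 1 */
        *addr = r0;              /* store r0, addr */
        addr = addr + 1;         /* add   addr, addr, 4 (one int) */
        count = count + 1;       /* add   count, count, 1 */
        if (count >= MAX)        /* cpge + jmpf: not taken, slot squashed */
            break;
        r0 = *addr;              /* delay slot, runs on the taken path only */
    }
}

/* Model of the 88k loop.  The instruction after bb0.n always executes,
   so the pointer bump can live in the delay slot.
   Assumes count starts at 0. */
void bump_always(int *addr)
{
    int count = 0;
    int taken;
    do {
        int r2 = *addr;          /* ld  r2,addr,0 */
        r2 = r2 + 1;             /* add r2,r2,1 */
        *addr = r2;              /* st  r2,addr,0 */
        count = count + 1;       /* add count,count,1 */
        taken = (count != MAX);  /* cmp + bb0.n eq: branch while not equal */
        addr = addr + 1;         /* delay slot, runs taken or not */
    } while (taken);
}
```

Calling both on copies of the same MAX-element array should leave every
element one larger in each copy.  Note that the model of the quoted loop
never loads past the end of the array, because the last pass skips the
delay-slot load.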

    As an aside, the 88k has a scaled-index addressing mode that allows
the above code to be written more efficiently as:

	add	count,r0,MAX-1
loop:
	ld	r2,addr[count]
	add	r2,r2,1
	st	r2,addr[count]
	bcnd.n	ne0,count,loop		; This branch effectively takes
	sub	count,count,1		; only 1 tick.
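
    The count-down loop above can be modelled in C the same way.  Again
this is a sketch under assumptions of my own: MAX is 4, the elements are
ints, and the function name is made up.  The point to notice is that the
branch tests count before the delay-slot subtract runs, so the pass with
count equal to 0 is still executed and all MAX elements are touched.

```c
#define MAX 4   /* hypothetical trip count, chosen only for this sketch */

/* Model of the indexed loop: count serves as both the loop counter and
   the array index, running from MAX-1 down to 0. */
void bump_indexed(int *addr)
{
    int count = MAX - 1;         /* add count,r0,MAX-1 */
    int taken;
    do {
        int r2 = addr[count];    /* ld  r2,addr[count] */
        r2 = r2 + 1;             /* add r2,r2,1 */
        addr[count] = r2;        /* st  r2,addr[count] */
        taken = (count != 0);    /* bcnd.n ne0: tested before the slot runs */
        count = count - 1;       /* delay slot, runs taken or not */
    } while (taken);
}
```

The separate pointer increment and the compare both disappear, which is
where the saving of two instructions per pass comes from.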
-- 
Just a few more bits in the stream.

The Sneek