Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!bu-cs!purdue!i.cc.purdue.edu!k.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: A simple question on RISC
Summary: Most of the time, branch at the end is much better
Message-ID: <1019@l.cc.purdue.edu>
Date: 19 Nov 88 11:11:31 GMT
References: <6544@xanth.cs.odu.edu> <75577@sun.uucp> <1618@imagine.PAWL.RPI.EDU> <oXV-=Wy00V4109jmdK@andrew.cmu.edu>
Organization: Purdue University Statistics Department
Lines: 77

In article <oXV-=Wy00V4109jmdK@andrew.cmu.edu>, jk3k+@andrew.cmu.edu (Joe Keane) writes:
> Robert Firth writes:
< > The only language that defines a FOR loop that is always executed at least
< > once is Fortran, and it requires the sign of the step to be known at compile
< > time.
> 
> C's do-while is uncommon, but they haven't removed it yet.
> 
< > For all other languages, you have to hand code a compare at the top of the
< > loop anyway, so it's pretty pointless to code it again at the bottom.
< 
> The naive way to code `for (i = lower; i < upper; i += step) foo;' is:
> 
>         mov     lower,i
> test    cmp     i,upper
>         blt     end
>         foo
>         add     step,i
>         bra     test
> end
> 
< > (Well, actually you put the compare at the bottom and have an initial branch
< > to it, but you see what I mean.)
> 
> Good idea; let's try it:
> 
>         mov     lower,i
>         bra     entry
> body    foo
>         add     step,i
> entry   cmp     i,upper
>         bge     body
> 
> Statically, same number of instructions; dynamically, less except in the
> zero-trip case.  But look, the add, compare, and branch are consecutive.
> 
> Suppose we want to eliminate the unconditional branch.  We can duplicate the
> test at the top.  Or the compiler may figure out that the initial test can't
> fail (mostly if the loop has constant bounds).  In either of these cases, the
> add, compare, and branch are still consecutive.
> 
> --Joe

Even if we do not eliminate the conditional branch, it should be pointed out
emphatically that the second version is likely to be faster.  With the
conditional branch at the beginning, we have two consecutive branch instruc-
tions at run time, and this can be quite expensive timewise.  The other way,
only one branch instruction is run PER LOOP.  And Joe's remark that the 
programmer or the compiler may know that the loop will be run at least once
is important.  If the language does not allow the programmer to provide this
information, this is a built-in machine-independent deficiency in the language.

The only time the branch-at-the-beginning wins is when the loop should not be
performed.  If this is an important consideration, a third alternative is:


         mov     lower,i
 test    cmp     i,upper
	 bge	 lpexit
 body    foo
         add     step,i
 	 cmp     i,upper
         bge     body
 lpexit

This is one instruction longer, and meets the speed of the branch-at-the-
beginning code if the loop is not to be executed, but will beat it if the
loop is executed at least once.  It may be marginally slower than Joe's 
if the loop is carried out.  I would recommend this one if zero execution
steps are of moderate importance.  But I find it hard to see any justification
for Robert's objection to having two conditional branches in the CODE,  one
being in the loop, rather than having a conditional and an unconditional branch
in the LOOP.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)