Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!bu-cs!purdue!i.cc.purdue.edu!k.cc.purdue.edu!l.cc.purdue.edu!cik From: cik@l.cc.purdue.edu (Herman Rubin) Newsgroups: comp.arch Subject: Re: A simple question on RISC Summary: Most of the time, branch at the end is much better Message-ID: <1019@l.cc.purdue.edu> Date: 19 Nov 88 11:11:31 GMT References: <6544@xanth.cs.odu.edu> <75577@sun.uucp> <1618@imagine.PAWL.RPI.EDU> Organization: Purdue University Statistics Department Lines: 77 In article , jk3k+@andrew.cmu.edu (Joe Keane) writes: > Robert Firth writes: < > The only language that defines a FOR loop that is always executed at least < > once is Fortran, and it requires the sign of the step to be known at compile < > time. > > C's do-while is uncommon, but they haven't removed it yet. > < > For all other languages, you have to hand code a compare at the top of the < > loop anyway, so it's pretty pointless to code it again at the bottom. < > The naive way to code `for (i = lower; i < upper; i += step) foo;' is: > > mov lower,i > test cmp i,upper > blt end > foo > add step,i > bra test > end > < > (Well, actually you put the compare at the bottom and have an initial branch < > to it, but you see what I mean.) > > Good idea; let's try it: > > mov lower,i > bra entry > body foo > add step,i > entry cmp i,upper > bge body > > Statically, same number of instructions; dynamically, less except in the > zero-trip case. But look, the add, compare, and branch are consecutive. > > Suppose we want to eliminate the unconditional branch. We can duplicate the > test at the top. Or the compiler may figure out that the initial test can't > fail (mostly if the loop has constant bounds). In either of these cases, the > add, compare, and branch are still consecutive. > > --Joe Even if we do not eliminate the conditional branch, it should be pointed out emphatically that the second version is likely to be faster. With the conditional branch at the beginning, we have two consecutive branch instruc- tions at run time, and this can be quite expensive timewise. The other way, only one branch instruction is run PER LOOP. And Joe's remark that the programmer or the compiler may know that the loop will be run at least once is important. If the language does not allow the programmer to provide this information, this is a built-in machine-independent deficiency in the language. The only time the branch-at-the-beginning wins is when the loop should not be performed. If this is an important consideration, a third alternative is: mov lower,i test cmp i,upper bge lpexit body foo add step,i cmp i,upper bge body lpexit This is one instruction longer, and meets the speed of the branch-at-the- beginning code if the loop is not to be executed, but will beat it if the loop is executed at least once. It may be marginally slower than Joe's if the loop is carried out. I would recommend this one if zero execution steps are of moderate importance. But I find it hard to see any justification for Robert's objection to having two conditional branches in the CODE, one being in the loop, rather than having a conditional and an unconditional branch in the LOOP. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)