Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!amdahl!chuck
From: chuck@amdahl.amdahl.com (Charles Simmons)
Newsgroups: comp.arch
Subject: Re: taken -vs- untaken branches
Message-ID: <20339@amdahl.amdahl.com>
Date: 5 Jan 88 06:30:34 GMT
References: <496@cresswell.quintus.UUCP> <638@l.cc.purdue.edu>
Reply-To: chuck@amdahl.amdahl.com (Charles Simmons)
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 48

In article <638@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>                                                       The programmer
>(or the algorithm designer) usually knows [whether a branch is more
>likely to be taken or untaken;]
>                       I suggest that there be a way to tell the compiler
>which, and also that the architecture should use a bit in the branch
>instruction to tell the machine which.
>-- 
>Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
>Phone: (317)494-6054
>hrubin@l.cc.purdue.edu (ARPA or UUCP) or hrubin@purccvm.bitnet

Here's an idea for how a compiler might be able to automagically
make a reasonable guess for determining whether or not a branch
is more likely to be taken or not taken.  Feedback is welcome.

Some analyses of program behavior suggest that most conditional
expressions are comparisons against the constant zero.  After
this, there are a large number of comparisons against a constant.
One method of allowing a compiler to automagically guess whether
a branch will usually be taken or not, would be for the compiler
to estimate the number of legal values that one side of the expression
can take on, and the number of legal values that the other side of
the expression can take on.  These two estimates can then be used
to estimate how often the expression will be TRUE.

For example, consider the fragment of code:

    if (p == NULL) {
	...
    }

Here, the compiler might think "'p' can take on 2**32 values,
NULL can take on one value, so the estimated number of times
these two expressions are equal is 1/2**32.  Since this number
is less than 1/2, the condition will usually be false."

Obviously, this algorithm is going to be most effective when
the programmer is comparing for equality (or inequality) against a
constant.  This leads to a simplification which might be effective
in a large percentage of comparisons.  If a compiler generates
code to test a variable for equality against a constant, it might
assume that the test would usually be false; similarily, when
generating code to test for inequality against a constant, the
compiler would assume the test would usually be true.

Comments?

-- Chuck