Path: utzoo!utgpu!cunews!cognos!jimp
From: jimp@cognos.UUCP (Jim Patterson)
Newsgroups: comp.arch
Subject: Re: Optimising C compiler question
Message-ID: <9537@cognos.UUCP>
Date: 16 Apr 91 01:08:20 GMT
References: <1991Apr8.193155.3911@vax5.cit.cornell.edu> <1991Apr11.003431.24918@alzabo.ocunix.on.ca> <9526@cognos.UUCP> <1991Apr15.134417.24380@bigsur.uucp>
Reply-To: jimp@cognos.UUCP (Jim Patterson)
Distribution: comp
Organization: Cognos Inc., Ottawa, Canada
Lines: 79

In article <1991Apr15.134417.24380@bigsur.uucp> coopteam@bcarh680.bnr.ca (Jean Laliberte) writes:
>In article
>    <9526@cognos.UUCP>
>    jimp@cognos.UUCP (Jim Patterson)
>writes:
>>I don't think a compiler can ever know absolutely that a given variable
>>should be in a register whereas another should not, for "performance".
>>
>>[description deleted]
>>[example deleted]
>>
>Is there some reason register allocation couldn't also be dynamic? In
>this example, i would be stored in the register for the most part,
>but when the if() statement evaluates to true, 'store i' and 'load j'
>instructions would be executed, and j would then be the register
>variable (when the if() condition is over, i would be put back in the
>register).

There's no reason but it's obviously a trade-off as well. If you're
loading it for a single reference, it might be more efficient to
reference it on the stack. That is, there is a cost to loading and
storing registers which needs to be factored into the overal cost
computation.  You also have to consider code which needs to reference
both variables e.g. the test "j < i" in the example I gave.

Actually, the point is really that the compiler need not put j into a
register at all since it's in code that won't be executed most times
the program is run.

I agree, it's an over-simplification to assume that only one of the
two variables can go into a register. Consider instead a situation
where the "for-i" block either does it's "free-memory" loop using j,
or another piece of logic using a variable "k" which is kept across
iterations.

for (i=0, k=-1; i<10000; ++i) {
    e=malloc(sizeof(element));
    if (!e) {
       for (j=0; j<10000; ++j)
          if (j < i)   
              free(table[j]);
       getout(); /* Leave routine now */
    } else {
       if (k >= 0)
         table[k]->link = e;
       ++k;
    }
    table[i] = e;
}
table[k]->link = 0;

If the compiler considers both paths equally likely it may
continue to assign the single register to j, or could dump and load j
and k into the single register as required. However, we have information
the compiler doesn't, to wit that the "j" path will not normally be
executed, so we can see that allocating a register to j but not k, or
spending any additional time in the "k" code segment to optimize the "j"
code path, will not benefit performance. The best-performing compiler
will put k into a register (if it has two), and leave j as a memory
variable.

I know that many compiler optimizers are quite sophisticated, and
likely do a better job than a naive programmer putting "register" on a
few declarations. In fact even an experienced programmer would not
likely find a register allocation strategy that would do better than a
good code optimizer over a range of architectures.  My point is simply
that due to lack of information, no optimizer can provide an optimal
register allocation for best performance (i.e. speed) of a specific
application and specific data inputs. The example cites a situation
where a "statistical" guess as to the code's behaviour is likely to
lead to wrong decisions by the optimizer. 


-- 
Jim Patterson                              Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp        P.O. BOX 9707    
PHONE:(613)738-1440 x6112                  3755 Riverside Drive
                                           Ottawa, Ont  K1G 3Z4