Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!spool.mu.edu!uunet!mcsun!ukc!acorn!john
From: john@acorn.co.uk (John Bowler)
Newsgroups: comp.arch
Subject: Re: Optimising C compiler question
Summary: Don't use register...
Keywords: register
Message-ID: <6389@acorn.co.uk>
Date: 15 Apr 91 15:36:19 GMT
References: <1991Apr8.193155.3911@vax5.cit.cornell.edu> <1991Apr11.003431.24918@alzabo.ocunix.on.ca>
Distribution: comp
Organization: Acorn Computers Ltd, Cambridge, UK
Lines: 145

In article <1991Apr11.003431.24918@alzabo.ocunix.on.ca> andras@alzabo.ocunix.on.ca (Andras Kovacs) writes:
>umh@vax5.cit.cornell.edu writes:
>>C has provision for register variables, which are supposed to run faster than
>>standard variables. How come one never sees register variables in any code?

(Try looking at traditional UNIX toolkit code :-)

>>Are modern RISC compilers sufficiently good that they automatically make
>>sensible choice of register variables? Can I make my code run slower by using
>>them?
>
>  My compiler is Norcroft ARM C V1.54A (admittedly an early release).

Very, very early indeed :-).

>  It puts
>the first 10 variables into registers and the others are loaded/stored on
>demand. Now using 'register' means the same as moving the variable declaration
>into the first 10.

Norcroft for the ARM no longer does this. Register colouring was implemented
a long, long time ago (well, several years). For a long time Norcroft
ignored the register directive completely (apart from the traditional whinge
if you tried to take the address of a register variable :-). One of the more
recent changes was to start taking note of the declaration again - it really
will put the variable into a register if you ask it.
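
For anyone who has not met the keyword, here is a minimal sketch of what standard C does and does not promise about ``register''. The function name sum_to is made up for illustration. Whether i and total really land in machine registers is entirely up to the compiler; the only guaranteed effect is that their addresses cannot be taken.

```c
/* Hypothetical example function; the name sum_to is made up here. */
int sum_to(int n)
{
    register int i;          /* a request only; the compiler may ignore it */
    register int total = 0;

    for (i = 1; i <= n; i++)
        total += i;

    /* int *p = &total;      not allowed: the address of a register
                             object cannot be taken, whether or not it
                             actually ends up in a machine register */
    return total;
}
```

Uncommenting the pointer line should produce the complaint mentioned above on any conforming compiler.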
When compiling ``traditional'' UNIX code I use the definition:-

    #define register ___type

(___type is a built-in type with no attributes, originally added to allow a
safer offsetof macro; it means that the traditional aberration:-

    register a;

still compiles!)

>  Would be the compiler better, it would analyze the code and
>decide register usage based on that analysis;

In addition to this, Norcroft now does proper variable lifetime analysis, so
that the compiler knows when a variable is *really* no longer required, and
hence knows when a register can be reused.

> but then 'register'-ing a var
>could have two possible effects:
> 1, It is obeyed - but then better if you know what you are doing otherwise
>    you can indeed slow down the code, or
> 2, Disregarded - because the compiler trusts itself that indeed it knows the
>    best allocation scheme.

I used to favour (1), but the problems caused by wanton addition of register
declarations to code just because it used to be compiled by pcc (;-) have
since caused me to favour (2). In theory there are isolated cases where the
programmer really does know that the static analysis which the compiler does
will give the wrong result. In practice very few programmers have the
necessary training to be able to recognise these circumstances, and, of
those who can do it, very few have the inclination. IMHO it is normally far
better to *restructure the code* so that the static analysis is correct, or
doesn't matter. Normally if the compiler cannot understand it, neither can I.

>  I assume that good compilers use the second approach - not out of disregard
>to the programmer but either the programmer asks for the right var to be
>register and then it is already; or the compiler KNOWS that the register var
>would cost execution speed and then what is the point of using it?
>
>  I hope my view is not too simplistic; could someone with actual experience
>follow up on the subject?

My experience is limited mainly to Norcroft and the ARM.
I cannot see any justification for using ``register'' in this environment.
If the compiler chooses to put the wrong thing into a register I would much
rather have the compiler writers fix the compiler than attempt to fix all my
code. (If fixing the compiler wasn't an option I would choose a different
approach, such as recoding the time critical part in assembler.)

It is certainly true that the use of ``register'' is inherently
non-portable; either it is being used for some machine specific reason (e.g.
for its effect on variable values after a BSD UNIX vfork system call), or it
is being used to provide a speed-up. In the latter case the benefit must be
machine dependent; even if you only use one register declaration per
function you could still upset compilers which do global optimisation.

The real benefits come from telling the compiler things which it cannot
know, rather than attempting to do its job for it. For example:-

    {
        int temp;

        function(&temp);    /* Result of function not required */
        ...
        /* temp is used as a temporary variable */
        calculations involving temp
        ...
    }

is much better written:-

    {
        {
            int temp;

            function(&temp);    /* Result not required */
        }
        {
            int temp;
            ...
            /* temp is used as a temporary variable */
            calculations involving temp
            ...
        }
    }

The compiler doesn't read the comments (;-) so it doesn't know that the
(potential) aliasing of temp at the function call is irrelevant; as a result
the code which performs the calculations is likely to be far less
efficiently compiled (certainly if it calls any functions!). All the
register declarations in the world will not help this, and yet similar
things happen with monotonous regularity in typical C code.

The original poster asked about declaring variables within blocks (as in the
second piece of code above) - this is the thing to do!
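
The two fragments above can be written out as a complete translation unit, as a rough sketch only. The names probe, step, sum_single_scope and sum_split_scope are invented for illustration; probe stands in for function() and the loop stands in for the "calculations involving temp". Both versions compute the same value. Whether the second actually compiles to better code depends on the compiler, so inspect the generated assembler rather than taking it on trust.

```c
/* Hypothetical stand-in for the post's function(): it writes through
   the pointer, so the caller's variable has had its address taken. */
static void probe(int *out)
{
    *out = 42;
}

/* Hypothetical helper: any call inside the calculation gives the
   compiler a reason to worry about an address-taken variable. */
static int step(int x)
{
    return x + 1;
}

/* One declaration: temp's address escapes to probe(), so a cautious
   compiler may keep temp in memory across every later call. */
int sum_single_scope(int n)
{
    int temp;
    int i;

    probe(&temp);               /* result not required */

    temp = 0;
    for (i = 0; i < n; i++)
        temp += step(i);
    return temp;
}

/* Split scopes: the address-taken temp ends at its closing brace, and
   the second temp never has its address taken at all. */
int sum_split_scope(int n)
{
    {
        int temp;

        probe(&temp);           /* result not required */
    }
    {
        int temp = 0;
        int i;

        for (i = 0; i < n; i++)
            temp += step(i);
        return temp;
    }
}
```
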
By declaring variables only where the programmer thinks they are needed, and
by ``undeclaring'' them (by closing the block) when they are finished with,
the programmer helps the compiler by telling it quite clearly how long the
values are needed. Most of the time a good compiler can work this out for
itself; however, the cases where it cannot are often the cases where its
register allocation will screw up.

Incidentally, someone else observed that this can be expensive because the
compiler may insist on allocating space for variables when blocks are
entered rather than at the entry to the function. Clearly the compiler does
not *need* to do this; this is just a quality of implementation issue.
Norcroft *does* do this, but the code overhead is, at most, a subtraction
from the stack pointer on block entry and an addition to it on exit (on the
ARM; in some cases the addition or subtraction may be combined with the
first operation which stores a value to the stack). This is well worth it
because of the saving in stack space - very important in the market at which
the ARM architecture is aimed (cheap (<$1000) (RISC) PCs).

John Bowler (jbowler@acorn.co.uk)