Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!think!yale!cmcl2!lanl!lambda!jlg
From: jlg@lambda.UUCP (Jim Giles)
Newsgroups: comp.lang.misc
Subject: 'register' variables and other goodies (was Re: Common subexpression optimization)
Message-ID: <14226@lambda.UUCP>
Date: 6 Feb 90 01:00:40 GMT
References:
Lines: 184

From article , by pcg@rupert.cs.aber.ac.uk (Piercarlo Grandi):
> [... 'register' is only a 'noalias' substitute - I said ...]
>
> Well, there is also actually a strong *hint* that the variable will
> be heavily *dynamically* used across statement boundaries, in the
> current scope. The compiler can use this hint very effectively.

Such a hint is only useful to an exceptionally dumb compiler. Most
modern compilers do better data flow analysis than human programmers
are willing to do. Studies have shown that compilers can consistently
do a _very_ good job of register allocation without such hints. In
fact, most C compilers simply ignore the 'register' attribute except
to verify that the variable is never used with an 'address-of' (&)
operator. In the overwhelming majority of programming environments,
the only way to beat the compiler's code is to switch to assembly.
The use of 'register' or other such tricks is insignificant (sometimes
even damaging).

> [...]
> A careful programmer will never declare a variable for a scope larger
> than that in which it is used, [...]

I disagree. A careful programmer will neither write extremely
monolithic code, nor write code which is fragmented into lots of
little scopes. If I have a 50 line function which uses XYZ in only
the middle third, I'm not likely to make a new scope just for XYZ.
I _may_ split out the middle third _if_ it seems to be a distinct and
nearly independent segment (but then I would ask myself whether the
code should actually all be in the same function at all).

> [...] will never reuse a variable with the same
> name for two different roles.

Now, I agree with that. However, I make one caveat: in most languages,
including C, the idiom for index variables in loops is single letter
names. Such variables are often used in separate loops (with
non-overlapping scopes, of course) without any sign of confusion that
I've ever noticed. Of course, you _could_ argue that the variable
_isn't_ being used in separate roles - it's always an index variable.

It is interesting that you consider using the same name for two
different things to be very evil and yet you consider having two names
for a single object (aliasing) to be acceptable. For most people, the
degree of evil is the other way 'round.

> [...]
> Of course this is because I am eccentric enough to think
> that the performance characterization of a program is
> part of its design and pragmatics should be as obvious
> as semantics.

One should only degrade the readability of code with performance
enhancing transformations when absolutely necessary: that is, when the
performance of the code would otherwise be unsatisfactory. This
usually applies only to a very small part of the code for any given
project. Most of the code is under another sort of optimization
pressure altogether: the pressure to work correctly, to get written
quickly, to be maintainable, to be easy to enhance when new demands
are made on the program, etc. Even the code that needs to work _FAST_
should first be written as clearly as possible and only _then_
optimized.

> [... Other languages don't usually _need_ 'register' ...]
> Let me differ. Fortran has had equivalence and common forever,
> and even if dirty tricks are prohibited in theory, most compilers
> cannot assume users are well behaved [...]

There is a difference between aliasing and storage association.
Common blocks can cause storage association between different objects
- but _NOT_ within the same scope. The only type of aliasing possible
with common is passing common variables through the argument list.
But this is illegal! And most compilers _DO_ assume that such aliasing
has not occurred.

As for equivalence: that is a _LOCAL_ declaration. The compiler can
clearly see which variables are aliased and which aren't. There is no
need for a 'register' attribute to declare this information; the
compiler is already explicitly aware of it.

Fortran 90, on the other hand, has introduced pointers which can point
to other (non-dynamic) objects. In this language, something analogous
to the 'register' attribute is needed. The method chosen by the
committee was symmetric to the C solution: Fortran 90 has the POINTEE
attribute, which tells the compiler that an object _may_ be aliased.
This means that the default attribute of most Fortran 90 objects is
effectively 'register'.

> [...] Pascal has 'var' parameters,

Yes, but Pascal is not separately compilable. The compiler can do a
complete interprocedural dataflow analysis to find out _unambiguously_
what arguments might be aliased with what global variables.

> [...] and, more murkily, variant
> records without discriminant.

C _also_ has variant records without discriminants! They're called
unions. The kinds of problems caused by such things are usually
type-coercion problems and have nothing to do with 'register'
attributes or aliasing in the hidden sense. After all, the declaration
of such a union is clearly visible in any scope that can reference any
field in it - so the compiler can already determine that overlapping
fields _might_ be aliased (just like Fortran's EQUIVALENCE, in fact).

> [...]
> the absence or presence of 'noalias' (and 'volatile') does change the
> semantics of a program, while for 'register' this is not true; [...]

This is false. The 'register' attribute should have no effect on the
_semantics_ of a correct program. If you don't use the address-of (&)
operator on a variable, the presence or absence of 'register' on that
variable should have NO effect on the semantics of the code.
If it does, I suspect that you have a broken compiler!

> [...] 'register'
> also gives a *positive* hint on usage frequency.

Yes, one that is practically useless on a good compiler. In fact, if
the compiler makes an effort to put your 'register' vars into
registers, it may actually _inhibit_ optimal register utilization.

The fact of the matter is, usage frequency is one of the things I
expect the language environment to tell ME, not the other way around.
The compiler and the run-time profiler are where that information
comes from - not from my gut feeling. Now, if the profiler could feed
information back into the compiler for a subsequent compile, THAT
might be a useful feature.

> [...]
> With 'register' safety (no aliasing) is a side effect, but a
> clever one, under more than one aspect.

Actually, no aliasing is about the _only_ useful feature of the
'register' attribute. And it is one that is basically unneeded in
other languages. Even so, it's not all that useful - unless you like
to explicitly copy-in copy-out all the global variables that you wish
to use. And it doesn't help with array manipulation at all (arrays
are still pointers and are assumed aliased to everything that's not
'register').

> [...]
> What ruins the alias show for most languages is either separate compilation
> or parameter passing, even where pointers are not present.

It's not either-or. The problem arises only with _both_ parameter
passing and separate compilation. Without separate compilation, the
compiler can look at the whole call tree to detect any possible
aliasing. Without call by reference there is no problem (and this is
the only parameter passing mechanism which causes the problem). Even
so, a run-time test for aliasing would be cheap and easy to implement,
but the current lack of interest indicates to me that there really
isn't much of a problem.
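
The run-time aliasing test mentioned above might look roughly like the
following C sketch. It is only an illustration under stated
assumptions: the names ranges_overlap, add_noalias, add_mayalias and
vec_add are hypothetical, the code relies on C99 features ('restrict'
and uintptr_t), and it assumes a flat address space in which pointers
converted to integers compare in address order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a+an) and
   [b, b+bn) overlap, 0 otherwise.  The pointers are converted to
   uintptr_t because relational comparison of pointers into unrelated
   objects is undefined in ISO C (assumption: flat address space). */
int ranges_overlap(const void *a, size_t an, const void *b, size_t bn)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa < pb + bn && pb < pa + an;
}

/* 'restrict' promises the compiler that dst and src do not alias,
   so it is free to keep values in registers and reorder accesses. */
void add_noalias(double *restrict dst, const double *restrict src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] += src[i];
}

/* No promise here: the compiler must assume dst and src may overlap. */
void add_mayalias(double *dst, const double *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] += src[i];
}

/* One cheap test on entry selects the variant that is safe to use. */
void vec_add(double *dst, const double *src, size_t n)
{
    if (ranges_overlap(dst, n * sizeof *dst, src, n * sizeof *src))
        add_mayalias(dst, src, n);
    else
        add_noalias(dst, src, n);
}
```

The test costs two comparisons per call, regardless of the length of
the arrays.
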
Compilers _DO_ optimize as if such aliasing has not taken place and
they _DON'T_ usually test for aliasing - yet the frequency of such
errors is very small.

> [...] And even if
> this not true, the compiler still has to guess which of the safe variables
> are actually worth caching.

But this is something that the compiler is usually _very_ good at.
Better than most people have time to be. (And, if you _do_ have time,
you're better off going to assembly, where your register declarations
aren't hints but _orders_!)

> [...]
> Naturally this is a moot point on many of today's architectures,
> where register optimization is entirely unnecessary, as there usually
> far more registers available than variables [...]

Additional registers aren't a panacea. For one thing, as the number
of registers increases, the register allocation schemes have gone
'cosmopolitan'. There is an increasing effort to keep data in
registers across procedure calls. The less efficiently the code uses
the registers, the less efficiently it runs - even if there are a
massive number of registers.

For another thing, register utilization is not the only thing
inhibited by aliasing! All those multi-register machines you're
talking about are also likely to be pipelined. Pipelining is inhibited
by possible aliasing as badly as (or worse than) register utilization.
And there's NO source level control (in C or anywhere else) over code
ordering optimizations on the pipelining level.

The best thing is for the language to be designed in such a way that
aliasing is difficult and rare and is only forced upon the user when
it is actually part of the functionality he needs.

J. Giles
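
As an illustration of the copy-in point made earlier in this post,
here is a small C sketch of why a compiler cannot keep a value in a
register when a pointer might alias it. The function names
scale_in_place and scale_copy_in are hypothetical and are not taken
from the discussion.

```c
#include <stddef.h>

/* The compiler cannot prove that factor does not point into dst,
   so *factor must be reloaded from memory after every store. */
void scale_in_place(double *dst, const double *factor, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] *= *factor;
}

/* Copy-in variant: f is a local whose address is never taken
   ('register' makes taking it a compile-time error), so no store
   through dst can change it and it can stay in a register. */
void scale_copy_in(double *dst, const double *factor, size_t n)
{
    register double f = *factor;
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] *= f;
}
```

The two functions agree whenever factor points outside dst. Called on
the array {2, 3, 4} with factor pointing at its first element, the
first yields {4, 12, 16} and the second {4, 6, 8}. That difference is
exactly why the compiler may not make the transformation on its own
when aliasing is possible.
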