Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site utcsri.UUCP
Path: utzoo!utcsri!greg
From: greg@utcsri.UUCP (Gregory Smith)
Newsgroups: comp.lang.c
Subject: Re: short circuit evaluation
Message-ID: <4315@utcsri.UUCP>
Date: Thu, 5-Mar-87 17:36:39 EST
Article-I.D.: utcsri.4315
Posted: Thu Mar 5 17:36:39 1987
Date-Received: Thu, 5-Mar-87 21:06:35 EST
References: <844@wanginst.EDU> <4211@utcsri.UUCP> <609@viper.UUCP>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 94
Summary:

In article <609@viper.UUCP> john@viper.UUCP (John Stanley) writes:
>The example Greg provides here is misleading.
>
>If you have three expressions:
>    1:  a+b+c
>    2:  (a+b)+c
>    3:  a+(b+c)
>all three of them should not be evaluated the same way.  Greg implys that
>they should be.  This is not so.  When an equation contains '(' and ')'
>it intentionaly (and explicitly) defines the parse tree structure that will
>result.  The statement "redundant ()'s grow in C like mushrooms" may be true,
>but it doesn't give anyone the right to arbitrarily ignore explicit cues
>to the compiler.  When I don't care, I don't use them.  When I do, I do
>so for a reason..........
>

How do you tell an explicitly defined tree from an implicit one?
a+b+c parses to the same expression tree as (a+b)+c. That could be
changed (i.e. () info could be included in the expr. tree). But what
about (a+b)*c? There is no way to represent that *expression tree*
without parentheses. How do I then write this in such a way that the
compiler is allowed to rearrange it?

My point is that ()'s serve to override the binding strengths of
operators, and thus allow arbitrary expressions to be constructed.
They cannot be used to prevent optimizations, since they are *required*
in many cases, simply to write the expression in the first place.
I want my compiler to be free to change (i+3)*4+6 to i*4+18 (more on
this later).
If I don't want that, there are means of avoiding it. Furthermore,
a redundant () may very well not be "intentional", in the applicable
sense of the word, and therefore should not be construed as "explicit".
More later.

>    Since there is additional, undesireable, and unnecessary overhead in the
>detection of this SUM(a1,a2,a3...,aN) special case, and since there appears
>to be little or no advantage to doing so (you have to add them up in -some-
>order, you might as well let the programmer decide as anyone), why bother?
>

Wrong. There is an advantage to doing the sum in an arbitrary order.
First of all, it rarely makes a difference to the result (if it does,
it is the programmer's fault, by definition :-) ). It allows constants
to be folded to the end: (a+2)+(b+9)+4 = (a+b)+15. It allows fewer
registers to be used: ((a+b)+(c+d))+((w+x)+(y+z)), if done literally,
requires 3 working values at one point, while a running sum never
requires more than one. The programmer cannot determine the best order,
because the programmer is not writing code specifically for a certain
machine, right? :-) The compiler is supposed to do that.

Multiple constants in running sums tend to pop up (1) from macro
expansions and (2) in expressions like (p+3)->foo.y[i-1].abc (after the
semantics have been applied). In the latter case the programmer can't
control them because s/he can't really see them. One could of course
distinguish the programmer-created weirdness from the internally-created
kind, but why not optimize both of them? One cannot, of course,
distinguish macro-created weirdness under the current preprocessor
paradigm.

And once you convert arbitrary additions into rearrangeable running
sums, it becomes very attractive to convert things like (i+7)*4 + 2
into i*4 + 28 + 2 and then into i*4 + 30. Again, this comes up a *lot*
with array operations, and again, if it breaks a program, that program
was probably doomed anyway.

For more on this sort of thing, see "A Tour through the UN*X C Compiler"
in the Version 7 books.

PS if you don't believe me about the mushrooms, I give you the
expansion of putchar(getchar()):

    (--((&_iob[1]))->_cnt>=0?
    ((int)(*((&_iob[1]))->_ptr++=
    (unsigned)((--((&_iob[0]))->_cnt>=0?
    *((&_iob[0]))->_ptr++&0377:
    _filbuf((&_iob[0])))))):_flsbuf((unsigned)((--((&_iob[0]))->_cnt>=0?
    *((&_iob[0]))->_ptr++&0377:_filbuf((&_iob[0])))),(&_iob[1])));

Most of those '()'s are in there to enforce precedence against
arbitrary parameters. E.g. if I write

    #define INCH_TO_CM(x) x*2.54

then INCH_TO_CM(a+b) becomes a+b*2.54, which is wrong. To be safe,
I have to write

    #define INCH_TO_CM(x) ((x)*2.54)

>John Stanley (john@viper.UUCP)
>Software Consultant - DynaSoft Systems
>UUCP: ...{amdahl,ihnp4,rutgers}!{meccts,dayton}!viper!john

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...