Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site utcsri.UUCP
Path: utzoo!utcsri!greg
From: greg@utcsri.UUCP (Gregory Smith)
Newsgroups: comp.lang.c
Subject: Re: short circuit evaluation
Message-ID: <4211@utcsri.UUCP>
Date: Mon, 23-Feb-87 20:52:57 EST
Article-I.D.: utcsri.4211
Posted: Mon Feb 23 20:52:57 1987
Date-Received: Tue, 24-Feb-87 03:45:34 EST
References: <844@wanginst.EDU> <4700004@uiucdcsm>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 83
Summary: reasons for compilers rearranging stuff.

In article <4700004@uiucdcsm> mccaugh@uiucdcsm.UUCP writes:
>
> This is not submitted as a criticism to the foregoing erudite discussion of
> side-effects, but is rather an innocent question about C in particular: why
> does C refuse to abide by the associativity/precedence rules for expression-
> evaluation that even BASIC guarantees? I can well understand "optimization"
> as an excuse but can easily imagine cases where normally-evaluated expressions
> can crash a system when "optimized" for evaluation without the programmer's
> express consent. Isn't it a little arbitrary for C to manipulate the parts
> of an expression to its satisfaction (or whim)? In particular, this renders
> the formal verification of C-code impractical.

C does abide by associativity and precedence rules, which are laid out in
the appendix of K&R. A-B+C means (A-B)+C and not A-(B+C). A*B+C/D means
(A*B)+(C/D), and not A*((B+C)/D).

What you are complaining about is sequence of evaluation. If I write
a=b*c+d*e, do I care which multiply is done first? The reason you have not
seen anybody complaining about this in other languages is that these other
languages do not have side-effects in expressions (at least not to the same
extent). If I write a=foo()+bar(), I may indeed care whether foo() is called
before bar(), in which case I *won't write that*.
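
A minimal sketch of the foo()+bar() point. The names foo and bar come from the
text above; their bodies, the shared counter, and the foo_seq/bar_seq
bookkeeping are assumptions made up purely for illustration. The sum is the
same either way, but which call runs first in foo() + bar() is left to the
compiler, whereas separate statements pin the order down.

```c
#include <assert.h>

static int counter = 0;        /* shared state: the side effect */
static int foo_seq, bar_seq;   /* record when each function last ran */

static int foo(void) { foo_seq = ++counter; return 10; }
static int bar(void) { bar_seq = ++counter; return 20; }

/* The order of the two calls is up to the compiler; only the sum is fixed. */
int sum_unordered(void)
{
    return foo() + bar();
}

/* Separate statements force foo() to run before bar(). */
int sum_ordered(void)
{
    int f = foo();
    int b = bar();
    assert(foo_seq < bar_seq);  /* holds on every conforming compiler */
    return f + b;
}
```

Both functions return 30. After sum_unordered(), nothing can be said about
whether foo_seq is less than bar_seq; after sum_ordered(), it always is.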

The problem has arisen through the notation we use in writing expressions.
An expression has a tree structure, but it is written in a linear left to
right fashion. Associativity and precedence rules serve merely to define how
an expression tree is extracted from its linear form. Unfortunately, the
linear form imposes an artificial ordering on an expression. The '+'
operator is perfectly commutative, so that A+B and B+A are the same.
Unfortunately, you have to write either A or B first. Even an operator like
/ has the same problem: if there were an 'under' operator \ you could have
your choice of A/B or B\A; the justifiable lack of a \ operator forces you
to write A before B even though they are at equal levels in the expression
tree.

The philosophy of the C language is that these limitations of the linear
notation should not be allowed to restrict code generation. In an expression
where two subexpressions X and Y must be added together, it may be vastly
more efficient to evaluate Y before X. The programmer is not able to
determine this, and is nevertheless forced to write one before the other.

Putting it another way, a tree which represents an expression may be
converted to linear form (traversed) in many ways. Some ways will be more
efficient for evaluation, and that is hopefully what the compiler will
generate. Some ways may be more useful for human reading, and that is what
the programmer will write (e.g. in a divide, the human always writes the
dividend before the divisor, due only to lack of the aforementioned '\'
operator). As long as the same tree is involved in both cases, what's wrong
with that?

It is worth noting at this point that the parentheses '(' and ')' serve only
in the process of converting a linear expression to a tree. Thus (a+b)*c is
different from a+b*c but a+(b+c) can be treated the same as (a+b)+c.
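
The idea that one tree admits several evaluation orders can be sketched in C.
The Node type and the eval_left_first / eval_right_first helpers are
hypothetical names invented for this illustration, not anything taken from a
real compiler. Both functions walk the same tree; they differ only in which
child they visit first, and with side-effect-free leaves they must agree.

```c
#include <assert.h>

typedef struct Node {
    char op;                    /* '+', '-', '*', '/', or 0 for a leaf */
    int value;                  /* used only when op == 0 */
    const struct Node *left, *right;
} Node;

static int apply(char op, int l, int r)
{
    switch (op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return l / r;     /* '/' */
    }
}

/* Visit the left operand first, as the source text reads. */
int eval_left_first(const Node *n)
{
    if (n->op == 0) return n->value;
    int l = eval_left_first(n->left);
    int r = eval_left_first(n->right);
    return apply(n->op, l, r);
}

/* Visit the right operand first; each operand still keeps its role. */
int eval_right_first(const Node *n)
{
    if (n->op == 0) return n->value;
    int r = eval_right_first(n->right);
    int l = eval_right_first(n->left);
    return apply(n->op, l, r);
}

/* Build the tree for 12/4 + 2*5 and check that both orders agree. */
int demo(void)
{
    Node a = {0, 12, 0, 0}, b = {0, 4, 0, 0};
    Node c = {0, 2, 0, 0},  d = {0, 5, 0, 0};
    Node quot = {'/', 0, &a, &b};
    Node prod = {'*', 0, &c, &d};
    Node sum  = {'+', 0, &quot, &prod};
    assert(eval_left_first(&sum) == eval_right_first(&sum));
    return eval_left_first(&sum);   /* 12/4 + 2*5 = 13 */
}
```

Note that visiting the divisor before the dividend does not swap them: the
traversal order changes, the tree does not.
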
For a+(b+c), simply allow your tree to contain a node which forms the sum of
its three equally ranked children, and throw in the fact that redundant ()'s
grow in C like mushrooms. Then a+(b+c) should be treated as a+b+c or
SUM(a,b,c). (I know, there are overflow considerations.)

If the expression tree contains no side-effects, it will make no difference
in which order it is evaluated (just like in BASIC). However, if one branch
of the tree changes the value of X and another branch uses the value of X,
the result will depend on which branch is done first. Some would say, "the
compiler should detect such cases and then use the order of the original
expression". However, it is *impossible* to detect all of these cases at
compile time. So the programmer is simply warned of the cases where
expressions may be rearranged, and avoids expressions which work differently
depending on how they are arranged. Note that these cases (where results are
undefined) are *well defined* by the language, so you know what to avoid.

[ I know this is an oversimplification and doesn't deal with delaying of
side-effects, or with sequence points, etc., but I wanted to explain this
stuff to those who just can't fathom what all the fuss is about and why the
damn compilers don't just do what they're told ]

--
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...