Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!brl-adm!adm!mnc@m10ux.uucp
From: mnc@m10ux.uucp
Newsgroups: comp.lang.c
Subject: Re: Disregarding parentheses in C
Message-ID: <7018@brl-adm.ARPA>
Date: Tue, 21-Apr-87 11:01:16 EST
Article-I.D.: brl-adm.7018
Posted: Tue Apr 21 11:01:16 1987
Date-Received: Wed, 22-Apr-87 03:20:53 EST
Sender: news@brl-adm.ARPA
Lines: 100

These arguments about whether C is "allowed to disregard parentheses" seem
to be degenerating into confusion based on lack of precise terminology
and/or understanding of the concepts involved. In particular, some people
seem to be needlessly alarmed about the possibility of the compiler
misinterpreting their parenthesized expressions. At the risk of appearing
pedantic, I'd like to inject a couple of definitions and a little careful
exposition into the conversation, in the hope of unmuddying the water.

First the definitions (please bear with me -- I get to the point between
and following the definitions):

OPERATOR PRECEDENCE - a ranking of operators (e.g. +,-,*,/) in a language
from "most tightly binding" to "least tightly binding". The purpose is to
define the meaning of expressions of the form "v1 op1 v2 op2 v3", where the
vn's are operands (such as numbers) and the op's are operators. In the
example, assuming op1 has higher precedence (binds more tightly) than op2,
the meaning would be the same as if "(v1 op1 v2) op2 v3" were written.
Notice the mnemonicity of "more tightly binding". Op1 grabs onto its
arguments more tightly than op2, hence op1 gets to use v2 as its argument
and op2 does not. That is, op1 must be applied first to v1 and v2, then the
result supplied as left operand to op2. Conversely, if the precedence were
reversed, the interpretation would be "v1 op1 (v2 op2 v3)". This leaves
open the question of what to do when op1 and op2 have the same precedence
(see ASSOCIATIVITY, below).
Note that parentheses can always be used to explicitly alter the default
precedence rules. For instance, in "(v1 op1 v2) op2 v3", op1 must be
applied to v1 and v2, then the result used as the left operand of op2
(actually, I mean that the result must be as if the evaluation happened in
this order, although any evaluation mechanism guaranteed to produce the
right answer is okay). In C, as in mathematics, such parentheses cannot in
general be ignored, because that would lead to the wrong answer. For
instance, do not worry that any C compiler will treat "(5+6)*7", which is
77, as "5+6*7", which is 47.

OPERATOR ASSOCIATIVITY - a set of operators of the same precedence (e.g.,
+,-) is defined to be either left or right associative. If left
associative, an expression of the form "v1 op1 v2 op2 v3 ...", where all
the op's are the same precedence, is to be treated the same as
"((v1 op1 v2) op2 v3) ...". That is, each op is treated as having higher
precedence than an op of the same precedence (and at the same level in the
expression) but farther to the right. In mathematics, and most computer
languages, virtually all operators are left associative (with the possible
exception of the power operator, as in "x ** y ** z").

Finally, we say that an operator is "associative" if it does not matter (to
the final result) whether multiple occurrences of that operator in an
expression are treated as left associative or right associative. For
example, "+" is associative (in mathematics, and computers that don't
overflow!), because (x+y)+z = x+(y+z), but "-" is not.

Again, parentheses can be used to bypass the default associativity rules
where needed. And again, C does not in general allow these parentheses to
be ignored. No C compiler will treat "5-(6-3)", which is 2, the same as
"5-6-3", which is -4 (but see the exception below).

EVALUATION ORDER - The order in which operations are applied to values,
during the computation of the value of an expression.
In general, the set of possible evaluation orders is constrained by the
precedence and associativity of the operators in the expression and by the
explicit use of parentheses. (Hence, those who say it is completely
independent of parsing/precedence are just plain wrong -- it may be
independent of the order in which a parse tree is constructed, but is not
independent of the shape of the parse tree.) Often, there is more than one
correct evaluation order. For example, to compute "2 * 3 + 5 * 7" one can
evaluate "2 * 3" before or after "5 * 7", but both must be evaluated before
"+".

The cases where this definition gets sticky are when, due to the occurrence
of associative operators, an evaluation order which would not ordinarily be
allowed is guaranteed to produce a correct result, such as evaluating
"(4+5)+6" by doing "5+6" first, then "4+11". Here is the crux of the
confusion and disagreement we have seen in recent articles. Is the
evaluation just described a legal evaluation order for "(4+5)+6"? I'm
inclined to say, no, it is a legal evaluation order for the different, but
mathematically equivalent expression "4+(5+6)", but this is semantics for
semantics' sake. The evaluation of an expression is a private matter
between consenting compilers and computers and is not any business of the
programmer, given that the correct result is produced.

Now finally, the main point: The source of this whole unpleasant business
is that C is defined as though its "+" and "*" operators were associative
(int or float, it doesn't matter to the discussion). That is, it allows the
evaluation of "x+y+z" (or even "(x+y)+z") as though they were written
"x+(y+z)". This does not necessarily produce the same answer in the
presence of overflow and roundoff errors, hence the C operators are NOT
associative, unlike the corresponding ideal mathematical operators. Is this
a bug in the C definition? It depends on your point of view (or your
programming needs), I guess.
To summarize, C does not allow compilers to "disregard parentheses" in
general. It only allows the reparenthesization of the expression in a
manner that would be mathematically equivalent if the operators were
"ideal", i.e., perfectly accurate and never over- or underflowing. While
this may be an unfortunate violation of some persons' notion of
mathematical correctness, it allows a class of optimizations which the
majority of C programmers apparently want.
--
Michael Condict		{ihnp4vax135cuae2}!m10ux!mnc
AT&T Bell Labs		(201)582-5911    MH 3B-416
Murray Hill, NJ