Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!novavax!twwells!bill
From: bill@twwells.com (T. William Wells)
Newsgroups: comp.lang.c
Subject: Re: precedence of && (was: precedence of ?:)
Message-ID: <1989Sep14.175841.8086@twwells.com>
Date: 14 Sep 89 17:58:41 GMT
References: <1265@gmdzi.UUCP> <11030@smoke.BRL.MIL> <11039@smoke.BRL.MIL> <3236@solo10.cs.vu.nl> <11045@smoke.BRL.MIL> <3242@solo12.cs.vu.nl> <11054@smoke.BRL.MIL>
Organization: None, Ft. Lauderdale, FL
Lines: 87

In article <11054@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
: I seem to recall Dave Prosser telling me that the Standard's grammar
: for C (apart from the preprocessor) constituted a "phrase structure"
: grammar.  Perhaps if I knew what that meant I'd understand the
: rationale behind these particular expression parsing rules.

The term "phrase structure grammar" is pretty much equivalent to
"specified by a pure BNF". It also implies that features of the
language that are not stated directly in the grammar can nonetheless
be specified in terms of it, i.e., that not only does the grammar
accept C but that, by and large, the resulting parse makes some kind
of sense.

C can be specified by an LALR(1) grammar if the lexical scanner
returns tokens that distinguish type names from other identifiers. If
it doesn't, there is no unambiguous context-free grammar for it. (I
base this on some code that demonstrated that the language is
ambiguous; if the language is ambiguous, then there is no unambiguous
grammar for it.) LALR(1) matters because a number of widely used
tools (like Yacc) accept LALR(1) grammars.
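
Here is a small sketch of the kind of ambiguity involved. It is not
the code I mentioned above, just an illustration of my own, and the
names T, as_cast and as_subtraction are made up for it. The token
sequence "(T) - 1" is a cast if T is a type name and a subtraction if
T is an ordinary identifier, so the parser cannot choose without
knowing which kind of name T is:

```c
typedef int T;              /* at file scope, T names a type */

/* T is a typedef name here, so "(T) - 1" is a cast applied to the
   unary expression "- 1". */
int as_cast(void)
{
    return (T) - 1;         /* (int)(-1) */
}

/* A local variable named T hides the typedef, so the very same
   tokens "(T) - 1" are now a binary subtraction. */
int as_subtraction(void)
{
    int T = 10;
    return (T) - 1;         /* 10 - 1 */
}
```

as_cast() yields -1 and as_subtraction() yields 9, although the
return statements are token-for-token identical.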

(BTW, Doug, I expect that you know much of this, but I figured I'd
say it for everyone else.)

: It does appear that the current rules avoid parsing ambiguity.  That's
: probably a worthwhile constraint.

Very much so. One can write an ambiguous grammar and then have
extragrammatical constraints on the parse, to make it unambiguous.
Yacc does this and, for certain kinds of grammars, the parsers are
smaller and faster than they would be if the grammar were written
unambiguously and without the extra constraints.

But leaving the grammar specification itself ambiguous just means
that you will end up with portability problems, as has happened with
the ?: and = situation.

---

The way to determine if a piece of text is legal C is to try to parse
it with the grammar. If that fails, it isn't legal C. Then look at
each phrase of the parse and see if there are any constraints that
are violated. If so, again it isn't legal.
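
A minimal sketch of the two stages, using examples of my own (the
function name two_stage_example and the rejected lines are
hypothetical, chosen only for illustration). The rejected statements
are kept in comments so that the rest still compiles:

```c
int two_stage_example(void)
{
    int x = 3;
    int y;

    /* Fails the first stage: no production of the grammar derives
       this token sequence, so it does not parse.
           y = x + ;
    */

    /* Passes the first stage but fails the second: "1" is a
       unary-expression, so this matches
           unary-expression assignment-operator assignment-expression
       but it violates the constraint that the left operand of an
       assignment be a modifiable lvalue.
           1 = x;
    */

    y = x + 1;              /* parses and violates no constraint */
    return y;
}
```

Only the last statement survives both checks; the function returns 4.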

The ?: operator is specified in the grammar by:

	conditional-expression:
		logical-OR-expression
		logical-OR-expression ? expression : conditional-expression

The assignment operators are specified by:

	assignment-expression:
		conditional-expression
		unary-expression assignment-operator assignment-expression


A logical-OR-expression can't directly contain an assignment. If there
is one, it has to be in parentheses. Thus, expressions like:

	a = b ? c : d

have to be parsed as a = (b ? c : d) and this is how they have always
been. On the other hand,

	a ? b = c : d

should parse as a ? (b = c) : d, since the middle operand is a full
expression, but I've seen compilers that refuse to parse it because
the specification in K&R didn't say exactly what kind of expression
belonged in the middle. Such are the perils of not having a formal
specification. Finally,

	a ? b : c = d

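can't be parsed as (a ? b : c) = d, since a ? b : c isn't a
unary-expression. Read strictly, the grammar above doesn't allow
a ? b : (c = d) either, since the third operand must be a
conditional-expression and c = d isn't one. Compilers that take it
as a ? b : (c = d) are being more permissive than the grammar; as far
as I know, there are no compilers that this breaks on.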
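
The groupings above can be checked with a few small functions. This
is only a sketch; the function names are my own, and the pointer
parameters are there just so the caller can see whether the
assignment happened. The last function writes its parentheses out,
which sidesteps any question about how a ? b : c = d is read:

```c
/* a = b ? c : d   groups as   a = (b ? c : d) */
int assign_from_conditional(int b, int c, int d)
{
    int a;
    a = b ? c : d;
    return a;
}

/* a ? *b = c : d   groups as   a ? (*b = c) : d
   The middle operand is a full expression, so the assignment
   needs no parentheses. */
int assign_in_middle(int a, int *b, int c, int d)
{
    return a ? *b = c : d;
}

/* Assignment in the last operand, with explicit parentheses. */
int assign_in_last(int a, int b, int *c, int d)
{
    return a ? b : (*c = d);
}
```

For instance, assign_in_middle(1, &b, 5, 9) stores 5 in b and returns
5, while assign_in_middle(0, &b, 5, 9) leaves b alone and returns 9.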

Similarly, one analyzes a && b = c: the operands of && are specified
as logical-AND-expression and inclusive-OR-expression, neither of
which can contain an unparenthesized assignment. And the left operand
of = must be a unary-expression, which a && b certainly isn't. Thus
this expression is not legal C.
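
If the assignment is really wanted as the right operand of &&, the
parentheses have to be written. A minimal sketch (the function name
and_then_assign is made up, and the pointer is only there so the
caller can observe the assignment):

```c
int and_then_assign(int a, int *b, int c)
{
    /* The unparenthesized form has no parse under the grammar:
           return a && *b = c;
    */

    /* With parentheses, the assignment is a primary-expression and
       can serve as the right operand of &&. It is evaluated only
       when a is nonzero. */
    return a && (*b = c);
}
```

With a nonzero, *b receives c and the result is 1 if c is nonzero;
with a zero, *b is left untouched and the result is 0.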

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com