Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uunet!mcsun!unido!ecrcvax!micha From: micha@ecrcvax.UUCP (Micha Meier) Newsgroups: comp.lang.prolog Subject: Re: Ambiguities in WG17's syntax Message-ID: <798@ecrcvax.UUCP> Date: 8 Dec 89 11:12:45 GMT References: <24702@cup.portal.com> Reply-To: micha@ecrcvax.UUCP (Micha Meier) Organization: ECRC, Munich 81, West Germany Lines: 74 In article <24702@cup.portal.com> pgl@cup.portal.com (Peter G Ludemann) writes: >In an earlier article, I said that I had tried to make an LR(k) >grammar for Prolog. Richard O'Keefe said that no programming >language is LR(k). Of course, Algol-60, C, Pascal, etc. are >not LR(k) or even context-free. Like all computer languages, >they are context-sensitive. But they can easily be made >context-free by adding a bit of intelligence to the lexical >analyzer. And the same can be done for Prolog. I'm afraid this is not the case. No matter how clever your lexical analyzer is, the parser has to cope with terms like prefix infix infix infix ... which cannot be resolved in k steps. In the following 'p' means prefix, 'i' infix, upper case letters mean that the corresponding symbol is parsed as functor, lower case means that it is parsed as an atom, ! means the main functor of the whole term. 1) p. 2) P! i. 3) p I! i. 4) P! i I i. or P i I! i. 5) p I! i I i. or p I i I! i. 6) P! i I i I i. or P i I! i I i. or P i I i I! i. ... As you can see, on even lines the odd symbols represent functors and the even ones atoms, on odd lines it is the opposite. If there are several alternatives in one line, they can be resolved using the precedence and associativity, but until your parser knows how many symbols in sequence there are, it cannot start to apply these rules. This is an instance of the general problem that the associativity and precedence rules allow you to specify which functor binds more than the other one, but it does not tell you which symbol is to be taken as functor. This problem can be resolved e.g. by imposing that an atom which has been declared as a prefix operator always keeps its precedence, even if used as atom, like in Quintus. This of course makes the grammar less clear to normal users but seems to be a good compromise. Another problem concerns operators which are both infix and postfix. When you immagine a sequence of such operators, there is 2^(n-1) of possible interpretations, e.g. for 4: a Ip! ip Ip ip a Ip ip Ip! ip a Ip! ip iP iP a Ip ip iP iP! a iP Ip! ip iP a iP Ip ip iP! a iP iP Ip! ip a iP iP iP iP! (Ip means it is taken as infix, iP as postfix). Here, every symbol, except the first one, can be interpreted either as atom or as functor, although the precedence and associativity can help. Quintus 2.0 and others have the non-documented feature that (a ip ip) is parsed as ip(a, ip) even if the postfix precedence is lower than the infix one: | ?- op(200, yf, a), op(300, xfy, a). yes | ?- display(b a a). a(b,a) In fact, nobody has yet specified whether in this case the precedence applies or whether the term is ambiguous. We have solved this problem in Sepia by forbidding the infix/postfix combination and this seems to be a reasonable restriction. --Micha Meier