Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!aplcen!uakari.primate.wisc.edu!brutus.cs.uiuc.edu!apple!fox!portal!cup.portal.com!pgl
From: pgl@cup.portal.com (Peter G Ludemann)
Newsgroups: comp.lang.prolog
Subject: Re: Prefix operators and blanks (was: WG17)
Message-ID: <25053@cup.portal.com>
Date: 15 Dec 89 05:51:36 GMT
References: <24703@cup.portal.com> <1412@gould.doc.ic.ac.uk>
Organization: The Portal System (TM)
Lines: 62

It's precisely the kinds of problems which Chris Moss mentions
which prompted my posting.

If the Edinburgh public-domain parser is unambiguous, then
we ought to have a precise grammar (BNF, W-grammar or whatever)
which describes the language it generates.  The current N40
draft uses some formal notation and a fair bit of English 
(I discovered that I was looking in the wrong place for the
rules on prefix operators and blanks) but there is still a lot
of ambiguity (I've already given some examples in an earlier
article).

From the various postings to the net, it is clear that there are
many different philosophies (and implementations) of the bits
which are not precisely defined in Clocksin&Mellish or other
reference books.

So, I suggest one of the following be adopted:

1. Enshrine the public-domain Edinburgh parser (where can
   I get a copy, by the way?).  This would certainly be
   novel (I wonder if ISO would accept it?).

2. Take the current grammar and make sure that the English
   comments are well-organized and precise (this would also
   be novel; every language standard that I have read has
   somewhere had a formal description of the grammar).

3. Make an unambiguous grammar.  This is not easy, but I
   think that it is necessary.  The job can be made much
   easier by disallowing most of the current ambiguities
   (they probably aren't used much; and eliminating them
   gives the parser a better chance at generating meaningful
   errors).  

   Some of the things which would be disallowed are:
	operators used as ordinary atoms (they must be
		either quoted or inside parentheses).
		Thus, `f(+,-)' would be illegal but
		`f('+',(-))' would be legal.
	right-to-left and left-to-right operators of the
		same priority juxtaposed.
	similarly with the various cases of prefix/infix/suffix
		which I gave in an earlier note.
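
To make the first restriction concrete, here is a rough sketch in
Python (not taken from any Prolog system) of a check that flags
unquoted operators standing alone as a term.  The operator table,
the token classes and the function names are all assumptions made
for illustration; a real reader would consult its op/3 declarations
and handle far more of the syntax.

```python
import re

# Hypothetical operator table, for illustration only.
OPERATORS = {"+", "-", "*", "/", "=", "is"}

TOKEN_RE = re.compile(r"""
    (?P<quoted>'(?:[^']|'')*')
  | (?P<number>\d+)
  | (?P<var>[A-Z_][A-Za-z0-9_]*)
  | (?P<unquoted>[a-z][A-Za-z0-9_]*|[-+*/<>=:]+)
  | (?P<punct>[(),])
  | (?P<blank>\s+)
""", re.VERBOSE)

OPEN, CLOSE, COMMA = ("punct", "("), ("punct", ")"), ("punct", ",")


def tokenize(text):
    """Split text into (kind, lexeme, start, end) tuples, dropping blanks."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "blank":
            tokens.append((m.lastgroup, m.group(), m.start(), m.end()))
        pos = m.end()
    return tokens


def bare_operator_atoms(text):
    """Return the unquoted operators that are used as ordinary atoms."""
    toks = tokenize(text)
    bad = []
    for i, (kind, lexeme, _start, _end) in enumerate(toks):
        if kind != "unquoted" or lexeme not in OPERATORS:
            continue
        prev = toks[i - 1][:2] if i > 0 else None
        nxt = toks[i + 1][:2] if i + 1 < len(toks) else None
        # An operator with no operand on either side is being used as an atom.
        stands_alone = prev in (None, OPEN, COMMA) and nxt in (None, CLOSE, COMMA)
        if not stands_alone:
            continue
        if prev == OPEN and nxt == CLOSE:
            # "(" directly after an atom, with no blank, opens an argument
            # list; any other "(" is plain grouping, which the rule allows.
            is_arg_list = (i >= 2
                           and toks[i - 2][0] in ("quoted", "unquoted")
                           and toks[i - 2][3] == toks[i - 1][2])
            if not is_arg_list:
                continue
        bad.append(lexeme)
    return bad
```

With these assumptions, bare_operator_atoms("f(+,-)") reports both
`+' and `-', while bare_operator_atoms("f('+',(-))") reports nothing,
which matches the two examples given above.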

[Incidentally, there is nothing wrong with the lexical analyzer
distinguishing between quoted-atom and unquoted-atom; this makes
the grammar a bit bigger (every case of "atom" must be replaced
by "quoted-atom | unquoted-atom", except in a few places such as
prefix operators).  As it is, there is some tie-in between the
parser and the lexical analyzer, for example with `-1' (Quintus
has unexpected behaviour with `-1' and `- 1', by the way, not
documented anywhere that I could see).]
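
One way to picture the tie-in for `-1' is a lexer that needs a
little parser knowledge before it can decide whether a minus sign
belongs to a number.  The Python sketch below uses an assumed rule,
not the documented behaviour of Quintus or any other system: a `-'
immediately followed by digits is folded into a negative number
only when the preceding token cannot end an operand; with a blank
in between, the `-' is left as a separate atom for the parser.  All
names here are hypothetical.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<number>\d+)"
    r"|(?P<atom>[a-z][A-Za-z0-9_]*|[-+*/=<>]+)"
    r"|(?P<punct>[(),])"
)


def ends_operand(tok):
    """True if tok could be the last token of an operand,
    in which case a following '-' is treated as infix."""
    kind, value = tok[0], tok[1]
    if kind == "number":
        return True
    if kind == "punct":
        return value == ")"
    # Alphanumeric atoms count as operands; symbolic atoms are
    # assumed to be operators in this sketch.
    return value[0].isalpha()


def tokenize(text):
    """Return (kind, value) pairs, folding '-' into negative numbers
    under the assumed rule described above."""
    toks = []  # entries are (kind, value, start, end)
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        value = int(lexeme) if kind == "number" else lexeme
        fold = (kind == "number"
                and toks
                and toks[-1][:2] == ("atom", "-")
                and toks[-1][3] == m.start()          # no blank after '-'
                and not (len(toks) >= 2 and ends_operand(toks[-2])))
        if fold:
            minus = toks.pop()
            toks.append(("number", -value, minus[2], m.end()))
        else:
            toks.append((kind, value, m.start(), m.end()))
        pos = m.end()
    return [(kind, value) for kind, value, _, _ in toks]
```

Under this rule `-1' becomes a single negative number, `- 1' stays
as the atom `-' followed by the number 1, and `a-1' keeps the `-'
as a separate atom because `a' can end an operand.  Other choices
are equally defensible, which is why the rule needs to be written
down in the grammar.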

No matter what is chosen, some programs will break.  As to
parsers breaking -- well, a few days' work should take care of
that (I have written 3 parsers and speak from experience).  We
MUST have an unambiguous grammar and we MUST make sure that when
a program is taken from one implementation to another, the second
implementation does not silently change the program's meaning.

- peter ludemann	--- my opinions are my own responsibility ---