Path: utzoo!mnetor!uunet!husc6!sri-unix!quintus!ok
From: ok@quintus.UUCP (Richard A. O'Keefe)
Newsgroups: comp.lang.prolog
Subject: Re: BSI syntax
Message-ID: <797@cresswell.quintus.UUCP>
Date: 23 Mar 88 05:11:45 GMT
References: <234@gould.doc.ic.ac.uk>
Organization: Quintus Computer Systems, Mountain View, CA
Lines: 490

In article <234@gould.doc.ic.ac.uk>, cdsm@doc.ic.ac.uk (Chris Moss) writes:
> Forwarded for Roger Scowen -- KRG0@gm.rl.ac.uk
>
> RESPONSE TO COMMENTS FROM RICHARD O'KEEFE ON PROLOG STANDARDIZATION
>
> GENERAL RESPONSE
>
> Richard O'Keefe started by saying that he would respond to the
> mailing from Chris Moss. In fact many comments refer
> to a document (Prolog syntax, Draft 4.1) that
> most news readers (and members of the ISO and BSI panels) will
> not have seen.
> This seems somewhat unfair on readers who will be unable to judge
> whether draft, criticism, or rebuttal is justified.

My postings were in fact a response to Chris Moss's mailing.  They were
not confined to the content of that mailing, true.  It seemed to me
that Chris Moss's mailing implied that the BSI syntax was in a
satisfactory state, and that it wasn't as different from the de facto
standard as people feared.  I set out to show that neither of those
statements is true, and I believe that I succeeded.

Many comments did refer to a document that most news readers won't have
seen.  But then, most news readers won't have seen ***ANY*** of the BSI
documents.  Am I then to say nothing?

As for fairness to readers,
(a) I was quoting from the very latest document I had.  Surely it would
    be more unfair to quote from something I believed to have been
    superseded?
(b) The "February 88" and "Feb 88" documents arrived in my mailbox here
    in the same week.  I had no way of telling who had or had not
    received the document I was quoting.  All I knew was that this was
    the latest document available, sent to me by the author.
(c) In order to permit readers to judge for themselves whether my
    criticisms were justified, I quoted extensively from the document.
    I did not ask anyone to take it on faith that this or that was the
    case: where the grimoire appeared to say something particularly
    silly I exhibited the rules responsible.
This is unfair?

> First some general comments. The objective is to define an
> International Standard for the programming language Prolog.
> This means that standard conforming programs will run correctly
> on standard conforming processors, neither more nor less.
> It will not limit implementers from introducing new features and
> facilities into their Prolog compilers.
>
> Neither will it mean programmers cannot use such extensions; only
> that if they do, their programs will not conform to the standard.
>

This is a little misleading.  The general rule in other languages is
that implementors can add extensions, provided that such extensions are
either illegal or undefined in the standard.  Thus a Pascal compiler
can provide alphabetic labels as an extension.  But an implementor
should not provide an extension which alters the meaning of a program
which the standard would have ruled legal.

Let's apply this to the case of :- read(_). directives in a file which
is being consulted or compiled.  Specifically, let's consider a file
which looks like

    :- read(_).
    p(a).

and has nothing else in it.  Does this define p, or does it not?  The
BSI grammar, in all versions, provides the syntax of entire files:
according to the grimoire this MUST mean exactly one directive followed
by exactly one clause.  Since this is a defined and legal file, it
would be most improper for an implementor to give it any other meaning.
Therefore, reading out of a file being compiled or consulted is NOT a
permitted extension.  (This wouldn't bother Quintus, but it is legal in
some other Prologs.)

Let's apply this to another case: functor/3.  It has always been the
case in DEC-10 Prolog that

    functor(1, 1, 0).
In at least one draft of the BSI built-in predicates document, this has
been required to raise an error.  (BSI Prolog includes an error
handling facility; needless to say it doesn't look like IF/Prolog's or
M-Prolog's or ...)  So a BSI conforming program is entitled to rely on
this error being raised, and an implementor may NOT provide DEC-10
compatibility.

The ANSI C committee have found it necessary to explicitly indicate
which identifiers may be used by implementors.  (The list includes all
identifiers starting with "_" or "str" and there are others I can't
remember right at the moment.)  Why is this?  Because the programmer
needs a guarantee that the identifiers he has chosen for his code won't
be in conflict with an implementation.  For example, (not)/1 is not
defined in the BSI stuff, so Scowen says that an implementation is free
to define it.  But if the implementation is free to do so, then the
programmer ISN'T.  Since setof/3 is not in the BSI Prolog language, a
program which defines

    setof(List, Set) :-
        setof(List, [], Set).

    setof([], Set, Set).
    setof([Head|Tail], Set0, Set) :-
        (   member(Head, Set0) ->
            setof(Tail, Set0, Set)
        ;   /* not member(Head, Set0) */
            setof(Tail, [Head|Set0], Set)
        ).

is a standard-conforming program.  But a Prolog system which is exactly
BSI except for providing setof/3 as an extension is a conforming
processor.  Will such a conforming program run correctly on such a
conforming processor?  You must be joking.

So, taken in their ordinary sense, the claim that "standard conforming
programs will run correctly on standard conforming processors", while
true of some standards, is NOT true of the BSI work, unless "standard
conforming processors" is construed very strictly as meaning "providing
NO additional built-in predicates".  You will recall that Fortran 77
provides the EXTERNAL and INTRINSIC statements precisely to cope with
this problem, and that ANSI C provides the reserved-to-implementors
list and #undef precisely to cope with this problem.
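
To make the clash concrete, here is a minimal sketch of how a program
could at least ask whether a name it wants is already taken by the
system.  It assumes a predicate_property/2 with a built_in property, as
Quintus-style systems provide; nothing of the kind appears in any BSI
draft I have quoted, and name_is_free/2 is a hypothetical helper made up
for this illustration.

```prolog
% name_is_free(+Name, +Arity)
% Succeeds if this system has no built-in predicate Name/Arity, so that
% a program may define its own without colliding with the system.
% ASSUMPTION: predicate_property/2 and its built_in property exist in
% the host Prolog; name_is_free/2 itself is hypothetical.
name_is_free(Name, Arity) :-
    functor(Head, Name, Arity),              % build a most general goal
    \+ predicate_property(Head, built_in).   % not claimed by the system
```

The query ?- name_is_free(setof, 3). fails on any system which already
provides setof/3 as a built-in.  Note that a test like this only
detects the clash at run time; it does nothing to let the program above
be loaded unchanged, which is the guarantee a reserved-names rule would
give.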
BSI Prolog does have some reserved words, but is ludicrously far from
providing a solution to this problem.

> So some features of Edinburgh Prolog will not be in the standard
> because although they fulfilled a need at one time, they are
> not a sensible longterm solution.

Let's be realistic.  There are languages on the horizon which are much
better approximations to logic programming than Prolog.  (NU Prolog has
been around for a while.)  There are lots of software engineering needs
which old Prolog completely failed to address, such as modules.  (Last
I heard, the consensus of the BSI Modules subcommittee was that they
would probably never agree.)  I think we ought to regard Prolog as a
stopgap; and that the goal of the standard should be to protect
EXISTING investments in Prolog.

Frankly, advocates of BSI Prolog, with its use of user-supplied atoms
as stream names, are in no position to talk about sensible solutions.

************************************************************************
** It would be most interesting to have an explicit list of the features
** of Edinburgh Prolog which fulfilled a need at one time and are now
** disliked by the committee, and a description of their replacements.
************************************************************************

> > (4) The basic structure of the BSI approach to syntax has been
> >     to cut the Gordian Goose.  That is, instead of regarding the
> >     (actually rather low) diversity of Prolog syntax as an
> >     opportunity to be solved by making the language more powerful
> >     (e.g. having a table-driven tokeniser), it has been treated as
> >     a problem to be solved by inventing a new, more restricted,
> >     language.
>
> Well, yes and no. Chris Moss has produced tests that give
> different results on every system tested so far. Perhaps there
> is rather more diversity than Richard O'Keefe realizes.
> One objective has been to define a syntax where many existing
> systems would not generally disagree on the meaning of
> standard-conforming programs.

The amount of diversity one perceives depends on which "Prolog" systems
one decides to include in one's sample.  My sample includes only
systems whose implementors _tried_ to be Edinburgh (or at least
Clocksin & Mellish) compatible.  For example, AAIS Prolog is openly and
frankly not an Edinburgh-compatible system.  We may (and should) look
to it for ideas, but we should not include it in a sample of "Edinburgh
compatible" Prologs.  BIM Prolog has its own unique syntax; while we
should perhaps include the '-c' syntax of BIM Prolog in the sample, we
should not include BIM Prolog's native syntax.

If we go by numbers, then Turbo Prolog should determine the syntax of
standard Prolog.  If not by numbers, by what?  Simple justice suggests
that the Prologs to look at are the Prologs whose authors TRIED to be
compatible with one another.  Prudence suggests the same sample.

But even if the diversity among the Prologs whose authors didn't suffer
from NIH-itis is much greater than I believe, that doesn't answer my
point.  What I said was that the diversity should be regarded "as an
opportunity to be solved by making the language more powerful (e.g.
having a table-driven tokeniser)".  [As an aside, this is no more than
Lisp and PopLog already have.]  It turns out that it is quite easy to
write a tokeniser which can handle all of

    ALS Prolog
    Arity Prolog
    BIM Prolog native syntax
    C Prolog
    DEC-10 Prolog
    PopLog (nested comments)
    Quintus Prolog
    Stony Brook Prolog

and can almost handle ADA [ADA is no longer a trademark], simply by
fiddling with a table.  AAIS took exactly this approach (though their
tokeniser is not as flexible as mine).
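
For readers who have not seen the technique, here is a rough sketch of
what "fiddling with a table" can look like for one small corner of the
job: deciding what a quoted token means.  This is NOT my tokeniser; the
syntax names, quote_kind/3, quoted_token/4 and build/3 are all made up
for the illustration, and the only assignment I would vouch for is the
DEC-10 reading of "..." as a list of character codes.  It also assumes
atom_chars/2, atom_codes/2 and char_code/2 behave as they do in current
systems.

```prolog
% quote_kind(Syntax, QuoteChar, Kind)
% The table the tokeniser consults.  Changing dialect means changing
% which rows are in force, not rewriting the tokeniser.
% ASSUMPTION: the rows for 'standard' are illustrative only.
quote_kind(dec10,    '''', atom).    % '...' is an atom
quote_kind(dec10,    '"',  codes).   % "..." is a list of character codes
quote_kind(standard, '''', atom).
quote_kind(standard, '"',  string).  % "..." is a string
quote_kind(standard, '`',  char).    % `x` is a character code

% quoted_token(+Syntax, +QuoteChar, +Chars, -Token)
% Chars is the text found between the quotes, as one-character atoms.
quoted_token(Syntax, Quote, Chars, Token) :-
    quote_kind(Syntax, Quote, Kind),
    build(Kind, Chars, Token).

% build(+Kind, +Chars, -Token): construct the token for each kind.
build(atom,   Chars, atom(Atom)) :-
    atom_chars(Atom, Chars).
build(codes,  Chars, codes(Codes)) :-
    atom_chars(Atom, Chars),
    atom_codes(Atom, Codes).
build(string, Chars, string(Chars)).
build(char,   [Char], char(Code)) :-
    char_code(Char, Code).
```

With this table, ?- quoted_token(dec10, '"', [a,b], T). gives
T = codes([97,98]) on an ASCII system, while the same text read under
the 'standard' row gives T = string([a,b]).  The tokeniser proper never
needs to know which dialect it is reading.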
I found it necessary to support several kinds of quotes in my
tokeniser:

    ATMQT - the quoted thing is an atom (')
    STRQT - the quoted thing is a string ($)
    LISQT - the quoted thing is a list (")
    CHRQT - the quoted thing is a character (`)

Suppose the standard were to adopt this approach, then they could rule,
if they wished, that the standard assignment was "->STRQT, with nothing
being assigned LISQT.  That needn't prevent me reading my existing
code: I'd be able to change the table while reading my old files.  [The
best approach seems to be to associate a read table with a stream;
naturally this is the approach PopLog takes.]  What I have in mind here
is that a file would start with a directive such as

    :- use_syntax(dec10).
or
    :- use_syntax(standard).
or
    :- use_syntax(als).

Especially if the tokeniser were made available to user code (as it is
in the DEC-10 Prolog library, or built-in in NU Prolog), the result
would be a much more powerful language at very little cost to the
implementor.  And conversion from old dialects to the BSI dialect would
be enormously simplified.  Do we need to come up with a "best possible"
tokeniser for the standard?  Of course not.

Again, what are we to do about syntactic variations, such as the
treatment of operators?  My answer, in 1984, was that the standard
should not specify read/1 and write/1, but should specify

    standard_read/1
    standard_write/1

and should allow users to redefine read/1 and write/1, but require that
the initial definitions be the standard one.  consult and compile
should use read/1, not standard_read/1, so that someone who wanted to
read M-Prolog files into standard Prolog could do so by suitably
defining read/1.

Now, if you are a self-appointed standards committee member determined
to impose your vision of what is a "sensible longterm solution" on
every Prolog user whether they like it or not, this sort of approach
won't seem all that attractive.
But if, like me, you think that the people who matter in all this are
the people who have paid money to USE Prolog, and if, like me, you
think that the fact that M-Prolog is appalling is no reason to make
life any harder for people with a lot of data in M-Prolog format than
we have to, you'll think that letting people do

    read(Term) :- magyar_read(Term).

is obviously the way to go.  (It doesn't much matter how you install
your own code in the hook, the important thing is that there should be
a read-hook where you can install your own reader to be used by compile
and consult.)

> PROLOG CONTROL STRUCTURES AS SYNTAX
> > (3) The attempt to describe Prolog control structures as *syntax*
> >     is fundamentally misdirected.
> This is a matter of opinion. One reason for regarding Prolog control
> structures as *syntax* is so that a person or program reading
> a Prolog program can always recognize its overall structure.

It is not a matter of opinion.  Either I am right about this, or I am
wrong.  There is a very important reason for my belief: Prolog is
simply not the sort of language for which this kind of thing can WORK.
Consider the difference between

    foo(X, P, Q, L) :- bag(X, (P & Q), L).
                              ^^^^^^^
and
    de_morgan((P & Q), (R | S)) :- de_morgan(P, R), de_morgan(Q, S).
              ^^^^^^^

The first is code, and the treatment of it in the grimoire is
appropriate.  (That is, it will be mapped to whatever "(and ?P ?Q)"
would have been mapped to in the BSI Lisp-like syntax.)  But the second
is data, and the treatment of it in the grimoire is NOT appropriate.
It will be mapped to whatever "(and ?P ?Q)" would have been mapped to
in the BSI Lisp-like syntax, but it SHOULD be mapped to whatever
"[& ?P ?Q]" would be mapped to.

If we consider a slightly different example:

    baz(X, P, L) :- bag(X, P, L).
                           ^
and
    de_morgan(not(P), R) :- de_morgan(P, R).
                                      ^
we find the opposite problem: the second is data and will be mapped to
whatever "?P" will be mapped to in the BSI Lisp-like syntax, but the
first is code, and should be mapped to whatever "(and ?P)" would be
mapped to, BUT IT WON'T BE.  The trouble is that the grimoire tries to
guess whether something is code or data by looking at its form, but
that's the wrong place to look: the place to look is the predicate
being called.  And the trouble is that we can't build that information
into the grammar, because the programmer can define new predicates with
code-like arguments.

Let me stress this: the whole basis of the build-it-all-into-the-syntax
approach is the assumption that code is code and data are data and
never the twain shall meet.  This is true of Pascal.  It is true of
Fortran.  It is almost true of C.  But it is utterly false of Lisp and
Prolog.  A grammar of this type does not make SENSE for Prolog any more
than it makes sense for Lisp.

I hereby wager US$100, payable once to Chris Moss, that if the next
draft of the grimoire attempts to maintain this rigid distinction
between code and data, I will be able to find inconsistencies like the
ones above in it.  I don't think it's Chris Moss's fault: if anyone can
find a way of working around this basic mistake (not HIS mistake, by
the way, this is the kind of grammar the BSI committee have always
wanted), I'm sure that Chris Moss could.  I make my wager *despite* my
belief in Chris Moss's competence, because I believe that it is
_impossible_ for this approach to work.  (If I do not receive said
draft by the end of this year, the wager will expire.)

> ',' and '&' AS OPERATORS
> > Oddly enough, if one takes the grimoire literally, the user CAN
> > declare ',' and '&' as operators, and can use them in that form.
> > However, ',' and '&' cannot possibly have the same precedence as
> > "," or "&" in BSI Prolog, and it seems clear that (A ',' B) and
> > (A '&' B) must be different terms.
>
> It is not intended that it will be possible to declare ',' and '&'
> as operators.
>

There is nothing in the grimoire to say so, and it is a very odd
restriction.  Intentions are beside the point: all that matters is what
the documents actually say.  It *is* the intention that it should be
possible to write ','(A,B) as a term, and it remains the case that
','(A,B) and '&'(A,B) must be different terms, and if we take the
grimoire literally, neither of them can be the same as (A,B) or (A&B).

[Yes, I know about (P|Q) and (P;Q) in Dec-10 Prolog.  I have always
 thought and said that this was a mistake, and I think it is one of the
 very few areas where a difference between the standard and existing
 practice might be justifiable.
]

> QUOTE OPERATORS USED AS OPERANDS
> > compare(R, X, Y) :-
> >     (   X @> Y -> R = >
> >     ;   X @< Y -> R = <
> >     ;   R = =
> >     ).
>
> Richard O'Keefe realizes that the above example is intended to be
> syntactically incorrect in the standard. When operators are
> used as operands, there many problems of possible ambiguity.
> A cure is still under discussion, but some problems are
> avoided by the rule that "An operator used as an operand must be
> bracketted".
>

Well, it would be more accurate to say that I COMPLAIN that it is
intended to be syntactically incorrect in the standard.  There isn't
any problem of possible ambiguity here whatsoever.

    ) :- (      :- must be infix
    X @> Y      @> must be infix
    Y -> R      -> must be infix
    R = >       = must be infix or suffix, has no suffix reading
    = > ;       > must be atom or prefix, has no prefix reading
    > ; X       ; must be infix
and so on

Now if = and > _both_ had a suffix reading, (R = >) would be ambiguous.
Since neither of them has, there is no ambiguity here at all.  The
elimination of ambiguity is not a very good argument for breaking
existing UNAMBIGUOUS code!

> NEGATION
> > not Goal :-                         % "not" is not a built-in operator
> >     (   ground(Goal) -> \+ Goal     % neither is "\+".
> >     ;   signal_error(instantiation_fault(Goal,0))
> >     ).
> It is intended that Standard Prolog will not contain 'not' or '\+'.
> Standard Prolog will not require systems to implement true
> logical negation and it would be misleading to include an
> operator or predicate that implies that they have done so.
> Instead the way is left open for processors to implement a version
> of 'not' as an extension and still remain standard conforming.
> Standard Prolog will contain a built-in predicate
> that implements 'negation by failure', i.e.
>     fail_if(G) :- call(G), !, fail.
>     fail_if(_).

My main point here was a semantic one.  Most other control structures
are defined in the grammar.  It seems odd that

    ( G -> fail ; true )

should be in the grammar, but that

    fail_if(G)

which is identical in effect, should not.  Because one of these forms
is in the grammar and the other isn't, they have different properties.
For example,

    ( 1 -> fail ; true )

is syntactically illegal, but

    fail_if(1)

is syntactically legal.  There are other differences as well.

If BSI Prolog contains fail_if/1, then it WILL contain '\+', but with a
different name.  Why not use an existing name for an existing
operation?  Looks to me like nonhicinventusitis.  \+ is a crossed-out
|-, meaning, obviously enough, "not provable".

> A program that resolves ambiguity implicitly is not acceptable as
> defining a standard; there must be further definition.
> One reason is that a program specifies too much. Some features need to
> remain 'implementation dependent' because we must not specify
> them, for example: the accuracy and largest values of floating point
> numbers, or the integer value corresponding to a character.
>
> Another reason is that it is harder to understand and find errors.

It is harder to understand and find errors in a program you can run
than in a never-used-anywhere-else formalism?  Judging by the results,
this is the opposite of the truth.

What is the difference between the public-domain DEC-10 Prolog parser
and the BSI grimoire?
Both are programs, in a formalism based on logic.  Neither is more
explicit or less explicit than the other, and both are of similar size.
So what is the difference?  The difference is that the public-domain
DEC-10 Prolog parser CAN be run, HAS been run, and has had most of the
mistakes knocked out of it by actual experience.  The BSI grimoire is
in a new formalism, the definition of which is provided in ***NO*** BSI
document (so that I had to keep guessing what things meant), and each
of the three drafts I have seen was riddled with errors from end to
end.  I haven't told you about all the problems I found; there are
nearly as many problems as rules!

The BSI Prolog group HAVE specified the integer value corresponding to
a character: they require the ISO 8859 character set.  GREAT!

The DEC-10 public-domain ***parser*** does NOT specify the integer
value corresponding to a character (that's the tokeniser's job).  {The
old tokeniser did have ASCII codes built in, but the current version of
the tokeniser uses 0'x syntax for the appropriate constants to avoid
that problem.}  If the BSI committee are so concerned to avoid
character code problems, how come they haven't got anything like 0'x or
`x` (in a standard which doesn't have to cope with existing code that
uses ` as an atom, `x` is a good notation for character code
constants)?

The public-domain tokeniser doesn't specify anything more about
floating point numbers than what they look like, it relies on being
provided with a number_chars/2 predicate (which we want ANYWAY) to do
the actual conversion.

Note that the BSI grimoire says NOTHING about what happens if you write
a constant which exceeds the capacity of your implementation.  Is the
program

    p(1.2e3456).

a BSI-conforming program or not?  Well, syntactically it is, but the
lexical rules say nothing about what it MEANS.  For all that the
grimoire or any other BSI document I can recall says to the contrary, a
Prolog implementation which reads this as

    p(0.0).

is conforming.
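
As an illustration of how little a program can do about this on its
own, here is a small sketch of a conversion wrapper that at least turns
an error raised by the converter into an explicit answer.  It assumes
atom_codes/2, number_codes/2 and catch/3 as found in current systems;
checked_number/2 is a hypothetical name, and nothing here comes from a
BSI document.

```prolog
% checked_number(+Atom, -Result)
% Result is ok(Number) if the text of Atom converts to a number, and
% 'rejected' if the converter raises an error instead.
% ASSUMPTION: atom_codes/2, number_codes/2 and catch/3 are available;
% checked_number/2 is a hypothetical helper.
checked_number(Atom, Result) :-
    atom_codes(Atom, Codes),
    catch(number_codes(Number, Codes), _, fail),  % any error -> failure
    !,
    Result = ok(Number).
checked_number(_, rejected).
```

What ?- checked_number('1.2e3456', R). answers is up to the
implementation: one system may reject the text, another may hand back
an infinity or 0.0 inside ok/1 without complaint.  The wrapper cannot
tell a faithful conversion from a silently wrong one, which is why the
behaviour has to be pinned down in the standard rather than patched
around in user code.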
This kind of thing is a real portability problem; it exists with
respect to integers too.  Is 1000000000000000000 a legal Prolog term?
According to the grimoire, yes.  What does it mean?  The grimoire
doesn't say.

> DISCLAIMER AND CONCLUSION
> Never rely on working papers and draft standards. They are subject to
> changes and review. All documents and working papers, however
> confidently expressed, are also subject to review. There will be no
> standard until the member bodies of ISO have approved it.

But what ELSE is there to comment on?

> Many countries, but not at present USA, have national Prolog panels
> coordinating their views on the emerging standard. I encourage all
> Prolog implementers and users to participate in this effort in order that
> the eventual standard is one that preserves the best of the past
> and also provides development paths for the future.
>
> Roger Scowen, 11 March 1988

Sorry, but it's too late.  Prolog implementors and users should have
been invited to contribute before the committee went on a four-year
binge of inventing their own language.  I explicitly suggested some
years ago that the people at WISDOM should be invited to participate,
and was told that that was out of the question.  I have put a lot of
effort into writing responses to the BSI stuff, and for all the
feedback I've had I might as well have been shouting into a vacuum.

The BSI committee having been resolute in their contempt for existing
Prolog users (I have repeatedly urged that they should explicitly adopt
a principle of not breaking existing code without strong necessity, as
the ANSI C committee did, and the last I heard was that they had
explicitly rejected any such idea), I cannot regard "preserves the best
of the past" as anything but a sick joke.

Look, if you want to preserve the best of the past, why have you
renamed findall/3 to bag/3?  Why have you adopted ESI Prolog-2's
streams rather than Arity/Prolog's streams, despite having been told
about the problems?
Could it be something to do with the fact that the author of that part
of the standard worked for ESI, not for Arity?  Why have you dropped
nl/0 from the standard?  Why is there no notation for character
constants such as PopLog provides?  Why is the error handling facility
all new, rather than resembling either IF/Prolog or M-Prolog?

I have tried, I really have tried, to arouse interest in the BSI work
here in the US.  Do you know what has got in the way?  As soon as I
show people any of the BSI documents (take the 'standardisation issues'
documents as an example) they say "what a pack of turkeys" and assure
me that there is nothing to worry about.

I remain desperately worried that there will be a BSI/ISO Prolog
standard, and that it will be as bad as the current drafts, and that it
will do a great deal of damage.  What *really* worries me is that the
people on the BSI committee don't seem to realise how bad it is.