Path: utzoo!mnetor!uunet!mcvax!ukc!eagle!icdoc!ivax!cdsm
From: cdsm@ivax.doc.ic.ac.uk (Chris Moss)
Newsgroups: comp.lang.prolog
Subject: Re: behavior of read/get0 at end_of_file
Message-ID: <243@gould.doc.ic.ac.uk>
Date: 25 Mar 88 13:16:24 GMT
References: <608> <1197@kulcs.kulcs.uucp> <783@cresswell.quintus.UUCP> <518@ecrcvax.UUCP> <801@cresswell.quintus.UUCP>
Sender: news@doc.ic.ac.uk
Reply-To: cdsm@doc.ic.ac.uk (Chris Moss)
Organization: Dept. of Computing, Imperial College, London, UK.
Lines: 99
Keywords: get0 read end_of_file

In article <801@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
rok>I find it extremely odd to call a string of length one a character.
rok> ... But it becomes rather more
rok>clumsy on the D machines, which have a 16-bit character set. (Can you say
rok>"Kanji"? I knew you could.)

Yes, the BSI committee is just beginning to face up to this problem, as
the Japanese have just started taking an interest...
As Richard points out, it's not much of a problem for a character based
definition, which I personally would favour.

rok>The fail-at-end approach forces us not only to do something special
rok>with the get0/1 in rest_identifier/3, but in everything that calls it.
rok>(In the Prolog tokeniser, there are two such callers.)
rok>
rok>The point is that if-then-elses such as Meier suggests start
rok>appearing all over the place like maggots in a corpse if you adopt
rok>the fail-at-end approach, to the point of obscuring the underlying
rok>automaton.

I think this is a fair point when looking at the definition of lexical
analysers, however...

mmeier> I must say, none of the two seems to me satisfactory. Richard's
mm> version is not portable due to the -1 as eof character.

A character definition which included a (special) end-of-file token would
be better.

mm> BSI objects that if [read/1] returns e.g. the atom end_of_file
mm> then any occurrence of this atom in the source file
mm> could not be distinguished from a real end of file.
rok>
rok>That's not a bug, it's a feature! I'm serious about that.

I don't think that is any better than most uses of that particular argument.
Sure, if you learn to live with it you can find uses for it.

rok>Before taking end_of_file away from me, the BSI committee should supply
rok>me with a portable way of exiting a break level and a reliable method of
rok>leaving test cases in a file without having them always read.

And this is the death of any standardization process!

I have yet to find the document that Richard referred to (a few days ago)
when he claimed that the BSI's mandate was to standardize Edinburgh Prolog.
It certainly hasn't been repeated in all the other formal presentations that
have been made to BSI or ISO. But if one has to follow every wrinkle of an
implementation just because it represents (arguably) the most popular
dialect, then why don't we just appoint IBM to write all our standards for
us (or Quintus or ...)?

[And who is the TRUE inheritor of the title "Edinburgh Prolog" anyway? Is it
the commercial product (formerly NIP) now being sold under that title?]

To return to the argument, I think there's a significant difference between
get0 and read. Having an end-of-file marker for read is (almost) never used
to implement finite-state-machines. Instead it is used for repeat-fail loops.
e.g.
    go :- repeat, read(Term), (Term=end_of_file -> true; process(Term), fail).

Now in the days before tail recursion and all the other optimizations this
was inevitable. But why should we encourage this approach today?

The above clause is a good example of the trickiness of "repeat". I always
write repeat loops wrong first time and this was no exception. I put
    (Term=end_of_file -> true; process(Term)), fail.
then changed it to
    (Term=end_of_file -> !; process(Term)), fail.
before settling on the above version.

I personally think "repeat" should be left out of the standard (there's no
penalty overhead in not having it built-in these days anyway).
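
For comparison, a tail-recursive version of the loop above might look
something like the following. This is only a sketch: process_all/1,
process_all/2 and the body of process/1 are made-up names for illustration,
and it reads from an explicit stream with read/2 rather than from the
current input as the repeat version does.

```prolog
% Hypothetical names throughout; a sketch, not a definitive version.
% Read terms from Stream until read/2 returns end_of_file,
% using tail recursion instead of a repeat/fail loop.
process_all(Stream) :-
    read(Stream, Term),
    process_all(Term, Stream).

% Stop when the end-of-file atom is seen; otherwise handle the
% term and carry on with the rest of the stream.
process_all(end_of_file, _) :- !.
process_all(Term, Stream) :-
    process(Term),
    process_all(Stream).

% Placeholder for whatever work is done per term.
process(Term) :-
    write(Term), nl.
```

One difference in behaviour is worth noting: if process/1 fails on some term,
this loop fails as a whole, whereas the repeat version simply backtracks and
reads the next term.
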
Don't other people have my problem?

It would seem to encourage better programming if we allowed "get0" (or
get_file or whatever) to return an end-of-file token, and any high-level
routines to fail at end-of-file. It's not particularly consistent, but I
don't know whether that's a priority in this case.

rok>In fact, on my SUN right now I have function key F5 bound to
rok>"end_of_file.\n" so that I can get out of Prolog without running the
rok>risk of typing too many of them and logging out.

I seem to get by perfectly well by setting "ignoreeof" in my cshell!

rok>Ah, you'll say, but that's what nested comments are for!
rok>Well no, they don't work. That's right, "#| ... |#" is NOT a reliable
rok>way of commenting code out in Common Lisp, and "/* ... */" is NOT a
rok>reliable way of commenting code out in PopLog.

That seems to be the best argument for allowing end-of-line comments in
Prolog. Now where do I find the Emacs macro for commenting out all lines
between dot and mark (and removing such comments)?

Chris Moss

Disclaimer: unless I say otherwise I am expressing my personal opinions
            NOT the opinions of any committee!