Xref: utzoo comp.lang.functional:131 comp.lang.prolog:2730 Path: utzoo!attcan!uunet!samsung!munnari.oz.au!bruce!goanna!ok From: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) Newsgroups: comp.lang.functional,comp.lang.prolog Subject: Re: Pattern matching considered harmful Message-ID: <3077@goanna.cs.rmit.oz.au> Date: 26 May 90 09:00:22 GMT References: <2584@skye.ed.ac.uk> Followup-To: comp.lang.prolog Organization: Comp Sci, RMIT, Melbourne, Australia Lines: 112 In article <2584@skye.ed.ac.uk>, jeff@aiai.ed.ac.uk (Jeff Dalton) writes: > More generally, pattern matching would be ok if, for a given structure > type (or Prolog term or whatever), it were used only in a small set of > basic operations. However, programmers all too often use pattern > matching throughout their programs and then, when they change the > number or order of fields in a structure, have to go around and change > all the patterns. Moreover, to read a pattern one often has to count > commas, which is error-prone to say the least. I am very much in agreement with this. As an example, suppose you have some data type which represents a finite function (a dictionary). For argument's sake, suppose you start out with the idea that a dictionary is empty_dictionary or non_empty_dictionary(Key,Val,Rest) Suppose you find that part of your program wants to find dictionary entries satisfying a test p(Key,Val). You might be tempted to write dictionary_p_entry(non_empty_dictionary(Key,Val,_), Key, Val) :- p(Key, Val). dictionary_p_entry(non_empty_dictionary(_,_,Rest), Key, Val) :- dictionary_p_entry(Rest, Key, Val). If you later notice that most of the program is looking up known keys, and decide to use AVL trees instead, you're stuck with a big change. But if you had written a "dictionary.pl" file defining the operations which know about the structure of a dictionary, you might have had a dictionary_member(Dictionary, Key, Val) predicate (which you would have had _anyway_ to look things up!) and you would just have written dictionary_p_entry(Dictionary, Key, Val) :- dictionary_member(Dictionary, Key, Val), p(Key, Val). Note that specifying the interface predicate dictionary_member/3 _logically_ pays off: we can use the same predicate to look for known keys or to enumerate keys, and have just _one_ predicate to remember. (How that works inside the package is solely the business of the "dictionary.pl" file.) It really is a good idea to isolate knowledge of the constructors of a data type in a single file. It needn't make your program inefficient; on the contrary, it is easier to get the design of a predicate like dictionary_member/3 _right_ and the implementation fast when you have to do it just once than to do it over and over again when you have basically the same code scattered all over the place. > This sort of thing happens all the time in Prolog and probably in > functional languages as well. Yes to both in general, no to both in the work of good programmers. > Some Prolog programmers have told me > that, of course, one could use a notation that involved named fields > and then expand (unfold) this into patterns. Few programmers actually > do this, however. Code to do it is in the book... If you have a Prolog which supports term_expansion/2 or a close analogue (C Prolog, Quintus Prolog, NU Prolog, ZYX Prolog, SICStus Prolog, possibly others. The mechanism in ALS Prolog is unfortunately placed. Prolog-2 hasn't got it at the moment, but if I am reading the manual right the basic machinery is in there _somewhere_ so they could add it) it is not too hard to write a drastically simplified analogue of Lisp's defstruct. No new notation is needed. One just follows the existing convention for selecting from a data structure: predicate(Selector argument(s), Whole structure, Part selected) For example, if you wanted a 'date' data type, you might use date(year, Date, Year) % Year = date$year(Date), &c date(month, Date, Month) date(day, Date, Day) In fact, if you are keeping the major operations on your data types in their own files (where the implementation of the type is legitimately visible) leaving field accesses as calls like this is unlikely to be a major factor in the cost of your program. > Given these problems, I wonder why pattern-matching seems to be > regarded, rather uncritically, as a Good Thing. Regard it critically and it is still a WONDERFUL thing. The problems are not specific to pattern matching. *ANY* "leakage" of information about the implementation of a type is going to cause *exactly* this kind of problem. To back this up, suppose we are implementing dates. Suppose that I decide to represent dates as the number of microseconds since the midnight before 1-Jan-1970, stored as a floating-point number. (IEEE, VAX, and /370 double precision floats have enough bits to do this for a useful range of years, and additions and subtractions will be exact.) Then a programmer might exploit his knowledge of this fact by using Date1 < Date2 % Date1 occurs before Date2 If I then try to port the program to a Prolog with 28-bit floats (such as C Prolog or SB Prolog) I'm going to have to change the representation to some kind of compound term. Boom! < doesn't like it any more. If I represent dates as timestamp(Year,Month,Day,Hour,Minute,Second,Millisecond,MicroSecond) I'll still be able to use @< to compare dates, but not < . > Abstraction is a wonderful thing, when it's used. Indeed and it is, but pattern matching isn't the only way of breaching it. I would point out that Hope and ML and some other functional languages have module systems in which pattern matching can be used quite safely; you can export a type from a module without exporting the constructors. It was always my intention to extend the Mycroft/O'Keefe type checker to enforce abstract data typing in just that way. Maybe I'll get around to it... -- "A 7th class of programs, correct in every way, is believed to exist by a few computer scientists. However, no example could be found to include here."