Path: utzoo!utgpu!watserv1!watmath!att!ima!esegue!compilers-sender From: stephen@estragon.uchicago.edu (stephen p spackman) Newsgroups: comp.compilers Subject: Re: Defining polymorphism vs. overloading Keywords: design, polymorphism Message-ID: <1990Sep03.003642.5250@esegue.segue.boston.ma.us> Date: 3 Sep 90 00:36:42 GMT References: <9008310419.AA06194@karakorum.berkeley.edu> Organization: Compilers Central Lines: 130 Approved: compilers@esegue.segue.boston.ma.us In-Reply-To: oliver@karakorum.berkeley.edu's message of 31 Aug 90 04:19:43 GMT In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) conjectures: o Overloading means using the same name (or symbol) to invoke different code depending on the types of the arguments (or operands). So, an example is +, which may do radically different things for integers, floats, and complex numbers. In other words, you have different chunks of code depending on the type. You may need to do some implicit casting, as in promoting an integer to a complex number, before you can actually perform the operation. o Polymorphism means using the same piece of code to operate on objects of different types. In LISP, cons is polymorphic because you can give it arguments of different types but it goes ahead and does its thing without worrying about it. I'd say that just about hits the nail on the head (-: in my humble idiolect :-). Overloading implies multiple implementation discriminating by type, while polymorphism implies multiple interpretation (of the same code). I don't feel, however, that we can allow the coercion of the arguments to make an overload apply as part of the overloading process; rather, any function that causes its arguments to be coerced is, if it is a proper property of the function, POLYMORPHIC in that respect (though it may also be overloaded) - because the coercion process is just a (trivial) mechanism to allow the same implementation text to manipulate arguments of different types. On the other hand, it should also be borne in mind that one (and to my mind, the neatest) model of coercion is that it is itself the application of an OVERLOADED function with a (textually) null name {footnote: We have to resolve an ambiguity here, of course: between any two symbols, how many null names did we read? Workable solutions, at different points along the syntax/semantics scale, include requiring (a) that all coercion paths commute (this is so semantic that you'd want to implement it by restricting the forms or the directions of the coercions), (b) that all coercion paths, including T->T, have length 1, or (c), and most subtly, that coercion paths of length 1 are preferred to coercion paths of any other length (if we include zero we can work implicit normalisers into this framework), and that any ambiguity not resolved by this rule is forbidden. Is this approach (a) stupid, (b) obvious, (c) known or (d) in need of writing up someplace, does anyone know?}. So: o Overloading is type-resolved multiple implementation. o Polymorphism is type-resolved multiple interpretation. o Coercion is, in effect, polymorphism implemented though CALL-SIDE normalisation by a distinguished overloaded function (unlike the other two, of course, this can't affect the result type, though in the other cases it need not, either). o There is also a fourth mechanism, dependent typing; this will allow us either to build explicitly type-parameterised functions (if we have dependent maps (ITT PI types)) or dynamically typed systems (if we have dependent products (SIGMA types)). {footnote: Going one step further and giving types properties, we get late-resolving object-orientation a la Smalltalk (which is NOT polymorphism because Smalltalk objects, since they "know" their class dynamically, only have a single type IMHI).} Naively, there is no ambiguity at all; naive polymorphism would be compiled by macro expansion (implying that C's .[.] operator is PRECISELY polymorphic - it is, as I recall, surface syntax for *(.+.)), while naive overloading can be seen as including parts of the type signature in an object's name. That's pretty much the Ada model for both (I'm sure I'll be corrected if I misremember me). Things *can* get murky though. There are obviously other ways of getting multiple implementations; any clausal-equational definition style will give you this;- 0! = 1; (n : N)' ! = n' * n!; What this implies is that as soon as types acquire value-lke properties (such as an however-indirectly user-accessible type comparison), the distinction rapidly dissolves; f (x : (X : TYPE)) = .if X = T .then tf x .else uf x . provides only one body, but how different is it from f (x : T) = tf x; f (x : U) = uf x; ? In case user-accessible type comparison seems far-fetched, incidentally, note that the rather (IMHO) mathematically ill-considered unions of Algol68 relied on them, and that any language with non-injective coercions induces (an admittedly not very usable) one. Maybe I should also comment that from a type-inference point of view, polymorphism is a little more straightforward, since it only implies that we have some kind of quantification over types, while overloading implies that we can discriminate them (a much stronger requirement, especially for those of us uncomfortable with the everything-is-a-set outlook). But from an implementation (particularly a compilation) standpoint, overloads are more static and therefore nicer: you get less ucky descriptors and quasi-interpretive code. In both domains the distinction is pretty important from a TECHNICAL point of view, even though it can be difficult-to-impossible for the user of a function to tell which he(gender apology, but syntax requires)'s got. Finally, we need to observe a curious fact: that while we can characterise langauges according to which of polymorphism and overloading they provide to the user, you would not normally have user-polymorphism without overloading in the implementation. The reason for this is that any polymorphic text is going to have to resolve to something executable, and unless you are going to construct everything from combinators and operations manipulating objects with homogeneous underlying representations (such as pointers), this implies the existence of illative atoms (like, say, +) with overloaded implementations. (-: Having just taken a Linguistics class from Lakoff, I could explain all this away with a lot of learned talk about radial categories, and prototypical cases, and so forth ... but I'm really curious. What do YOU think the distinction is? Or isn't there one? Do you use these terms consistently? I'd personally be interested and amused to hear you do just that; but maybe this isn't the forum for it...? :-) stephen p spackman stephen@estragon.uchicago.edu -- Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.