Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!seismo!mcvax!lambert From: lambert@mcvax.UUCP Newsgroups: comp.lang.misc Subject: Re: Static typing with Dynamic binding Message-ID: <7280@boring.mcvax.cwi.nl> Date: Wed, 25-Feb-87 01:06:42 EST Article-I.D.: boring.7280 Posted: Wed Feb 25 01:06:42 1987 Date-Received: Fri, 27-Feb-87 21:07:15 EST References: <364@oracle.tc.fluke.COM> <814@unc.unc.UUCP> Reply-To: lambert@boring.UUCP (Lambert Meertens) Distribution: comp.lang.misc Organization: CWI, Amsterdam Lines: 142 In article <1075@tekchips.TEK.COM> willc@tekchips.UUCP (Will Clinger) writes: > For example, here is a legal DSP program: > > [...] > if flag then glumpf := foo(36) else glumpf := bar(37) > [...] > If the definition of a2 is changed to [...] > x := '49 * 37 = '; /* if Pascal had strings */ > then the program ought to be legal but, according to Moen's rules for > DSP, it isn't. [...] Moen's rules do not quite require that all > variables named x have the same type, but they come close. This shows, clearer than previous postings, a source of confusion in the dynamic/static type checking debate. (At the end of this article I point out some more sources of confusion.) A requirement is imposed here on static type checking that is more stringent than is usually imposed on it in the context of statically bound languages. Consider the following "Theorem": static type checking is impossible in ALL powerful languages (e.g. Pascal). "Proof": Consider begin x: integer; if then x := 123 else x := 'Guvf pnaabg unccra'; print(x) end. Since x is declared integer, it is illegal to assign a string to x. But the code assigning the string will never be reached, so no illegal situation will arise. So this is type-correct. A static type checker that can decide this must be able to solve arbitrary complex yes/no problems, which by Goedel is impossible. The problem with this is that we happily implement static type checking under the assumption that all conditions can return both true and false, that generators in for-loops may always generate at least one value, etc. This imposes some restriction, but I do not see why this *by itself* is an undue restriction. This can only be judged in the full context of a language. Think of procedure calls as being statically macro expanded to some level N. (We need a cut off because of recursion. I am ignoring the problem of procedure variables for the sake of simplicity.) For dynamically bound languages we can almost literally expand; for statically bound languages we have to "shield" the bodies first to prevent naming conflicts. Static type-correctness can then be formulated as: no name (or literal) is used inconsistently within a single scope, however large N is chosen. It depends on details of the language what is to be understood to constitute inconsistency. Some languages (e.g. ALGOL 68) have the property that this check is decidable, others (e.g. ALGOL 60) that it is undecidable (as shown by Langmaack). But if ALGOL 60 had dynamic binding, the check would become decidable. What about that! [>> = <624@watcgl.UUCP> kdmoen@watcgl.UUCP (Doug Moen)] >> .... If you tried to write a real program in DSP, >> then you would find that most procedures and functions would end up >> with rather long import lists. > > This would imply that most variables would indeed fall under the > restriction that variables of the same name must have the same type. > >> .... But that's okay, because DSP exists >> solely to prove that it is possible for a statically typed, dynamically >> scoped language to exist. > > Fair enough. I think it should also be clear why such a language is > undesirable. If "such a language" is intended to be more general than just DSP (Dynamically Scoped Pascal), then the conclusion is unwarranted. DSP was invented on the spot by Doug Moen as a hacked variation on Pascal just to show that dynamic binding and static checking are not incompatible per se. A carefully designed language need not have the same problem. Further sources of confusion: On the nature of types. I assume that values such as can arise at run- time have a well-understood type. Call the set of all such types the Vtypes. A type-checking algorithm will try to assign static types to expressions, in particular names. Call these the Stypes. These Stypes are not necessarily taken from the set of Vtypes. For example, in ALGOL 68 the Stypes contain union-types, but a Vtype is never a union-type but simply the actual variant from the union. In polymorphic languages (ML, Miranda, ABC, ...) an Stype is a polymorphic type (containing type variables) that can be instantiated to a Vtype. A good way of thinking about this is to view an Stype as representing a SET of Vtypes. In many languages each Stype gives a singleton set so that the Stypes can be identified with Vtypes; hence the confusion to think that this is necessarily the case. In ALGOL 68, an Stype can represent a set of Vtypes if it is a union-type, but always a finite set. In polymorphic languages, Stypes can model infinite sets (e.g., LIST(*) --> {LIST(number), LIST(string), LIST(LIST(boolean)), ... }.) We cannot compute with infinite sets, so we need a finite representation. This representation must be such that if e.g. @ is an operation of the language and applying @ to a value of type t gives a value of type f(t), then for each Stype T the set {f(t) | t in the types represented by T}, if all these f(t)'s are defined (otherwise we have a type conflict) must also be representable as an Stype. The presence of type-revealing declarations can dramatically simplify static type checking, but is in no way essential. If a decision procedure for type correctness exists for a language with declarations, there also exists an algorithm that will take a source text with typing information from declarations omitted and that will output a type-correct source text with this information restored if that is possible, and otherwise terminate with a message "program cannot be typed". A final confusion I noticed is the tacit assumption that a type-checking algorithm can only assign types to variables. In general, such an algorithm may assign Stypes to all "names", including functions, operation symbols, and for Smalltalk possibly even messages (not that I see a necessity for this at the moment). A conceivable Stype for Smalltalk might be something like: Takes a message "foo" with an object x, but requires that x understands the message "bar". Such Stypes can be computed in principle without requiring declarations, import statements or other additions to the text. (These Stypes can be mutually recursive, in which case it is best to represent them somehow as nodes in a directed graph.) An object of this type will accept "foo" messages with many different Vtype type of x's, as long as they understand "bar". It may happen that for some x's during actual execution the code sending "bar" to them is not reached. This is the same as with the unreachable x := 'Guvf pnaabg unccra'. If some programmers think this imposes an undue restriction on their coding practice, I would like to see an example of reasonably understandable and maintainable code where a weaker form of typing is helpful. Even if there are examples, which I frankly speaking doubt, the merits have to be weighed against the substantive merits of some form of type checking before execution, which if strong enough will tell you about most of the silly errors you made. -- Lambert Meertens, CWI, Amsterdam; lambert@cwi.nl