Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!ucsd!ucbvax!agate!bionet!parc!boehm From: boehm@parc.xerox.com (Hans Boehm) Newsgroups: comp.lang.misc Subject: Re: Dynamic typing (part 3) Keywords: type inference static dynamic lisp ml Message-ID: Date: 1 Apr 91 21:00:03 GMT References: <1211@optima.cs.arizona.edu> <3073:Mar2820:38:5191@kramden.acf.nyu.edu> <1991Apr1.010526.26781@neon.Stanford.EDU> Sender: news@parc.xerox.com Organization: Xerox PARC Lines: 82 brm@neon.Stanford.EDU (Brian R. Murphy) writes: >In article <3073:Mar2820:38:5191@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >>The problem with this argument is that neither model is more general >>than the other, or simpler, or much more expressive. Given a statically >>typed language with structs, an ``all'' type, and the set of types, you >>can set up pairs (value,type) and poof! there's dynamic typing. Given a >>dynamically typed language with assertions, you can assert that a value >>has some type and poof! there's static typing. A good preprocessor will >>smooth out the differences either way; so, all else being equal, you >>might as well choose the model that produces better code, i.e., static >>typing (at least with current technology). >Unfortunately, given a statically-typed language with higher-order >functions and an "all" type, type inference appears to be >undecideable. Thus your statically-typed language _requires_ type >declarations, whereas in a dynamically-typed language we can get by >without them. I think this is getting into an area where we need a bit more precision. The argument implied by Brian seems to be that with a dynamically typed language and some automatic type inference, we can get similar performance to a statically typed language. I think we are no longer addressing reliability issues. The main remaining problem then is that there are a huge number of different formulations of the type inference problem, which vary greatly in the programs to which they can assign types, especially if those programs are library functions for which we have incomplete information about the calling context. Most systems for assigning types to expressions in dynamically typed languages seem to assign basically simple types. They might determine that 3+5 has type integer, but they won't determine that the identity function has type T -> T, for any T. (It sounds like the reference given by Brian is an exception, but it appears to have other problems.) In more practical terms, if list operations aren't built in to the language, and their implementation is in a library, unavailable for reanalysis by the type checker, then I have no way to determine that head(cons(1,NIL)) has type integer. (This is of course exactly the same problem that C++ s type system has with void *; I lose information that's easily statically available. You have no way of keeping track of the fact that two void *'s are the same type.) For statically typed languages without subtyping, I know of the following type inference problems that have been studied: 1. Only simple (Pascal-like) types are assigned. Type inference is easily possible. 2. ML-style polymorphic types are assigned. (The identity function gets a reasonable type. The function that applies its argument to itself (lambda x . x x) is not typable, even though this function can reasonably be applied to the identity function.) Type inference is decidable (and done in ML.) 3. ML modified to allow "polymorphic" functions to be used on the right side of their own (recursive) definitions. Type inference is undecidable. 4. The Girard-Reynolds lambda-calculus. Functions can be parametrized with respect to polymorphic functions. Lambda x . x x can be assigned a type. Type inference is very sensitive to the problem statement. It's either known to be undecidable or an open problem, depending on the exact statement. But there are possible compromises involving smallish amounts of explicit type information. 5. Any of the above with subtyping. I'm not up on all the results here. 6. Some of the above with certain recursive types getting automatically inferred. My impression is that the type inference procedures usually proposed for dynamically typed languages usually are near the weak end of this scale. If they were near the strong end, they would effectively have something similar to a static type system, with nontrivial programmer supplied type assertions. This leaves some interesting questions I can't answer: 1. How much performance do you lose by performing type inference without a notion of polymorphic types? This is probably environment dependent ... 2. Why do most of us program in statically typed languages that are so near the weak end of the above scale that this is all an academic excercise? Ada and C++ are barely at level 2, if you stretch things, and they do require lots of explicit type information. Hans