Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!decwrl!deccrl!news.crl.dec.com!shlump.nac.dec.com!decuac!haven!ni.umd.edu!uc780.umd.edu!cs450a03 From: cs450a03@uc780.umd.edu Newsgroups: comp.lang.misc Subject: RE: Dynamic typing (part 3) Message-ID: <1APR91.23564447@uc780.umd.edu> Date: 1 Apr 91 23:56:44 GMT References: <1211@optima.cs.arizona.edu> <3073:Mar2820:38:5191@kramden.acf.nyu.edu> <1991Apr1.010526.26781@neon.Stanford.EDU> Sender: usenet@ni.umd.edu (USENET News System) Organization: The University of Maryland University College Lines: 140 Nntp-Posting-Host: uc780.umd.edu Caution: about 40 lines of quoted material Hans Boehm > [theorist of types and other maths] Brian Murphy >> [your basic dynamic typing advocate] Dan Bernstein >>> [bit muncher & tty basher extraordinarie] and me, your basic wild-eyed pragmatist >>>The problem with this argument is that neither model is more >>>general than the other, or simpler, or much more expressive. Given >>>a statically typed language with structs, an ``all'' type, and the >>>set of types, you can set up pairs (value,type) and poof! there's >>>dynamic typing. Given a dynamically typed language with assertions, >>>you can assert that a value has some type and poof! there's static >>>typing. [then mentions efficiency of static implementations] Ok, poof, no argument. And I almost hesitate to mention that static language implentations have an efficiency advantage (as they better model things like machine limits) while dynamic language implementations have an efficiency advantage (they better model human thought mechanisms as exemplified by thousands of years of mathematical development). >>... your statically-typed language _requires_ type declarations, >>whereas in a dynamically-typed language we can get by without them. er... which is an example of what I was trying to say. > The main remaining problem then is that there are a huge number of >different formulations of the type inference problem, which vary >greatly in the programs to which they can assign types, especially if >those programs are library functions for which we have incomplete >information about the calling context. Most systems for assigning >types to expressions in dynamically typed languages seem to assign >basically simple types. ... er... simplicity is often an advantage. >For statically typed languages without subtyping, I know of the >following type inference problems that have been studied: > [ 1. (simple) ... 6 (not simple) ] >My impression is that the type inference procedures usually proposed >for dynamically typed languages usually are near the weak end of this >scale. If they were near the strong end, they would effectively have >something similar to a static type system, with nontrivial programmer >supplied type assertions. > >This leaves some interesting questions I can't answer: > >1. How much performance do you lose by performing type inference >without a notion of polymorphic types? This is probably environment >dependent ... environment and application dependent, otherwise you have no metric for performance. Worse, it is rather impractical to rigidly define "without a notion" (not impossible, mind you). >2. Why do most of us program in statically typed languages that are >so near the weak end of the above scale that this is all an academic >excercise? Ada and C++ are barely at level 2, if you stretch things, >and they do require lots of explicit type information. Caution: Rampant assertions follow... In a word, expressiveness. Typing, is very similar to side-effect driven programming, and thus difficult to manage if complex. The advantage of static typing in programming comes from the ability of the typing system to reflect and model machine limits. To take typing beyond this generally means an attempt to characterize computer routines as relations, and to re-define them in terms of abstract domains. In other words, strong typing attempts to make you write each computer "function" twice: once in terms of step-by-step instructions as to what needs to be done, then again in terms of step-by-step instructions as to what it is allowed to do. It is simpler, I believe, to just write the thing once. Or rather, I should say it's simpler for the programmer. The closer you can get to machine language, the simpler things are for the computer. But as has been observed many, many times: 10% of the code takes 90% of the time to execute (or similar statistics). Assuming that programmer time is limited, it makes sense to profile your code and spend most of the programming effort on that 10% CPU hog. (Which, if I understand aright, is what Dan Bernstein is doing: working on optimizing some problems which are CPU hogs on current machines.) Now, it probably some of the topologies that are being developed to describe type systems will have applications which are quite useful. On the other hand, there are some very rich bodies of mathematics which have been developed without these formalisms. Which is to say, "typing can be used to describe a programming" does not mean "typing must be used to describe a program". More specific to programming, it is often more useful to have miscelaneous information about a function (does argument order matter? is it pure computation or does it deal with external objects? does this function apply to any representation of number for which we have a math library, or only a specific word size and specific coding? ...) than it is to concentrate on a completely general description of the function. Finally, note that describing functions as static objects is an outgrowth of a couple very good assembly language practices: (1) describe what's allocated where, and (2) keep the code constant so you can trust it. On the other hand, I find myself more comfortable with a language which allows me to "build up" a function from component functions and utilities than I do with a language which wants me to write as if I were allocating raw memory. In other words, if I have some general function f, it is nice to be able to use it, in all its gory inefficiency and then, once it is tested, make a function g which is f restricted to some domain chosen to reflect properties of the specific problem and the machine I'm working on. I suppose this is partially personal preference, but note that the 90/10 "rule" implies this is a good approach. Also note that this system tends to favor simple types. Function domain is, of course, part of the function definition, but there is a difference between the overall limits a function has for correctness, and the specific limits imposed on a function for machine efficiency. The typing system built into each function, for correctness, is (in general) unique to the definition of that function. In practice, this means you can even define functions which have no domain (always give a domain error on closure), and other odd things like that. Note that this typing usually becomes less complicated when you restrict the problem to a specific machine domain (e.g. 32 bit signed integers). I'm leaving out some key issues (such as the ability of a language to deal with structured data, which is something type systems often try to address), but I think you'll find this rule applies in those areas too: The more direct you are, the simpler the problem. Raul Rockwell