Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!decwrl!deccrl!news.crl.dec.com!shlump.nac.dec.com!decuac!haven!ni.umd.edu!uc780.umd.edu!cs450a03
From: cs450a03@uc780.umd.edu
Newsgroups: comp.lang.misc
Subject: RE: Dynamic typing (part 3)
Message-ID: <1APR91.23564447@uc780.umd.edu>
Date: 1 Apr 91 23:56:44 GMT
References: <1211@optima.cs.arizona.edu> <3073:Mar2820:38:5191@kramden.acf.nyu.edu> <1991Apr1.010526.26781@neon.Stanford.EDU> <boehm.670539603@siria>
Sender: usenet@ni.umd.edu (USENET News System)
Organization: The University of Maryland University College
Lines: 140
Nntp-Posting-Host: uc780.umd.edu

Caution: about 40 lines of quoted material

Hans Boehm    >     [theorist of types and other maths]
Brian Murphy  >>    [your basic dynamic typing advocate]
Dan Bernstein >>>   [bit muncher & tty basher extraordinarie]
and me, your basic wild-eyed pragmatist
>>>The problem with this argument is that neither model is more
>>>general than the other, or simpler, or much more expressive. Given
>>>a statically typed language with structs, an ``all'' type, and the
>>>set of types, you can set up pairs (value,type) and poof! there's
>>>dynamic typing. Given a dynamically typed language with assertions,
>>>you can assert that a value has some type and poof! there's static
>>>typing.  [then mentions efficiency of static implementations]

Ok, poof, no argument.  And I almost hesitate to mention that static
language implentations have an efficiency advantage (as they better
model things like machine limits) while dynamic language
implementations have an efficiency advantage (they better model human
thought mechanisms as exemplified by thousands of years of
mathematical development).

>>... your statically-typed language _requires_ type declarations,
>>whereas in a dynamically-typed language we can get by without them.

er... which is an example of what I was trying to say.

> The main remaining problem then is that there are a huge number of
>different formulations of the type inference problem, which vary
>greatly in the programs to which they can assign types, especially if
>those programs are library functions for which we have incomplete
>information about the calling context.  Most systems for assigning
>types to expressions in dynamically typed languages seem to assign
>basically simple types. ...

er... simplicity is often an advantage.  

>For statically typed languages without subtyping, I know of the
>following type inference problems that have been studied:
>  [ 1. (simple) ... 6 (not simple) ]
>My impression is that the type inference procedures usually proposed
>for dynamically typed languages usually are near the weak end of this
>scale.  If they were near the strong end, they would effectively have
>something similar to a static type system, with nontrivial programmer
>supplied type assertions.
>
>This leaves some interesting questions I can't answer:
> 
>1. How much performance do you lose by performing type inference
>without a notion of polymorphic types?  This is probably environment
>dependent ...

environment and application dependent, otherwise you have no metric
for performance.  Worse, it is rather impractical to rigidly define
"without a notion" (not impossible, mind you).

>2. Why do most of us program in statically typed languages that are
>so near the weak end of the above scale that this is all an academic
>excercise?  Ada and C++ are barely at level 2, if you stretch things,
>and they do require lots of explicit type information.

Caution: Rampant assertions follow...

In a word, expressiveness.  Typing, is very similar to side-effect
driven programming, and thus difficult to manage if complex.  The
advantage of static typing in programming comes from the ability of
the typing system to reflect and model machine limits.  To take typing
beyond this generally means an attempt to characterize computer
routines as relations, and to re-define them in terms of abstract
domains.

In other words, strong typing attempts to make you write each computer
"function" twice: once in terms of step-by-step instructions as to
what needs to be done, then again in terms of step-by-step
instructions as to what it is allowed to do.

It is simpler, I believe, to just write the thing once.

Or rather, I should say it's simpler for the programmer.  The closer
you can get to machine language, the simpler things are for the
computer.  But as has been observed many, many times:  10% of the code
takes 90% of the time to execute (or similar statistics).

Assuming that programmer time is limited, it makes sense to profile
your code and spend most of the programming effort on that 10% CPU
hog.  (Which, if I understand aright, is what Dan Bernstein is doing:
working on optimizing some problems which are CPU hogs on current
machines.)

Now, it probably some of the topologies that are being developed to
describe type systems will have applications which are quite useful.
On the other hand, there are some very rich bodies of mathematics
which have been developed without these formalisms.  Which is to say,
"typing can be used to describe a programming" does not mean "typing
must be used to describe a program".

More specific to programming, it is often more useful to have
miscelaneous information about a function (does argument order matter?
is it pure computation or does it deal with external objects? does
this function apply to any representation of number for which we have
a math library, or only a specific word size and specific coding? ...)
than it is to concentrate on a completely general description of the
function.

Finally, note that describing functions as static objects is an
outgrowth of a couple very good assembly language practices: 
(1) describe what's allocated where, and (2) keep the code constant so
you can trust it.

On the other hand, I find myself more comfortable with a language
which allows me to "build up" a function from component functions and
utilities than I do with a language which wants me to write as if I
were allocating raw memory.  In other words, if I have some general
function f, it is nice to be able to use it, in all its gory
inefficiency and then, once it is tested, make a function g which is f
restricted to some domain chosen to reflect properties of the specific
problem and the machine I'm working on.

I suppose this is partially personal preference, but note that the
90/10 "rule" implies this is a good approach.

Also note that this system tends to favor simple types.  Function
domain is, of course, part of the function definition, but there is a
difference between the overall limits a function has for correctness,
and the specific limits imposed on a function for machine efficiency.

The typing system built into each function, for correctness, is (in
general) unique to the definition of that function.  In practice, this
means you can even define functions which have no domain (always give
a domain error on closure), and other odd things like that.  Note that
this typing usually becomes less complicated when you restrict the
problem to a specific machine domain (e.g. 32 bit signed integers).

I'm leaving out some key issues (such as the ability of a language to
deal with structured data, which is something type systems often try
to address), but I think you'll find this rule applies in those areas
too:

    The more direct you are, the simpler the problem.

Raul Rockwell