Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!att!dptg!ulysses!andante!alice!ark From: ark@alice.UUCP (Andrew Koenig) Newsgroups: comp.object Subject: Re: code blocks (aka closures) Message-ID: <11002@alice.UUCP> Date: 1 Jul 90 17:31:19 GMT References: <12396@june.cs.washington.edu> <1112@carol.fwi.uva.nl> <5305@stpstn.UUCP> Organization: AT&T Bell Laboratories, Liberty Corner NJ Lines: 134 There are three problems in extending a language like C to support closures (Smalltalk blocks, Lisp lambda expressions in Lisps with lexical scoping, etc). The first problem is that the calling sequence has to be changed so that a function always has available the location of the appropriate stack frame of the lexically surrounding scope. For example (and I'm going to keep these examples in C rather than C++ to show that they are universally applicable), here's a function that applies a given function to each element of an integer array: void apply(int *a, int n, void f(int)) { int i; for (i = 0; i < n; i++) f(a[i]); } Before you flame me for writing `void f(int)' instead of `void (*f)(int),' note that that is an acceptable abbreviation in ANSI C. Now, let's use the `apply' function to add up the elements of an array: static int sum; static void add(int i) { sum += i; } int sigma(int *a, int n) { sum = 0; apply(a, n, add); return sum; } This solution is unsatisfying because it uses a global variable. Even though I can sort of hide that global variable by declaring it `static,' I'm sure you can see that its mere existence will cause trouble in more complicated contexts. For example, a similar program intended to add all the elements in a tree would have to do so by having the equivalent of the `add' function make recursive calls on the `sigma' function; this wouldn't work as shown here because sigma uses a static accumulator. What we really want is the ability to make `sum' local to `sigma,' which also requires making `add' local to `sigma:' int sigma(int *a, int n) { int sum = 0; void add(int i) { sum += i; } apply(a, n, add); return sum; } This is precisely the thing that most present C implementations cannot do. For the program above to work, it is necessary for `add' to be able to get at the corresponding stack frame for `sigma.' Moreover, there may be more than one stack frame in existence, so some kind of bookkeeping is necessary to determine the right one. This problem is well known, of course, and lots of languages have solved it, but Dennis Ritchie made a conscious decision not to impose the associated overhead on C and that decision has stuck. Anyway, suppose we had an extended C that allowed this kind of thing. The next step would be to realize that there is really no need to give a name to `add' at all -- in principle one could include it directly in the argument list for `apply:' int sigma(int *a, int n) { int sum = 0; apply(a, n, void(int i) { sum += i; }); return sum; } I am ignoring the likely parsing ambiguities that stem from this particular notation; I am sure they can be worked out and I have heard (though am not actually familiar with the details) about one C compiler that has implemented just this extension. The second problem is that when one has local functions, it is tempting to say that one should be allowed to return them as values of functions. For example, our `sigma' function is conceptually bound to the notion of addition and an initial value of zero. Suppose we made those both parameters? int reduce(int start, int accum(int, int))(int *, int) { int f(int *a, int n) { int ac = start; int i; for (i = 0; i < n; i++) acc = accum(acc, a[i]); return acc; } return f; } We now have a function called `reduce' that takes a starting value and an accumulation function and returns a function that operates on an integer array. One might use it this way: int (*sigma)(int*, int) = reduce(0, int(int x, int y) { return x + y; }); int (*pi)(int*, int) = reduce(1, int(int x, int y) { return x * y; }); Here, then we use the function `reduce' to create the two functions we subsequently refer to as sigma and pi. People who have seen this style or programming before probably realize how much expressive power there is to be had thereby. Unfortunately, implementing it requies powerful run-time mechanisms -- probably the equivalent of a general-purpose garbage collector -- and is therefore very hard to add to C because of C's unrestricted use of pointers. The third problem is that even if it were possible to implement these extensions to C with no problems, it wouldn't do me any good unless all my potential customers had access to those implementations. It doesn't matter how powerful a language is if my programs can't be used by the people who have to use them for them to be useful. If I am programming only for myself, I can use any tools I have at my disposal. But if I'm trying to write stuff for others to use, I must either use tools that are compatible with theirs or make my tools available to them. This problem, and the necessity for its solution, often far outweighs any technical argumens about the tools in question. -- --Andrew Koenig ark@europa.att.com