Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.1 6/24/83; site csd1.UUCP Path: utzoo!linus!decvax!harpo!gummo!whuxlb!floyd!cmcl2!csd1!condict From: condict@csd1.UUCP (Michael Condict) Newsgroups: net.lang Subject: Back to I/O operations in programming languages Message-ID: <114@csd1.UUCP> Date: Wed, 31-Aug-83 12:37:35 EDT Article-I.D.: csd1.114 Posted: Wed Aug 31 12:37:35 1983 Date-Received: Thu, 1-Sep-83 07:06:48 EDT References: <425@houxq.UUCP> Organization: New York University Lines: 108 Thank you Hal, for a note that addresses the original question of whether it is desirable to merge files with variables. (I'm sorry, everybody, that it took me so long to understand that C doesn't have I/O. I won't ever say it does again, I promise. I understand the difference now between the Language and the Library Routines, may they never be confused. Amen. I still don't understand why ``main(){printf("abc");}'' is in the set of strings that, when supplied to any C processor, produce a predictable result, whereas ``main(){glerp("abc");}'' is not, since both strings - or neither? - are in the C Language. But don't you worry, I'll figure it out; I'm pretty smart.) Hal points out a serious difficulty with the original proposal, which he actually considered installing in Pascal 8000, namely the adverse impact it would have on the efficiency of access to variables that are in main storage. I certainly agree that we don't want to do anything that would impinge on the efficiency of non-I/O operations. For those who missed the note, the point was that when compiling a procedure that takes an array parameter, and especially when compiling it separately from the rest of the program, we don't know whether it will be passed a main-storage array or an array that is really a file. So apparently we have to code up this infor- mation with the actual parameter and check it with every access of the parameter inside the procedure. I can think of one cheater's way out of this. Simply declare that the "slow" attribute (which says that a variable is to be stored on a file) makes a var incompatible with non-slow vars that are otherwise declared identically, at least for the purposes of actual/formal parameter compatibility. This is in fact exactly what is done for the "packed" attribute, as I recall. Correct me if I'm wrong, Hal, but isn't it true that a packed array may not be passed to a procedure that expects an unpacked array? Or is it that an element of a packed array may not be passed? In any case, there is precedent for this approach, and it doesn't by any means give up all the advantages of the original idea. We can still compare or copy any two files with a single operation instead of a loop, we can still declare files to have a structure that is appropriate to the problem and we can change our program to have it stop using a temporary file (when we move to a virtual memory machine or one with a larger address space) by removing a single word from a variable declaration. In fact the only thing we lose is the ability to read an entire file into a main-storage variable with one assignment statement. And this restriction need only be imposed across procedure boundaries because, in other contexts, when we encounter the statement A := B;, we know whether or not one or both of A and B are "slow" variables and can generate the appropriate code. It is clear that when neither one is "slow" we can generate the exact same code we do now, so the presence of this possibility does not adversely affect the efficiency of unrelated parts of the language. Another issue I would like to see discussed is how to do I/O in a language that is totally applicative (or functional, if you prefer to restrict your applicative programs to second-order combinators, as Backus proposes). Has anyone written a beautiful, utterly applicative program in LISP only to discover that, in order to get the input into it or the results printed it is necessary to call READ or PRINT, procedures with side effects? Must we resort to the heavy artillery of lazy evaluation (with its implications for the entire language) in order to achieve the relatively modest goal of applicative I/O? To phrase the question in concrete terms, here is what I want in LISP: I want there to be (at least) two predefined variables, INPUT and OUTPUT that are bound to standard input and output. These are, in the simplest case, ordinary list variables to the naked eye. However, because they are bound to the terminal, accessing elements of these lists may cause I/O to happen. Specifically, whenever CAR/CDR combinations are used to access an element of the INPUT variable that has not yet been accessed (is a virgin, let us say) then this element, preceded by any virgin elements before it must immediately be read from standard input. By this means, the program sees input as consisting of a pre-existing sequence of LISP values, not unlike the case in PASCAL. Totally applicative programs that input values are now possible. The problem is how to avoid slowing down the execution of CAR and CDR for values other than the input list. For the OUTPUT variable, things are more complicated. I cannot see, in detail, a reasonable way to associate actual output with this variable. Let me just say that whenever execution of the (applicative) program reaches a point where we know the first N elements of the list that is to be the value of this variable, we should make sure that all N of these elements are immediately output, if they have not yet been. The problems with this statement are several. First, what if our applicative language has non-applicative features (so that "real" programmers will use it). Then, presumably, the value of the OUTPUT variable, after being almost completely constructed, can be changed radically using, say, RPLACA and RPLACD. So when do we know the final value of the first N elements of OUTPUT? Must we wait till the end of the program before outputting anything? The second problem is more fundamental. Even if the language is totally applicative, what is the nature and scope of the OUTPUT variable? If it is global, we cannot change or define its value by any function calls, and if it is a local variable of some function or some construct of the form let OUTPUT be x in e(OUTPUT) then which function or construct does it belong to and why? We could say that whenever a variable with the name OUTPUT is bound anywhere in the program, that value is immediately printed, but this is the SNOBOL approach to output, is incompatible with the spirit of applicative programming and is not analogous to the treatment proposed for the INPUT variable. Another approach is to be dogmatic and say that an applicative program consists entirely of the evaluation of a (large) expression, so clearly, the only output consists of printing the value of this expression. This is not practical at all for programs that must interact with a user, unless we output portions of the value of the expression while it is still being computed. Even so, it does not address the question of how to interface applicative programs to the real world, in which the execution of a program might be required to change the contents of several pre-existing files, or make new ones. For these cases, I think we need a way of connecting variables to files, but the same problems described above for the OUTPUT variable apply. Surely you applicativists out there have some way out of this mess, don't you? Michael Condict ...!cmcl2!csd1!condict Courant Inst., N.Y.U.