Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!ucsd!sdcc6!ir230
From: ir230@sdcc6.ucsd.edu (john wavrik)
Newsgroups: comp.lang.forth
Subject: Essence of Forth
Message-ID: <7414@sdcc6.ucsd.edu>
Date: 15 Feb 90 07:59:23 GMT
Organization: University of California, San Diego
Lines: 293

Remark: This thread was originally started by Doug Philips with questions about why the Forth community does not regard Postscript as Forth. Philips now says that his intent was to clarify his understanding of "the Essence of Forth" -- I am renaming the thread in accordance with this intent.

-------------

Mitch Bradley makes some comments as a contribution to the discussion about "Why Postscript is not Forth". The quotes below marked # are from his article 2226.

I. When a computer scientist talks about the formal properties of Forth there is always a tendency for users to read in negative connotations which may, or may not, be either intended or even important. There is an air about the way comments are made that gives them weight. What Forth programmer does not shrink in shame to learn that 'D' (the successor to 'C' being heavily pushed by Yamaha which bought out AT&T) has orthogonally tangential control structures, while Forth does not?

Computer scientists have as sure a belief that the power of a computer language comes from an accumulation of formal properties as programmers do that it comes from subjective factors: how the language "feels" and how expressive it is.

Forth (and other semi-compiled or interpreted languages) seems to be taken to task for the lack of attributes which are more appropriate in the world of compiled languages. Since so few languages provide an interactive environment, there is apparently little recognition of the fact that getting a language to run interactively and efficiently requires a different set of implementation decisions than designing a language for pure compilation.

One case in point is typing of data objects (primarily important for error trapping and for the overloading of operator symbols). Some find it objectionable that a Forth system has several names for addition operations: + and D+ together with, perhaps, F+ in a basic system and then $+ for concatenation of strings, and, if you are doing math programming, perhaps also an MD+ for addition mod p, MAT+ for matrices, P+ for polynomials. Compiled languages allow all addition-like operations to be performed by the same symbol + because the decision of which + to perform can be made by the compiler. In an interpreted language, where the decision usually cannot be made until run time (experiments with multiple CFAs notwithstanding), a choice must be made as to whether or not to overload symbols.

As Mitch Bradley points out, Postscript chooses to type its objects and allow overloading of symbols -- which results in substantial run time penalties:

# 1) Operator overloading. Many of PostScript's basic operators operate
# on multiple data types. The operator (e.g. "add") must test its operands
# at run time and decide what how to handle them. In some cases, it
# may be possible to optimize some of this at compile time, but the
# compiler's ability to do so is somewhat compromised by the default
# late binding (see 3), and the "visibility" of innards of the compiled
# procedure.

APL also does this -- but the overloading is restricted to system-defined data types. Interpreted BASIC (dare I mention it) also does this.

Overloading of operator symbols requires typing of objects. The idea that it is desirable is inherited from our experience with compiled languages, where it does not exact a runtime penalty. The use of overloading in interactive languages exacts a substantial runtime penalty -- it is one of the reasons that people believe that interactive languages must be slow.

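The run-time test that overloading requires can be sketched in a few lines of C. This is only an illustration under stated assumptions: it is not code from any PostScript or Forth implementation, and the names Obj, Tag, add, plus and f_plus are invented here. The overloaded add has to examine a type tag on each operand every time it is called, while the separately named operations (in the spirit of + and F+) do no test at all, because the programmer chose the operation by name.

```c
#include <stdint.h>

/* Hypothetical tagged cell standing in for a typed stack object. */
typedef enum { T_INT, T_REAL } Tag;

typedef struct {
    Tag tag;
    union { int32_t i; double r; } v;
} Obj;

static Obj make_int(int32_t i) { Obj o; o.tag = T_INT;  o.v.i = i; return o; }
static Obj make_real(double r) { Obj o; o.tag = T_REAL; o.v.r = r; return o; }

/* Overloaded "add": inspects both tags on every call.
   (Integer overflow is ignored in this sketch.) */
static Obj add(Obj a, Obj b)
{
    if (a.tag == T_INT && b.tag == T_INT)
        return make_int(a.v.i + b.v.i);

    /* At least one operand is real: promote both to real. */
    double x = (a.tag == T_INT) ? (double)a.v.i : a.v.r;
    double y = (b.tag == T_INT) ? (double)b.v.i : b.v.r;
    return make_real(x + y);
}

/* One operation per type: no tags stored, no test executed. */
static int32_t plus(int32_t a, int32_t b) { return a + b; }  /* like +  */
static double  f_plus(double a, double b) { return a + b; }  /* like F+ */
```

Calling add(make_int(2), make_real(0.5)) goes through two tag comparisons and a conversion before any arithmetic happens; plus(2, 3) is just the addition. Every further type the overloaded operator has to accept adds more tests on the same path.
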
Part of the examination of the "essence of Forth" should involve looking at the distinction between compiled and interpreted languages. One must ask questions like "Is it really better or easier to write programs if one is allowed to confuse several conceptually different operations by giving them the same name?"

[Since, in traditionally implemented Forth, the children of the same defining word have the same address in their code field, some typing information is available for those who want to experiment. My own conclusion is that overloading symbols does not facilitate the programming I do -- and certainly does not justify the runtime penalty.]

P.S. In a private communication, Mitch Bradley has reminded me that many of the things I have attributed to Charles Moore actually should be credited to the FIG team that first defined the model that popularized Forth. So let me acknowledge the wisdom of Bill Ragsdale, John James, Kim Harris, and others who were part of this effort.

Later in this article, Mitch Bradley will say "I claim that 'Forth-like' means 'ad-hoc'" and I will respond by saying "It is ad hoc in the best sense -- it is designed to produce the best practical solution to the problem at hand".

The traditional Forth approach to overloading symbols is not to do it (because of runtime penalties) but to provide the means if it is deemed necessary. The same applies to range checking of arrays, having functions check the number and types of their arguments, etc. I think part of the "essence of Forth" is to provide tools to build appropriate features rather than provide features which may or may not be appropriate.

II. Bradley, in describing Postscript vs Forth, says:

# As stated above, the visibility of the stack is a similarity, and the
# fact that the stack elements are typed abstract objects in PostScript
# and untyped bit patterns in Forth is a fundamental difference.

I think this is misleading.
It's like saying that your checkbook is an organized collection of accounts and debits while the arithmetic calculations on the paper next to me are a collection of curves and lines made with a pencil.

It is quite possible to program in Forth in such a way that the programmer visualizes the objects on the stack as abstract objects: I have two strings on the stack -- I SWAP them, then take the left three characters of the string on top of the stack and concatenate the two. (The fact that what is really on the stack are untyped bit patterns is incidental -- the fact that these bit patterns are pointers to data which occupy a segment of memory provided by a dynamic allocation scheme was something I thought about when I wrote the string package -- the fact that the end result is that there are STRINGS on the STACK which I can manipulate with consistent Forth semantics is what is important to me as a programmer.)

Can a programmer in Forth think about stack elements as if they were abstract objects? -- YES.
Can a Forth programmer think about stack elements as other things? -- YES.
Does the language force the Forth programmer to think of stack elements as untyped bit patterns? -- NO.

III.

# Obviously, this depends on a subjective judgement of what "Forth-like"
# means. I claim that "Forth-like" means "ad-hoc".

Here I strongly agree. Forth was designed pragmatically rather than as an embodiment of someone's theory of programming languages. It was designed by a programmer to solve programming problems. It is ad hoc in the best sense -- it is designed to produce the best practical solution to the problem at hand.

# Forth is rarely
# consistent about anything. The naming is inconsistent, the syntax is
# reverse polish except when it isn't, there are 4 different kinds of
# strings, there are some obvious comparison operators that just happen
# to be missing.

Here I disagree. I think the semantics involved in Forth are fairly consistent.
Some of what has been posted recently regarding the supposed dual PRE/POST-fix nature of Forth syntax stems from a desire to have undeclared strings be a fundamental data type. If undeclared strings are not a fundamental type then they are (consistently) preceded by a handler. If I treat horses differently than chickens, I'm only inconsistent to people who think I should treat all animals the same way. To the best of my knowledge, I have never treated a chicken like a horse. (Sometimes I do treat horses like chickens -- but that's just my "ad hoc" nature 8-)

# No wonder Forth doesn't have a good standard string package; nobody
# can figure out which string representation to use. Besides which,
# the lack of dynamic memory allocation facilities makes it pretty hard
# to figure out where to put them.

Forth is fairly consistent in storing its strings as counted strings. (I don't think that null-terminated, for example, is ever used when treating the string as a unit -- although the Forth-79 EXPECT produced a null-terminated string.) The fact that a string can be treated as a unit (e.g. can be pointed to by a single address) is important in allowing standard stack manipulation words to be used. The fact that TYPE uses an address and count extends its usefulness; COUNT should not be thought of as producing a different representation, just as a way of generating parameters for a word that can also be used for other things.

An examination of the limited string manipulation which takes place within the Forth system ignores the mechanisms which can be and have been successfully used by Forth programmers to add string handling capability to their systems. Forth does not contain a string package. Forth supports a variety of string packages which can be added on.

III. I think that one of my greatest problems is dealing with the view "if it isn't built in to the language it doesn't exist".
I keep finding people who tell me that "you can't do strings in Forth" (in spite of the fact that MMS sells a commercial-quality word processor and a database system written in Forth and I have somehow managed to use strings in my own work [I started before I knew I wasn't supposed to be able to do it]).

There is a wide range of string-handling capabilities that someone might want in Forth, ranging from none to substantial. A moderately comprehensive string package can be added to Forth using about 4 screens. If you want to treat the strings in the same way as other objects, you need a storage management scheme taking another 4 screens. This will be enough to meet the needs of about 85% of Forth programmers (and does not deny the validity of the needs of the other 15% who either need no strings on one hand, or a full-blown memory allocation with garbage collection system on the other).

The nature of Forth permits features to be added on which would, in other languages, need to be built in. It is silly, both in thought and standardization efforts, to treat Forth as a rigid language. Part of the "essence" is its ability to accept features as "changeable parts".

IV. In response to Bradley's listing of 4 different string types: It is quite unlikely that people reject Forth because they find that, somewhere in its innards, an implementation team found it useful to tag the ends of the name field (thereby producing your example 4 of inconsistent ways strings have been handled). I think it is more likely that people reject (or leave) Forth because it has been rendered non-portable. Not because of genuine hardware problems, but by the unwillingness of vendors to realize that portability is more important than exercising their eccentricities.

If you want a string package in Forth, the best first step would be to have the Forth community get its act together and make the language highly portable again.
This would allow people to produce a variety of string packages (among other useful things) that will run on everyone's Forth [and therefore have a larger market].

V.

# (BTW, I am pretty sick of all this "Zen philosophy" nonsense (weakness
# is strength, dah-dah, dah-dah) regarding Forth; we are talking about a
# programming language that deals with physically real hardware, whose
# success or failure is ultimately determined by economics, measured in
# real money. We are not talking about a way in which to live your life
# and interact with your fellow man and achieve spiritual salvation.)

I somewhat agree with Bradley to the extent that I have discovered that no one is really interested in the religious experiences that you have had with your programming language. This, however, should not deny the fact that the most important thing about a programming language is the way programmers feel about using it. We may be talking about "physically real hardware" but we are also talking about "physically real programmers".

I'd like to discourage some people before I go on: To those of you who think you will make megabucks by exploiting Forth: IT WON'T HAPPEN -- GO FIND ANOTHER LANGUAGE.

This being said, we now are rid of all the people who want to add to Forth inappropriate characteristics because it will sell (regardless of whether it is healthy for the language). We can talk about "essence" where it counts: what is a language like when you use it to write programs.

There are, in fact, subjective characteristics of a programming language which are an important part of the man-machine interface. These are no less important because we do not know how to articulate them. I am impressed if someone tells me of a language by saying: "It's the only language where I feel I can do anything I want -- I can conceive of something and carry it out, and not be limited by what the language lets me do".
I'm less impressed by someone who tells me "Its control structures are orthogonally tangential". I guess that's because I'm a programmer and not a computer scientist.

A group of people do not sit down together and say "let's put together a language that feels like Zen Buddhism". Instead, a group of people are confronted with a language -- and they find that it feels so different that they can only describe the experience in religious terms. If someone becomes involved with Forth and does not have this sense about it, it may be because we have managed to remove from Forth its essence. If you do have this sense about Forth, I agree with Mitch that you probably will not help the cause by beating other people over the head with it. You can't spread a religion by talking about it -- you have to put other people in a position to experience it for themselves.

Of course the Forth community is not taking the steps to have others experience it for themselves. It is not taking the steps needed to have the language taught (or even teachable). It is not making it possible for those likely to teach the language to use it for their work (which is the prelude to using it in teaching). This is a far-reaching issue -- it has to do with many things (none of which involves the fact that name fields are stored in the dictionary with high bits set at both ends).

John J Wavrik
jjwavrik@ucsd.edu
Dept of Math C-012
Univ of Calif - San Diego
La Jolla, CA 92093

P.S. The Forth community may owe an apology to people who are genuine Zen Buddhists -- I don't think the analogy comes from a deep study of Zen. Basically it is an expression that simple things can be amazingly powerful and complex things can be very weak.