Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!clyde.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!wuarchive!emory!hubcap!Steven From: zenith@ensmp.fr (Steven Ericsson-Zenith) Newsgroups: comp.parallel Subject: Heretics and Infidels(Part 1): Zenith replys to Gelernter. Message-ID: <12482@hubcap.clemson.edu> Date: 4 Jan 91 18:18:18 GMT Sender: fpst@hubcap.clemson.edu Lines: 185 Approved: parallel@hubcap.clemson.edu In-Reply-To: Steve Stevenson's message of Wed, 19 Dec 90 13:53:17 -0500 <9012191853.AA24710@hubcap.clemson.edu> A number of people, I know, are awaiting this posting which is essentially in reply to the statements made by David Gelernter in a posting here on the 19th of December. I know that a number of points have been raised by others in response to my original, they are not ignored - primarily, most were (I believe) misunderstandings due to the brevity with which, in this forum, a case must be made. In many respects this reply suffers the same inadequacy even though it is LONG. If things remain unclear after this post perhaps we could have a discussion offline and I will summarize for the list later. An original questioner (hshi@maytag.waterloo.edu) wrote to ask (I paraphrase): "if there is some parallel programming programming model between shared variables and message passing, and if there is such a model would it be more suitable for present computer architectures." Paolo Ciancarini wrote (Dec 13th) in reply that he believed Linda's Generative Communication through tuple space constituted a further "different model". I disagreed with Paolo's claim for Linda's distinction by noting fundamental similarities between all three paradigms where: "... the persistent nature of the intermediary objects [used for data sharing between processes] significant disinquishing characteristic.". Thus I generalized by encompassing shared variables, message passing objects (channels, mailboxes), and Linda's tuples as components which distinguish "variants of the same model". Again, these components express how *processes* share data. A *process*, for the purposes of this discussion, is a sequence whose interaction with other *processes* is completely described by some "Process Interaction Model". A "Process Interaction Model", in addition to the aforementioned "intermediary objects" includes whatever other mechanism you may have for *process synchronization*. This mechanism includes the synchronization characteristics of the operations (e.g. "in", "rd" etc. in Linda) acting upon the mentioned "intermediary objects" and any explicit constructions for process synchronization (e.g. || in CSP), or other constructions effecting the synchronization characteristics of operations (such as choice in CSP). I then said: "It would seem desirable to have all process interactions defined by the associated "interaction model"" and pointed out that "as things stand in Linda processes are able to side-effect each other directly, i.e. outside of the tuple space model -- at least this is the case for the Yale C-Linda model. The process model in Linda is vague to say the least, with the semantics of process creation varying according to implementation." This prompted David Gelernter to say: False. The semantics of process creation are described in the "How to" book, in exactly the same way they've been described in the literature for years. Zenith is correct in stating that there are Linda implementations (not ours) that don't follow our version of the semantics. But that's not to say that the definition doesn't exist, is it? Well, I'm very familar with this literature and have, in a small way, contributed to it. I and I suspect many others, haven't yet read David's book since it was only published in the past few months (but it was worth the plug anyway Dave :-) ). So I do not speak with regard to that particular publication. My observation is this: the fact that (I hesitate to say "all") other implementations do not follow the version of the "definition" in the literature is testimony to the vagueness of the "definition". Not only that, I quote from the literature in which the semantics of process creation have supposedly been described "in exactly the same way" "for years". First let us consider Nick Carriero's thesis "The implementation of Tuple Space machines" published in December 1987, in which (on page 4) he says: "There are significant unanswered questions about *eval*. Some of the most important concern variable bindings in [a process]. Can [a process] have unbound variables? If it can, when will these variables be bound? At the time *eval* executes or at the time [the process] executes?" And in the most comprehensive and least dogmatic work of it's kind on Linda I've seen todate, Jerry Leichter's thesis published in July 1989 devotes a section to "The problem of eval". In it he says (pages 27-28): "However, producing an exact specification for [eval] that is both natural within the C environment, and implementable with reasonable efficiency, is difficult," He then in subsections deals with the issue of closures and proposes a semantics for eval (page 31). They're too long to repeat them here. But to summarize, the rules attempt to guarantee that the created processes will not side effect, with clear statements about the treatment of pointers. He at least makes statements like: "In no case is any information passed back from the execution environment to the creating environment, except by explicit tuple space operations." Which I take to mean that free variables cannot be changed by any of the many things (including pointers) C uses to change the value of a variable. Leichter later states when discussing "Alternative approaches to eval", (page 279) that: "The eval operation does not fit comfortably into C, nor into most sequential languages." In a much more recent and advanced piece of work "Semantics and Analysis of First Class Tuple Spaces" published April 1990, Suresh Jagannathan expertly attempts to shore up the walls of uncertainty with some formalism (at last!). I quote here his informal statement of eval semantics: "[eval] deposits a tuple consisting of [some number of] processes into [tuple space]; the ith process is responsible for computing the value of [the respective component of the tuple]. Each process evaluates within its own private environment. To achieve this, all functions applied within eval are transformed into closed combinators that reference no free variables." In fact, if we return to the original appearance of eval in the IEEE paper "Parallel programming in Linda" published in 1985, for which both Gelernter and Carriero are accredited authors, we find the following statement regarding eval: "The process that is implicitly created to do the evaluating has available to it the value of each name appearing in [the tuple] as that name was defined in the callers environment when eval was executed. (If such a name is a function or procedure name, it's value is an executable expression). This protocol amounts to an "unevaluated" variant of call-by-value. Note that it is an error to invoke, within an eval tuple, a function whose body refers to non-local variables - their values won't be available to the process that does the evaluation." Indeed, I am aware that Suresh used this statement as a basis for his work - as should be apparent. It has also been pointed out to me that: "there is .. a research report by Jim Narem [a member of the Linda Group] that attempts to provide an operational semantics of existing implementations; it described three different Yale eval implementations, each with very different semantics." I regret that I do not have a copy of and do not recall this piece of the literature. There are several other works which I've failed to mention. So to summarize my observation of the literature - the first *real* definition that I find is that produced by Suresh Jagannarthan in April of this past year which superficially concurrs with an earlier informal statement by Gelernter. However, the statement in the 1985 paper is in the context of the following statement from the paper: "Linda is essentially identical to C." In which case the treatment of pointers has been left to a now uncertain statement about names. Where Suresh's formal description explictly excludes names which are "addresses". Much of the "vagueness" to which I refered really results from comments in Carriero's thesis - which is unfortunate since it is one of the most influential documents on the implementation of Linda. The adopted Yale C-Linda product is based on this work. The real problem however, concerns how names that are pointers are treated (this won't come as much of a surprise) - only Leichter's thesis confronts this problem earlier. I can here some crys now - wait a minute! You're confusing implementation with the definition! Indeed, that is what has happened. In the face of only informal definition and contradictory "definitive" implementations, "vagueness" and uncertainty abound. It should be no surprise that when others attempted to apply the model to other host languages that the semantics of process creation were by necessity different - and thus the confusion continues. (continued in part 2) -- Steven Ericsson Zenith * email: zenith@ensmp.fr * fax: (1).64.69.47.09 | | voice: (1).64.69.48.52 Ecole Nationale Superieure des Mines de Paris 35 Rue Saint-Honore 77305 Fontainebleau Cedex France "All see beauty as beauty only because they see ugliness" LaoTzu