Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!usc!ucsd!hub.ucsb.edu!eiffel!bertrand From: bertrand@eiffel.UUCP (Bertrand Meyer) Newsgroups: comp.lang.eiffel Subject: Re: Stating typing in Eiffel Summary: Reposting of July 1989 message. Part 2 of 2. Message-ID: <386@eiffel.UUCP> Date: 29 Jul 90 19:55:15 GMT References: <384@eiffel.UUCP> <385@eiffel.UUCP> <1990Jul26.221405.23439@Neon.Stanford.EDU> <382@eiffel.UUCP> <1990Jul27.163928.5886@Neon.Stanford.EDU> <264@eiffel.UUCP> <259@eiffel.UUCP> <259@eiffel.UUCP Organization: Interactive Software Engineering, Santa Barbara CA Lines: 520 STATIC TYPING FOR EIFFEL Bertrand Meyer January 1989 (updated, July 1989) (Minor revisions, July 1990) ****** POSTED IN TWO PARTS; THIS IS POSTING 2 ******** *** Please prepend posting 1 *** 5 THE TYPE RULE The type rule will be introduced under a number of simplifying assumptions, meant to keep the discussion clear. 5.1 Simplifications An Eiffel type will be taken to be defined by a class. This includes in fact two simplifications: + ``Expanded types'', which in particular include the basic types INTEGER, REAL etc., are not covered. These, however, are not involved in polymorphism, and hence do not raise any of the problems discussed above. + Generically parameterized types are also not covered. Genericity, essential in practical uses of Eiffel, does not fundamentally affect the current discussion. A consequence of this simplification is that ``B conforms to A'' is the same relation as ``B is a descendant of A''. For the purpose of this discussion, Eiffel may be viewed as having only two primitives: assignment and feature application. An assignment is of the form x := y and is a reference assignment. The left-hand side is an entity; the right-hand side is an expression. A feature application is either ``remote'' or ``local'', corresponding to the two forms x.f (a, b, ...) f (a, b, ...) where feature f may be an attribute or a routine. All feature applications will be considered to be remote applications of routines with exactly one argument, of the form: x.r (a) meaning ``Apply routine r to the object associated with x, if any, with the given arguments''. This simplification does not entail any loss of generality: + For local feature applications, we can consider that x is Current, the predefined entity representing the current object. + A routine with no argument may always be rewritten as having one argument, not used in the body. + For a feature f which is an attribute, we can write an associated function r with one argument (not used), returning the value of the attribute, and replace all applications of f by applications of r. + For a routine with more than one argument, the rules on a as given below are easily extended to rules applying to every formal argument and the corresponding actual arguments. The above does not consider ``multidot'' feature applications of the form a.b.c.d.... Although useful in practice, they can safely be ignored from a theoretical viewpoint since they can always be replaced by sequences of one-dot applications by using auxiliary variables (x := a.b; y := x.c; etc.) More generally, we may consider that every expression (usable as source for an assignment or actual argument to a routine) is either an entity or a feature application of the above form, where r must be a function. As a last simplification, we assume that no renaming is used, so that a feature defined in a class has the same name in all descendant classes, including those where it is redefined. This entails no loss of generality since renaming is a purely syntactic facility, albeit a very important one in practice. The only case for which renaming is indispensable is removing name clashes in multiple inheritance; for the purpose of this discussion we can assume that the original classes are rewritten so as to avoid any such case. In the same spirit we assume that when a routine is redefined the name of its formal argument (whose type may be redefined according to the covariant rule) does not change. 5.2 The goal of typing The preceding simplifications make it possible to express the goal of Eiffel typing in a complete yet simple way: _________________________________________________ Type safety constraint: An Eiffel system is said to be type-safe if for any routine application x.r (a), and for any dynamic types DX and DA that x and a may take during the execution of the system, DX includes and exports a routine corresponding to r, with a formal argument declared of a type which is an ancestor of DA. _________________________________________________ A system satisfying this constraint cannot suffer type failures at run time. It can still, however, suffer other failures, most notably by attempting to apply a routine r to a reference x that happens to be void. But this, as the division by zero case discussed at the beginning of this article, is not within the scope of typing strategies. Clearly, the contravariant rule, combined with either of policies 1 and 2 for export and inheritance, guarantees the type safety constraint. We must now see how to ensure it with the more liberal rules: policy 3 and the covariant rule. 5.3 The type rule The fundamental type rule will address the preceding requirements; it replaces (by strengthening it) the first rule of section 3.2. The rule uses the set Aliases (x) which, for any entity x in a system, contains all entities to which x may become aliased through assignment or routine call. A precise definition of this set will be given below. The type rule is the following: _________________________________________________ Type rule: An Eiffel system is type-correct if and only if the following two conditions are satisfied for any routine application x.r (z): + For every y in Aliases (x), Sy exports r. + For every y in Aliases (x) and every t in Aliases (z), St conforms to the type of the formal argument of r in Sy . _________________________________________________ Pending a proper definition of Aliases, it is clear that this rule will satisfy the above type constraint. 5.4 Aliasing It remains to define Aliases (x) precisely. A more correct notation would be Aliases [S] (x) denoting the possible aliases of x with respect to a given system (set of classes) S. In what follows S will be considered to be fixed and the subscript will be dropped. Repeating the definition of a few basic concepts will be useful. + An entity is a name declared in a class, representing a reference which may be assigned values at run-time. Formally, then, an entity of name x appearing in class C is defined by the pair . For simplicity, entities will in fact be designated below given by their entity names only, but we must be careful to make it clear, for every x, what class C it belongs to. C will be called the class of x + Any entity x is declared of a certain type, called the static type of x and written Sx. The declaration may appear in the class of the entity or, for an inherited entity, in an ancestor. + At any time during execution, the type of the object associated with a non-void entity is called the dynamic type of the entity and written Dx. The second basic type rule (section 3.2) guarantees that the dynamic type conforms to the static type. + The definition of entities includes the predefined entity Result, denoting the result of a function. To resolve any ambiguity, the function name will be used in this case as name for the entity. Aliases is introduced through two binary relations. First entity y is said to be assignable to entity x if and only if any of the following assignments appears in the system: A.1 + x := y A.2 + x := b.y (t) for any b and t. In the second case, the relation holds not only for the entity y declared in Sb but also to the corresponding entities in all the descendants of this class. Next, y is said to be bindable to x if and only if one of the following calls appears in S: B.1 + a.r (y) B.2 + a.r (b.y(t)) In both cases the relation holds whenever x is the formal argument for r in any descendant of Sa; in the second case the relation extends to all y in any descendant of class Sb. We say that y is a direct alias for x if y is assignable to x or bindable to x. Aliases (x) may now be defined by transitive reflexive closure: for any entity x, Aliases (x) is the set containing x itself and all entities y such that, for some z, y is a direct alias for z and (recursively) z belongs to Aliases (x). The above definitions consider all assignments and routine calls present in the system regardless of whether they may or may not be executed in actual runs of the system. So no flow analysis will be needed to enforce the type rule; as announced in 4.1, this will mean a pessimistic policy, although a less pessimistic one than policies 1 and 2. (The inclusion of all descendants of Sa and Sb in the above rules is also pessimistic; the aim here was to keep the definitions simple at no significant loss of useful systems. The interested reader may wish to examine how these conditions can be relaxed.) 6 IMPLEMENTING THE RULE [This section is only relevant for readers who are interested in the details of the incremental verification mechanism. Others are invited to proceed to the last section.] As introduced so far, the computation of Aliases (x) for every entity of a system does not take into account the order of compilation. As discussed in section 2.4, an essential requirement is the ability for the automatic incremental compilation mechanism to check a new class without re-checking the classes on which it depends - its suppliers and ancestors - if they have already been checked. Although the Eiffel incremental recompilation algorithm is not in the public domain, we may outline the strategy here. The first observation is that case B.2, which seems the most formidable one, can safely be ignored. By conceptually introducing a new entity z we can rewrite the corresponding call as z := b.y(t); a.r (z) making B.2 a combination of A.2 and B.1. The incremental compilation requirement precludes the straightforward implementation of type checking, whereby we would compute Aliases (x) for every entity x, and then examine every feature call x.f (z) to see if it satisfies the type rule. True, this is appropriate for A.2: whenever this case yields a new direct alias y for x, then the class of x is a client of the class of y, so that the set Aliases (x) may be updated in an order compatible with the compilation order. But this does not work for case B.1, where it is in fact the class of y which is a client of the class of x. (In case A.1 x and y belong to the same class.) As a consequence we must adopt a mixed strategy; the sets to be examined must be computable and adaptable in an order compatible with the dependency relation (and hence the compilation order). For every entity x, these sets will be: + Remote_aliases (x), the set of y for which A.2 holds; for convenience, this set also contains x. + Required_exports (x), the set of routines r which must be exported by the class of x because the system contains a call z.f (t) for some z such that x is in Aliases (z). + Required_types (x), the set of types C to which the Sx must conform because the system contains a call z.f (t), where f requires a formal argument of type C, for some z such that x is in Aliases (z). The strategy to follow conceptually involves four steps for each new class. Step 1 initializes the above sets for each entity x of the class: the first set is initialized to the set containing just x itself; the second, to the set of f such that x.f (...) appears in the class; the last, to the types of the corresponding arguments. Step 2 propagates aliasing information to descendants. Whenever a new class HEIR is defined as heir to an existing one PARENT, step 2 performs the following for every entity x of HEIR (where x' is the corresponding entity in PARENT): + Remote_aliases (x) := Remote_aliases (x) U Remote_aliases (x') + Required_exports (x) := Required_exports (x) U Required_exports (x') + Required_types (x) := Required_types (x) Required_types (x') This step implements the rule given in the previous section for both the ``assignable'' and ``bindable'' relations: any alias of an entity of class PARENT is considered to be also an alias of the corresponding entities in descendants of PARENT. For each new class, step 3 will perform the incremental computation of the three above sets and step 4 will verify that these sets satisfy the type rule. As will be seen below, these two steps may be performed simultaneously on each class. Step 3, when encountering any of the first three types of potential aliasing introduced above, must update the sets as follows: A.1 + Required_exports (y) := Required_exports (y) U Required_exports (x) Required_types (y) := Required_types (y) Required_types (x) A.2 + Remote_aliases (x) := Remote_aliases (x) U Remote_aliases (y) B.1 + Required_exports (y) := Required_exports (y) U Required_exports (x) Required_types (y) := Required_types (y) Required_types (x) In each of these cases the updating has been chosen so that it only affects some sets A (x) for an entity x in the new class, not in its (possibly old) suppliers. Using the above updates, the given sets are computed incrementally in an order which is compatible with inter-class dependencies, so that the computation may be integrated with the automatic compiling mechanism. Once this has been done, step 4 establishes the type rule for every new class C by checking the following two properties for every entity x in C: P1 + Every routine r in Required_exports (y), for every y in Remote_aliases (x), is exported by C. P2 + Sa belongs to Required_types (y) for all y in Remote_aliases (x). In practice, step 4 may be performed at the same time as step 3 on each class C. Property P1 may be verified incrementally by checking in cases A.1 and B.1 that r belongs to Required_exports (y). P2 is verified incrementally by checking in cases A.1 and B.1 that Sa belongs to Required_types (y). This incremental checking assumes that some initial verification has been integrated with steps 1 and 2. The details are easily filled. It is interesting to note that this initialization of the complete checking process is precisely the set of verifications performed by the current Eiffel compiler, reflecting the first and second rules described in section 3.2. As this discussion as shown, the full checking strategy is an extension to this partial approach. 7 ASSESSMENT To conclude, it is useful to examine a few possible objections to the rules and enforcement strategy described above. A first possible objection is that the algorithm sketched in the previous section is too difficult to integrate with the current Eiffel compiler. Of course, the only way to refute this objection would be to present the implementation, which is still some time away. The objection can only be made, however, by someone who does not realize the sophistication of some of the tasks already performed by the current Eiffel compiler (for example the implementation of repeated inheritance). A related objection addresses the cost of the new checking, especially in space. The precise impact needs to be assessed more carefully, but I believe the checking can reasonably be integrated within the normal compiling mechanism. If not, the mechanism would still be available as a compiler option a` la Lint, to be applied to systems at validation time. Perhaps the most serious objection has to do with the relative complexity of the mechanism described in the preceding section. But this mechanism is of no interest to normal users of the language; only implementors (and others doubtful of the soundness and enforceability of the type rules) need to understand it. Eiffel programmers should only learn the type rule of section 5.3. This rule is only slightly more complex than the old type rule (first rule of section 3.2), which implies it; the advantage of the new rule should be clear to programmers reading a simplified form of the ``rationale'' section of this article (4). An attractive property of the incremental strategy described in the last section is that it appears to make it possible for the checker to produce clear and precise error messages, of the form ``In line m, feature f is applied to x; however, because of the assignment on line n, x might dynamically be of type D, which does not export f.'' If, as seems to be the case, the Eiffel compiler can reach this level of sophistication, then we will have achieved our goal. The purpose of a typed language is the same as the purpose of the programming environment of which it is a part: to help programmers, not stand in their way. The sole purpose of type restrictions is to boost programmer productivity by enabling an automatic checker to catch errors that would be difficult for a human being to detect. The liberal typing policy designed for Eiffel (``policy 3'') was designed to make programmers' life easier. The type failures to which it may lead are results of abnormal situations. Detecting such situations statically is a tricky and tedious business; as any such business, it should be handled by computers, not by humans. Acknowledgments I am deeply indebted to all the people who contributed to the type debate, including those with whose conclusions I disagree, for encouraging me to clarify the matter of types in Eiffel. As far as I can remember, Mike Ladner from Schmidt Associates was the first to point out the problems clearly, in private discussions with me. Apart from Philippe Elinck's student thesis, Pierre America was the first to discuss them in print (in his review of my book), and his article, together with William Cook's article, may still be the best exposition of the ``opposing'' view. I also thank Ralph Johnson for sharing with me (in private mail) some useful insights gained from his experience with Typed Smalltalk. [1] Pierre America, "Review of `Object-Oriented Software Construction'", Science of Computer Programming, 1989. [2] Luca Cardelli, "A Semantics of Multiple Inheritance", in Semantics of Data Types, ed. Gilles Kahn, David B. McQueen and Gordon Plotkin, pp. 51-67, Springer-Verlag, Berlin-New York, 1984. [3] William Cook, "A Proposal for Making Eiffel Type-Safe", ECOOP Conference, 1989. [4] Philippe Elinck, "De la Conception-Programmation par Objets", Me'moire de Licence en Informatique (BS Thesis), Universite' Libre de Bruxelles, Brussels, 1988. [5] Bertrand Meyer, Object-Oriented Software Construction, Prentice-Hall, 1988. [6] Bertrand Meyer, "Eiffel: The Language", Technical Report TR-EI-17/RM, Interactive Software Engineering Inc., Santa Barbara (Calif.), August 1989. (To be published by Prentice-Hall in 1990.) [7] Bertrand Meyer, An Introduction to the Theory of Programming Languages, Prentice-Hall, 1989. To appear. [8] Bertrand Meyer, "Eiffel: An Introduction", Technical Report TR-EI-3/GI, Interactive Software Engineering Inc., Santa Barbara (Calif.), 1986 (Last revision: 1990). [9] Craig Schaffert, Topher Cooper, Bruce Bullis, Mike Kilian and Carrie Wilpolt, "An Introduction to Trellis-Owl", in OOPSLA '86 Conference Proceedings, Portland (Oreg.), Sept. 29-Oct. 2, 1986, pp. 9-16, 1986. (Published as SIGPLAN Notices, 21, 11, Nov. 1986.) -- -- Bertrand Meyer bertrand@eiffel.com