Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!usc!ucsd!hub.ucsb.edu!eiffel!bertrand
From: bertrand@eiffel.UUCP (Bertrand Meyer)
Newsgroups: comp.lang.eiffel
Subject: Re: Stating typing in Eiffel
Summary: Reposting of July 1989 message. Part 2 of 2.
Message-ID: <386@eiffel.UUCP>
Date: 29 Jul 90 19:55:15 GMT
References: <384@eiffel.UUCP> <385@eiffel.UUCP> <KAPLAN.90Jul27123145@scooby.cs.umass.edu> <1990Jul26.221405.23439@Neon.Stanford.EDU> <382@eiffel.UUCP> <1990Jul27.163928.5886@Neon.Stanford.EDU> <264@eiffel.UUCP> <259@eiffel.UUCP> <Ec.3251@cs.psu.edu> <259@eiffel.UUCP
Organization: Interactive Software Engineering, Santa Barbara CA
Lines: 520


                     STATIC TYPING FOR EIFFEL

                         Bertrand Meyer

                January 1989 (updated, July 1989)

                   (Minor revisions, July 1990)


        ****** POSTED IN TWO PARTS; THIS IS POSTING 2 ********

                  *** Please prepend posting 1 ***


5 THE TYPE RULE


The type rule will be introduced under a  number  of  simplifying
assumptions, meant to keep the discussion clear.


5.1 Simplifications


An Eiffel type will be taken to  be  defined  by  a  class.  This
includes in fact two simplifications:

     + ``Expanded types'', which in particular include the  basic
       types INTEGER, REAL etc., are not covered. These, however,
       are not involved in polymorphism, and hence do  not  raise
       any of the problems discussed above.

     + Generically parameterized  types  are  also  not  covered.
       Genericity,  essential  in  practical uses of Eiffel, does
       not  fundamentally  affect  the  current   discussion.   A
       consequence of this simplification is that ``B conforms to
       A'' is the same relation as ``B is a descendant of A''.

     For the purpose of this discussion, Eiffel may be viewed  as
having only two primitives: assignment and feature application.

     An assignment is of the form
     x := y

and is a reference assignment.  The left-hand side is an  entity;
the right-hand side is an expression.

     A feature application is  either  ``remote''  or  ``local'',
corresponding to the two forms
     x.f (a, b, ...)
     f (a, b, ...)

where feature f may be an attribute or a routine.

     All feature applications will be  considered  to  be  remote
applications of routines with exactly one argument, of the form:
     x.r (a)

meaning ``Apply routine r to the object  associated  with  x,  if
any,  with  the  given arguments''.  This simplification does not
entail any loss of generality:

     + For local feature applications, we can consider that x  is
       Current,  the  predefined  entity representing the current
       object.

     + A routine with no argument  may  always  be  rewritten  as
       having one argument, not used in the body.

     + For a feature f which is an attribute,  we  can  write  an
       associated  function  r  with  one  argument  (not  used),
       returning the value of  the  attribute,  and  replace  all
       applications of f by applications of r.

     + For a routine with more than one argument, the rules on  a
       as  given  below  are easily extended to rules applying to
       every  formal  argument  and  the   corresponding   actual
       arguments.

     The   above   does   not   consider   ``multidot''   feature
applications of the form a.b.c.d.... Although useful in practice,
they can safely be ignored from  a  theoretical  viewpoint  since
they  can always be replaced by sequences of one-dot applications
by using auxiliary  variables  (x := a.b;  y := x.c;  etc.)  More
generally,  we  may  consider  that  every  expression (usable as
source for an assignment or actual  argument  to  a  routine)  is
either  an  entity  or  a  feature application of the above form,
where r must be a function.

     As a last simplification, we  assume  that  no  renaming  is
used,  so  that a feature defined in a class has the same name in
all descendant classes, including those where  it  is  redefined.
This  entails  no  loss  of generality since renaming is a purely
syntactic facility, albeit a very important one in practice.  The
only  case  for  which renaming is indispensable is removing name
clashes  in  multiple  inheritance;  for  the  purpose  of   this
discussion  we can assume that the original classes are rewritten
so as to avoid any such case. In the same spirit we  assume  that
when  a  routine  is  redefined  the  name of its formal argument
(whose type may be redefined according  to  the  covariant  rule)
does not change.


5.2 The goal of typing


The preceding simplifications make it  possible  to  express  the
goal of Eiffel typing in a complete yet simple way:


          _________________________________________________


            Type safety constraint:  An  Eiffel  system  is
         said   to  be  type-safe  if  for  any  routine
         application x.r (a), and for any dynamic  types
         DX  and  DA  that  x  and a may take during the
         execution  of  the  system,  DX  includes   and
         exports  a  routine  corresponding to r, with a
         formal argument declared of a type which is  an
         ancestor of DA.

          _________________________________________________


     A system  satisfying  this  constraint  cannot  suffer  type
failures  at  run  time.  It  can  still,  however,  suffer other
failures, most notably by attempting to apply a routine  r  to  a
reference x that happens to be void. But this, as the division by
zero case discussed at the beginning  of  this  article,  is  not
within the scope of typing strategies.

     Clearly, the contravariant rule,  combined  with  either  of
policies  1 and 2 for export and inheritance, guarantees the type
safety constraint. We must now see how to ensure it with the more
liberal rules: policy 3 and the covariant rule.


5.3 The type rule


The  fundamental   type   rule   will   address   the   preceding
requirements; it replaces (by strengthening it) the first rule of
section 3.2. The rule uses the set  Aliases (x)  which,  for  any
entity x in a system, contains all entities to which x may become
aliased through assignment or routine call.  A precise definition
of this set will be given below.

     The type rule is the following:


           _________________________________________________


            Type rule: An Eiffel system is type-correct  if
         and  only  if  the following two conditions are
         satisfied for any routine application x.r (z):


               + For every y in Aliases (x), Sy    exports r.


               + For every y in Aliases (x) and every t  in
         Aliases (z),  St     conforms  to  the type of the
         formal argument of r in Sy   .

           _________________________________________________


     Pending a proper definition of Aliases,  it  is  clear  that
this rule will satisfy the above type constraint.


5.4 Aliasing


It remains to  define  Aliases (x)  precisely.   A  more  correct
notation would be Aliases [S] (x) denoting the possible aliases of x
with respect to a given system  (set  of  classes)  S.   In  what
follows  S  will be considered to be fixed and the subscript will
be dropped.

     Repeating the definition of a few  basic  concepts  will  be
useful.

     + An entity is a name declared in a  class,  representing  a
       reference  which  may  be  assigned  values  at  run-time.
       Formally, then, an entity of name x appearing in  class  C
       is  defined  by  the pair <C, x>. For simplicity, entities
       will in fact be designated below  given  by  their  entity
       names  only,  but we must be careful to make it clear, for
       every x, what class C it belongs to.  C will be called the
       class of x

     + Any entity x is declared of a  certain  type,  called  the
       static  type  of  x  and  written Sx.  The declaration may
       appear in the class of the entity  or,  for  an  inherited
       entity, in an ancestor.

     + At any time during  execution,  the  type  of  the  object
       associated  with  a  non-void entity is called the dynamic
       type of the entity and written Dx.  The second basic  type
       rule  (section  3.2)  guarantees  that  the  dynamic  type
       conforms to the static type.

     + The definition of entities includes the predefined  entity
       Result,  denoting the result of a function. To resolve any
       ambiguity, the function name will be used in this case  as
       name for the entity.

     Aliases is introduced through two  binary  relations.  First
entity  y is said to be assignable to entity x if and only if any
of the following assignments appears in the system:

A.1
     + x := y

A.2
     + x := b.y (t) for any b and t.

     In the second case, the relation  holds  not  only  for  the
entity y declared in Sb but also to the corresponding entities in
all the descendants of this class.

     Next, y is said to be bindable to x if and only  if  one  of
the following calls appears in S:

B.1
     + a.r (y)

B.2
     + a.r (b.y(t))

     In both cases the relation holds whenever x  is  the  formal
argument  for  r  in any descendant of Sa; in the second case the
relation extends to all y in any descendant of class Sb.

     We say that y is a direct alias for x if y is assignable  to
x or bindable to x.

     Aliases (x) may  now  be  defined  by  transitive  reflexive
closure:  for  any  entity x, Aliases (x) is the set containing x
itself and all entities y such that, for some z, y  is  a  direct
alias for z and (recursively) z belongs to Aliases (x).

     The above definitions consider all assignments  and  routine
calls present in the system regardless of whether they may or may
not be executed in actual runs of the system. So no flow analysis
will  be  needed  to  enforce the type rule; as announced in 4.1,
this will mean a pessimistic policy, although a less  pessimistic
one  than  policies 1 and 2. (The inclusion of all descendants of
Sa and Sb in the above rules is also pessimistic;  the  aim  here
was  to  keep  the  definitions  simple at no significant loss of
useful systems. The interested reader may  wish  to  examine  how
these conditions can be relaxed.)


6 IMPLEMENTING THE RULE


[This section is only relevant for readers who are interested  in
the details of the incremental verification mechanism. Others are
invited to proceed to the last section.]

     As introduced so far, the  computation  of  Aliases (x)  for
every  entity of a system does not take into account the order of
compilation.   As  discussed  in  section   2.4,   an   essential
requirement   is   the  ability  for  the  automatic  incremental
compilation mechanism to check a new  class  without  re-checking
the  classes  on which it depends - its suppliers and ancestors -
if  they  have  already  been  checked.   Although   the   Eiffel
incremental  recompilation algorithm is not in the public domain,
we may outline the strategy here.

     The first observation is that case B.2, which seems the most
formidable   one,   can   safely   be  ignored.  By  conceptually
introducing a new entity z we can rewrite the corresponding  call
as
     z := b.y(t);
     a.r (z)

making B.2 a combination of A.2 and B.1.

     The  incremental  compilation  requirement   precludes   the
straightforward implementation of type checking, whereby we would
compute Aliases (x) for every entity x, and  then  examine  every
feature call x.f (z) to see if it satisfies the type rule.  True,
this is appropriate for A.2: whenever  this  case  yields  a  new
direct  alias  y  for  x,  then the class of x is a client of the
class of y, so that the set Aliases (x)  may  be  updated  in  an
order  compatible  with the compilation order.  But this does not
work for case B.1, where it is in fact the class of y which is  a
client of the class of x. (In case A.1 x and y belong to the same
class.)

     As a consequence we must adopt a mixed strategy; the sets to
be  examined  must  be  computable  and  adaptable  in  an  order
compatible  with  the  dependency   relation   (and   hence   the
compilation order). For every entity x, these sets will be:

     + Remote_aliases (x), the set of y for which A.2 holds;  for
       convenience, this set also contains x.

     + Required_exports (x), the set of routines r which must  be
       exported  by  the class of x because the system contains a
       call z.f (t) for some z such that x is in Aliases (z).

     + Required_types (x), the set of types C  to  which  the  Sx
       must  conform  because the system contains a call z.f (t),
       where f requires a formal argument of type C, for  some  z
       such that x is in Aliases (z).

     The strategy to follow conceptually involves four steps  for
each new class.

     Step 1 initializes the above sets for each entity x  of  the
class:  the first set is initialized to the set containing just x
itself; the second, to the set of f such that x.f  (...)  appears
in  the  class;  the  last,  to  the  types  of the corresponding
arguments.

     Step  2  propagates  aliasing  information  to  descendants.
Whenever  a  new class HEIR is defined as heir to an existing one
PARENT, step 2 performs the following for every entity x of  HEIR
(where x' is the corresponding entity in PARENT):

     +    Remote_aliases (x)    :=    Remote_aliases (x)        U
       Remote_aliases (x')

     +   Required_exports (x)   :=   Required_exports (x)       U
       Required_exports (x')

     +    Required_types (x)    :=    Required_types (x)
       Required_types (x')

     This step implements the rule given in the previous  section
for both the ``assignable'' and ``bindable'' relations: any alias
of an entity of class PARENT is considered to be also an alias of
the corresponding entities in descendants of PARENT.

     For each new class, step  3  will  perform  the  incremental
computation  of  the three above sets and step 4 will verify that
these sets satisfy the type rule. As will be  seen  below,  these
two steps may be performed simultaneously on each class.

     Step 3, when encountering any of the first  three  types  of
potential  aliasing  introduced  above,  must  update the sets as
follows:

A.1
     +   Required_exports (y)   :=   Required_exports (y)       U
       Required_exports (x)

       Required_types (y)     :=     Required_types (y)
       Required_types (x)

A.2
     +    Remote_aliases (x)    :=    Remote_aliases (x)        U
       Remote_aliases (y)

B.1
     +   Required_exports (y)   :=   Required_exports (y)       U
       Required_exports (x)

       Required_types (y)     :=     Required_types (y)
       Required_types (x)

     In each of these cases the updating has been chosen so  that
it only affects some sets A (x) for an entity x in the new class,
not in its (possibly old) suppliers.

     Using  the  above  updates,  the  given  sets  are  computed
incrementally  in  an  order which is compatible with inter-class
dependencies, so that the computation may be integrated with  the
automatic compiling mechanism.

     Once this has been done, step 4 establishes  the  type  rule
for  every  new  class C by checking the following two properties
for every entity x in C:

P1
     + Every routine r in Required_exports (y), for  every  y  in
       Remote_aliases (x), is exported by C.

P2
     +  Sa  belongs  to   Required_types (y)   for   all   y   in
       Remote_aliases (x).

     In practice, step 4 may be performed at  the  same  time  as
step 3 on each class C. Property P1 may be verified incrementally
by  checking  in  cases  A.1  and   B.1   that   r   belongs   to
Required_exports (y).   P2  is verified incrementally by checking
in cases A.1 and B.1 that Sa belongs to Required_types (y).

     This  incremental  checking  assumes   that   some   initial
verification  has been integrated with steps 1 and 2. The details
are  easily  filled.  It  is  interesting  to  note   that   this
initialization  of the complete checking process is precisely the
set of verifications performed by the  current  Eiffel  compiler,
reflecting  the  first and second rules described in section 3.2.
As this discussion as shown, the full  checking  strategy  is  an
extension to this partial approach.


7 ASSESSMENT


To conclude, it is useful to examine a few possible objections to
the rules and enforcement strategy described above.

     A first possible objection is that the algorithm sketched in
the  previous  section  is  too  difficult  to integrate with the
current Eiffel compiler. Of course, the only way to  refute  this
objection  would be to present the implementation, which is still
some time away. The objection  can  only  be  made,  however,  by
someone  who  does  not realize the sophistication of some of the
tasks already performed  by  the  current  Eiffel  compiler  (for
example the implementation of repeated inheritance).

     A related objection addresses the cost of the new  checking,
especially in space. The precise impact needs to be assessed more
carefully,  but  I  believe  the  checking  can   reasonably   be
integrated  within  the  normal  compiling mechanism. If not, the
mechanism would still be available as a  compiler  option  a`  la
Lint, to be applied to systems at validation time.

     Perhaps the most  serious  objection  has  to  do  with  the
relative  complexity  of the mechanism described in the preceding
section. But this mechanism is of no interest to normal users  of
the  language;  only  implementors  (and  others  doubtful of the
soundness  and  enforceability  of  the  type  rules)   need   to
understand it. Eiffel programmers should only learn the type rule
of section 5.3. This rule is only slightly more complex than  the
old  type rule (first rule of section 3.2), which implies it; the
advantage of the new rule should be clear to programmers  reading
a  simplified  form  of the ``rationale'' section of this article
(4). An attractive property of the incremental strategy described
in  the  last  section is that it appears to make it possible for
the checker to produce clear and precise error messages,  of  the
form
     ``In line m, feature f is applied to x; however, because  of
     the  assignment on line n, x might dynamically be of type D,
     which does not export f.''

     If, as seems to be the case, the Eiffel compiler  can  reach
this  level  of  sophistication,  then  we will have achieved our
goal. The purpose of a typed language is the same as the  purpose
of  the  programming  environment  of which it is a part: to help
programmers, not stand in their way. The  sole  purpose  of  type
restrictions  is  to boost programmer productivity by enabling an
automatic checker to catch errors that would be difficult  for  a
human being to detect.

     The liberal typing policy designed for Eiffel (``policy 3'')
was  designed to make programmers' life easier. The type failures
to  which  it  may  lead  are  results  of  abnormal  situations.
Detecting  such  situations  statically  is  a tricky and tedious
business;  as  any  such  business,  it  should  be  handled   by
computers, not by humans.


Acknowledgments

     I am deeply  indebted  to  all  the people  who  contributed
to the type debate, including  those  with  whose  conclusions  I
disagree, for encouraging me to clarify the matter  of  types  in
Eiffel.   As  far  as  I  can  remember, Mike Ladner from Schmidt
Associates was the first to point out the  problems  clearly,  in
private discussions with me. Apart from Philippe Elinck's student
thesis, Pierre America was the first to discuss them in print (in
his  review  of  my book), and his article, together with William
Cook's  article,  may  still  be  the  best  exposition  of   the
``opposing''  view.   I also thank Ralph Johnson for sharing with
me (in  private  mail)  some  useful  insights  gained  from  his
experience with Typed Smalltalk.

[1]  Pierre  America,  "Review   of   `Object-Oriented   Software
     Construction'", Science of Computer Programming, 1989.

[2]  Luca Cardelli, "A Semantics  of  Multiple  Inheritance",  in
     Semantics  of  Data Types, ed. Gilles Kahn, David B. McQueen
     and Gordon Plotkin, pp. 51-67,  Springer-Verlag,  Berlin-New
     York, 1984.

[3]  William Cook, "A  Proposal  for  Making  Eiffel  Type-Safe",
     ECOOP Conference, 1989.

[4]  Philippe  Elinck,  "De   la   Conception-Programmation   par
     Objets",  Me'moire  de  Licence en Informatique (BS Thesis),
     Universite' Libre de Bruxelles, Brussels, 1988.

[5]  Bertrand  Meyer,  Object-Oriented   Software   Construction,
     Prentice-Hall, 1988.

[6]  Bertrand Meyer, "Eiffel:  The  Language",  Technical  Report
     TR-EI-17/RM,  Interactive  Software  Engineering Inc., Santa
     Barbara  (Calif.),  August  1989.   (To  be   published   by
     Prentice-Hall in 1990.)

[7]  Bertrand Meyer, An Introduction to the Theory of Programming
     Languages, Prentice-Hall, 1989.  To appear.

[8]  Bertrand Meyer, "Eiffel: An Introduction", Technical  Report
     TR-EI-3/GI,  Interactive  Software  Engineering  Inc., Santa
     Barbara (Calif.), 1986 (Last revision: 1990).

[9]  Craig Schaffert, Topher Cooper, Bruce  Bullis,  Mike  Kilian
     and  Carrie  Wilpolt,  "An  Introduction to Trellis-Owl", in
     OOPSLA '86 Conference Proceedings, Portland  (Oreg.),  Sept.
     29-Oct.  2,  1986,  pp.  9-16,  1986.  (Published as SIGPLAN
     Notices, 21, 11, Nov. 1986.)
-- 
-- Bertrand Meyer
bertrand@eiffel.com