Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!bionet!ames!ucsd!ucsbcsl!eiffel!bertrand
From: bertrand@eiffel.UUCP (Bertrand Meyer)
Newsgroups: comp.lang.eiffel
Subject: Feature names (Eiffel tip of the week #2)
Keywords: Consistency
Message-ID: <137@eiffel.UUCP>
Date: 10 May 89 05:57:00 GMT
Organization: Interactive Software Engineering, Santa Barbara CA
Lines: 139


    An important aspect of object-oriented design of reusable components
is the proper choice of names for exported features of each class.
The basic rule is that these names should be both simple (which usually
implies that they should be short) and chosen according to consistent
conventions.

    One consequence is that one should resist the temptation to
over-qualify names. For example, a procedure for inserting elements into a
dictionary should not be called ``insert_in_dictionary'' or
``dictionary_insert'', but (barring any better choice, as discussed
below) just ``insert''.

    This would not necessarily be true in a less strongly typed language,
because of the ambiguities and errors that might result if the same simple
names (insert, delete, put, ...) are used in many different classes. In
Eiffel, however, typing averts these problems. When you see

    d.insert (...)

the type of d (as declared in the class in which this appears) immediately
tells you which ``insert'' is meant.
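
    As a small sketch of this point, consider a client with two entities
of different types. DICTIONARY and WORD_SET are hypothetical classes made
up for the illustration (they are not library classes), and the argument
lists of their ``insert'' procedures are assumptions as well; the routine
bodies are left empty since only the declarations matter here.

    class DICTIONARY export insert feature
            -- Hypothetical class, for illustration only

        insert (key, value: STRING) is
                -- Body omitted in this sketch.
            do
            end -- insert

    end -- class DICTIONARY

    class WORD_SET export insert feature
            -- Hypothetical class, for illustration only

        insert (word: STRING) is
                -- Body omitted in this sketch.
            do
            end -- insert

    end -- class WORD_SET

    class CLIENT feature

        d: DICTIONARY;

        s: WORD_SET;

        fill is
                -- Call two different routines that share the name insert.
            do
                d.Create;
                s.Create;
                d.insert ("eiffel", "a language");
                        -- d is a DICTIONARY: DICTIONARY's insert
                s.insert ("eiffel")
                        -- s is a WORD_SET: WORD_SET's insert
            end -- fill

    end -- class CLIENT

Nothing in the feature names says which structure is involved; the
declarations of d and s carry that information.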

    These ideas were applied to the design of the Basic Eiffel Library.

    We recently took a closer look at naming conventions for the library,
however, especially after some criticisms were made regarding their
consistency (see the presentation by John Anderson of Cognos
at the recent Eiffel conference in Paris).
For version 2.2 we have decided to take an extremist approach to name
consistency by focusing on a small number of names, especially for
``container'' classes (those which describe data structures used as
repositories of objects, such as sets, arrays, lists, etc.). Examples of
these basic names are

    at       (for accessing an element)
    put      (for inserting an element)
    force    (same as ``put'', but also works in cases where put would
             fail; for arrays, for example, put only works for indices
             within the current bounds, whereas force applied to an
             out-of-bounds index will silently resize the array.
             This feature of arrays was previously called ``enter_force'')

and so on. The names are used consistently, but the corresponding routines
do not necessarily have identical signatures; for example:

    at (index: INTEGER): T        in class ARRAY [T]:
                                    access to element through its index

    at: T                         in class STACK [T] and its descendants:
                                    access to top element

    at (key: U): T                in class H_TABLE [T, U -> HASHABLE]:
                                    access to element through its key

and so on.
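
    From the client's side, the three versions of ``at'' and the
difference between ``put'' and ``force'' might look as follows. This is
only a sketch under several assumptions that this message does not
settle: the class AT_CLIENT is hypothetical, the arguments passed to
Create are guesses, the argument order of ``put'' and ``force''
(value first, then index or key) is assumed, and FIXED_STACK (an
array-based descendant of STACK) is assumed to export ``put''.

    class AT_CLIENT feature
            -- Hypothetical client, for illustration only

        a: ARRAY [INTEGER];

        s: FIXED_STACK [INTEGER];

        h: H_TABLE [INTEGER, STRING];

        n: INTEGER;

        demo is
                -- Use the same feature names on three containers.
            do
                a.Create (1, 10);
                        -- Assumed: bounds 1 and 10
                a.put (5, 3);
                        -- Assumed order: value 5 at index 3
                a.force (8, 20);
                        -- Index 20 is out of bounds: array is resized
                n := a.at (3);
                        -- ARRAY: access through an integer index

                s.Create (10);
                        -- Assumed: capacity 10
                s.put (7);
                n := s.at;
                        -- STACK: no argument, access to top element

                h.Create (50);
                        -- Assumed: initial size 50
                h.put (42, "answer");
                        -- Assumed order: value 42 under key "answer"
                n := h.at ("answer")
                        -- H_TABLE: access through the key
            end -- demo

    end -- class AT_CLIENT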

    Of course synonyms may be needed for client programmers who want
more specific terminology. In class STACK and its descendants, for example,
a function called ``top'' is still available (as it was before) and
yields the same result as ``at''. 


    When different classes are combined through multiple inheritance,
identically named features will be distinguished through renaming. For
example, the implementation of stacks by arrays is of the form

class FIXED_STACK [T] export

    at, ...

inherit

    ARRAY [T]
        rename
            at as array_at,
            ...

    STACK [T]
feature


    nb_elements: INTEGER;
            -- Redefined from STACK as an attribute

    at: T is
            -- Last element pushed;
            -- same as top.
        require
            not_empty: not empty
        do
            Result := array_at (nb_elements)
        end; -- at

    ...

end -- class FIXED_STACK


    Again, the typed nature of the language is essential here to make sure
that any error due to a confusion between two identically named features
(for example ``at'' from ARRAY and ``at'' from FIXED_STACK) is caught right
away by the compiler.
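
    For instance, a client of the FIXED_STACK class sketched above that
mistakenly uses the array-style signature gets a compile-time error. The
client class STACK_CLIENT is hypothetical, and both the argument to
Create and the availability of ``put'' on FIXED_STACK are assumptions
made for the illustration.

    class STACK_CLIENT feature
            -- Hypothetical client, for illustration only

        s: FIXED_STACK [INTEGER];

        x: INTEGER;

        demo is
                -- Show which calls to at the compiler accepts.
            do
                s.Create (10);
                        -- Assumed: Create takes the capacity
                s.put (7);
                x := s.at
                        -- Valid: at of FIXED_STACK takes no argument

                -- x := s.at (1)
                        -- Rejected if uncommented: s is declared as a
                        -- FIXED_STACK, whose at has no argument; the
                        -- indexed version belongs to ARRAY
            end -- demo

    end -- class STACK_CLIENT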

    As a result, the vocabulary of recommended feature names for the
library will significantly decrease. (I use the term ``recommended names''
because the old ones are usually kept as synonyms for compatibility; in a
forthcoming message I will describe the 2.2 ``obsolete'' facility which
helps in this respect.)

    It might be argued, of course, that the use of the same name for
operations with different signatures (such as the three versions of ``at''
above) is confusing for programmers of client classes. We considered
this argument, but it does not seem to hold on closer inspection.
Regardless of the names chosen, the client programmer who needs
to access elements in arrays and stacks as well as hash tables
(to continue using this example) must somehow master the information that:

    - For an array you must provide an integer index.

    - For a stack you don't provide any argument since you can only
      access the last element pushed (top).

    - For a hash table you must provide the key, which must be of the
      ``hashable'' type declared for the table (e.g. STRING).

    Some effort is needed to understand and remember this information.
If in addition the routine names are different, the effort required is
higher, not lower. If instead you can rely on the systematic convention
that, regardless of the data structure, standard access is always
called ``at'', standard addition of an element is always called ``put'',
and so on, then you can concentrate on learning the really meaningful
differences: the signatures of the operations.


-- 

-- Bertrand Meyer
bertrand@eiffel.com