Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!bionet!ames!ucsd!ucsbcsl!eiffel!bertrand
From: bertrand@eiffel.UUCP (Bertrand Meyer)
Newsgroups: comp.lang.eiffel
Subject: Feature names (Eiffel tip of the week #2)
Keywords: Consistency
Message-ID: <137@eiffel.UUCP>
Date: 10 May 89 05:57:00 GMT
Organization: Interactive Software Engineering, Santa Barbara CA
Lines: 139

An important aspect of object-oriented design of reusable components
is the proper choice of names for exported features of each class.
The basic rule is that these names should be both simple (which
usually implies that they should be short) and chosen according to
consistent conventions.

One consequence is that one should resist the temptation to
over-qualify names. For example a procedure for inserting elements
into a dictionary should not be called ``insert_in_dictionary'' or
``dictionary_insert'', but (barring any better choice, as discussed
below) just ``insert''.

This would not necessarily be true in a less typed language because
of ambiguities and errors that might result if the same simple names
(insert, delete, put, ...) are used in many different classes. In
Eiffel, however, typing averts these problems. When you see

    d.insert (...)

the type of d (as declared in the class in which this appears)
immediately tells you which ``insert'' is meant.

These ideas were applied to the design of the Basic Eiffel Library.
We recently took a closer look at naming conventions for the library,
however, especially after some criticisms were made regarding their
consistency (see the presentation by John Anderson of Cognos at the
recent Eiffel conference in Paris).

For version 2.2 we have decided to take an extremist approach to name
consistency by focusing on a small number of names, especially for
``container'' classes (those which describe data structures used as
repositories of objects, such as sets, arrays, lists etc.).
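
To make the earlier point about typing concrete, here is a minimal
sketch. It assumes two hypothetical classes, DICTIONARY and WORK_LIST,
each exporting a procedure called ``insert''; neither these classes
nor their signatures are taken from the library. The declared type of
each entity is what tells the reader, and the compiler, which
``insert'' is meant:

```eiffel
class CLIENT export
    register
feature
    d: DICTIONARY [STRING, STRING];
            -- Hypothetical class; assumed to export insert (value, key)

    w: WORK_LIST [STRING];
            -- Hypothetical class; assumed to export insert (element)

    register (word, meaning: STRING) is
            -- Record word in both structures.
            -- (d and w are assumed to have been created elsewhere.)
        do
            d.insert (meaning, word);
                -- ``insert'' of DICTIONARY, since d is a DICTIONARY
            w.insert (word)
                -- ``insert'' of WORK_LIST, since w is a WORK_LIST
        end -- register
end -- class CLIENT
```

Under these assumptions a call such as w.insert (meaning, word) would
be rejected at compile time, since the argument count does not match
the ``insert'' of WORK_LIST.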

Examples of these basic names are

    at      (for accessing an element)

    put     (for inserting an element)

    force   (same as ``put'', but will work in cases in which put
            might have failed; for arrays, for example, put only
            works for indices between the current bounds, whereas
            force applied to an out-of-bounds index will silently
            resize the array. This feature of arrays was previously
            called ``enter_force'')

and so on. The names are used consistently, but the corresponding
routines do not necessarily have identical signatures; for example:

    at (index: INTEGER): T
        in class ARRAY [T]:
        access to element through its index

    at: T
        in class STACK [T] and its descendants:
        access to top element

    at (key: U): T
        in class H_TABLE [T, U -> HASHABLE]:
        access to element through its key

and so on.

Of course synonyms may be needed for client programmers who want more
specific terminology. In class STACK and its descendants, for example,
a function called ``top'' is still available (as it was before) and
yields the same result as ``at''.

When different classes are combined through multiple inheritance,
identically named features will be distinguished through renaming.
For example the implementation of stacks by arrays is of the form

    class FIXED_STACK [T] export

        at, ...

    inherit

        ARRAY [T]
            rename at as array_at, ...

        STACK [T]

    feature

        nb_elements: INTEGER;
                -- Redefined from STACK as an attribute

        at: T is
                -- Last element pushed;
                -- same as top.
            require
                not_empty: not empty
            do
                Result := array_at (nb_elements)
            end; -- at

        ...

    end -- class FIXED_STACK

Again, the typed nature of the language is essential here to make
sure that any error due to a confusion between two identically named
features (for example ``at'' from ARRAY and ``at'' from FIXED_STACK)
is caught right away by the compiler.

As a result, the vocabulary of recommended feature names for the
library will significantly decrease.
(I use the term ``recommended names'' because the old ones are usually
kept as synonyms for compatibility; in a forthcoming message I will
describe the 2.2 ``obsolete'' facility which helps in this respect.)

It might be argued, of course, that the use of the same name for
operations with different signatures (such as the three versions of
``at'' above) is confusing for programmers of client classes. We
considered this argument but it does not seem to hold on closer
inspection. Regardless of the names chosen, the client programmer who
needs to access elements in arrays and stacks as well as hash tables
(to continue using this example) must somehow master the information
that:

    - For an array you must provide an integer index.

    - For a stack you don't provide any argument since you can only
      access the last element pushed (top).

    - For a hash table you must provide the key, which must be of
      ``hashable'' type defined for the table (e.g. STRING).

Some effort is needed to understand and remember this information.
If in addition the routine names are different, the effort required
is higher, not lower. If instead you can rely on the systematic
convention that regardless of the data structure standard access is
always called ``at'', standard addition of an element is always
called ``put'' and so on, then you can concentrate on learning the
really meaningful differences: the signatures of the operations.
--
-- Bertrand Meyer
bertrand@eiffel.com