Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!ucsd!hub!eiffel!bertrand From: bertrand@eiffel.UUCP (Bertrand Meyer and Philippe Stephan) Newsgroups: comp.lang.eiffel Subject: The indexing clause Keywords: Industry of reusable software components Message-ID: <180@eiffel.UUCP> Date: 14 Jul 89 05:50:15 GMT Organization: Interactive Software Engineering, Santa Barbara CA Lines: 170 THE INDEXING CLAUSE Bertrand Meyer, Philippe Stephan Interactive Software Engineering Inc. bertrand@eiffel.com stephan@eiffel.com As mentioned in the recent message about innovations in 2.2, Eiffel classes may now begin with an ``indexing clause''. This note explains the purpose of the clause and raises some questions on which input from the Eiffel community will be welcome. The general principle of documenting Eiffel software is that as much of the documentation as possible should be within the software texts (that is to say, the class texts) themselves. Documentation tools (such as the ``flat'' and ``short'' commands) and class retrieval tools, also known as browsers (such as the ``eb'' and ``GOOD'' commands), should use these texts as their primary source of information. Some properties of class designs, however, are of a higher level of abstraction than what is usually expressed in the class text proper. They include indexing categories, descriptions of design and implementation decisions, references to algorithms or data structures as published in the literature etc. The indexing clause is meant to record them in a standardized format for use by tools for documentation, archival and retrieval. Although the tools of the environment make only limited use of this clause in version 2.2, we believe it will play a major part in the future. We think Eiffel will be the basis for the first real industry of software components, which should be the major development in software technology over the next fifteen years. But as hundreds of reusable software components become available, it will be essential to have a systematic way of indexing and retrieving them. The indexing clause should be the one of the key mechanisms for achieving this goal. The syntactical format is simple. If present, the clause comes at the beginning of a class and has the following form: indexing index: value, value, value, ...; index: value, value, value, ... ... class (etc.) The indices are identifiers; the values are identifiers or manifest constants (integers, strings, reals). In each subclause, the part ``index:'' is optional, so that some indexing values may be given without index. Each of the semicolon-separated components is called an entry of the indexing clause. Thanks to this clause, it should be possible to retrieve archived classes using query languages that express queries based on pairs, and the associated tools. Because of the very nature of the clause, the choice of indices and values is free. Obviously, we should not overly restrict the developers' choice in this area. If users are to be able to formulate meaningful and successful queries, however, it will be necessary to define consistent and systematic indexing conventions. Since we wanted the Data Structure Library, one of the key elements of the Basic Eiffel Library, to be properly ``indexed'' for 2.2, we have defined a first set of such conventions. They are not the result of very deep thinking, but seem to succeed in capturing the essential properties of the data structure library classes. We offer them here for discussion and improvement. It's too late for the improvements to affect 2.2, but history does not stop here. Suggestions and criticism will be welcome. Some of the guidelines were the following: + Keep the indexing clauses short (2 to 5 entries is typical). + Avoid repeating information which is in the rest of the class text. + Use a set of standardized indices for properties that apply to many structures (such as choice of representation). + For values, define a set of standardized possibilities for the common cases. + Include positive information only. For example, a ``representation'' index is used to describe the choice of representation (linked, array, ...). A deferred class does not have a representation. For such a class the clause should not contain an entry ``representation: none'' but simply no entry with the index ``representation''. A reasonable query language will make it possible to use a query pair of the form , where $NONE$ is a special value indicating absence. The indices chosen for the 2.2 data structure library are the following, along with the most frequent values. An entry of index ``names'' is used to record the names under which the corresponding data structures are known. Although a class has only one official name, the abstraction it implements may be known under other names. For example, a ``list'' is also known as a ``sequence''. An entry of index ``access'' records the mode of access of the data structures. Standard values include: ``fixed'' (only one element is accessible at any given time, as in a stack or queue); ``fifo'' (first-in-first-out policy); ``lifo'' (last-in-first-out); ``index'' (access by an integer index); ``key'' (access by a non-integer key); ``cursor'' (access through a client-controlled cursor, as with the list classes); ``membership'' (availability of a membership test); ``min, max'' (availability of operations to access the minimum or the maximum). Obviously, more than one of these values may be used in the same ``access'' entry. An entry of index ``size'' indicates a size limitation. Value ``fixed'' means the size of the structure is fixed at Create time and cannot be changed later (there are few such cases in the library). Value ``resizable'' means that an initial size is chosen but the structure may be resized (possibly at some cost) if it outgrows that size. For extendible structures without size restrictions this entry should not be present. An entry of index ``representation'' indicates a choice of representation. Value ``array'' indicates representation by contiguous, direct-access memory areas. Value ``linked'' indicates a linked structure. An entry of index ``contents'' is appropriate for container data structures. It indicates the nature of the contents. Possible values include generic (for generic classes), int, real, bool, char (for classes representing containers of objects of basic types). For example, the ARRAY_LIST class describes lists implemented by one or more arrays, chained to each other. The clause in this case is: indexing names: list, sequence; representation: array, linked; -- In this case it is both! access: fixed, cursor; size: resizable; contents: generic