Path: utzoo!news-server.csri.toronto.edu!rutgers!sun-barr!newstop!exodus!rbbb.Eng.Sun.COM!chased From: chased@rbbb.Eng.Sun.COM (David Chase) Newsgroups: comp.std.c++ Subject: Tags, typecodes, experience with these things. Summary: Calm down a bit, please Message-ID: <9661@exodus.Eng.Sun.COM> Date: 12 Mar 91 19:30:05 GMT References: <1399@culhua.prg.ox.ac.uk> <27D3E544.619A@tct.uucp> <4338@lupine.NCD.COM> <1991Mar12.123811.13701@kodak.kodak.com> Sender: news@exodus.Eng.Sun.COM Organization: Sun Microsystems, Mt. View, Ca. Lines: 91 I've been trying to follow this discussion, but there's a bit too much heat and a little too little content. Could someone please sit down and take the time to summarize what appear to be the two positions? Now, a small contribution. I implemented the C-generating back-end for Modula-3 back when Olivetti had a research center in Menlo Park. The language specified a "TYPECODE" for each type. These were implemented with substantial support from the language system. The major differences (for purposes of this discussion) between M-3 and C++ are: 1) C++ has multiple inheritance, Modula-3 does not. 2) Modula-3 has more "interesting" ways to do information-hiding (Strictly speaking, the compiler might not know where member fields or functions are located. This can be resolved at (pre-)link time.) 3) M-3 has no constructors or destructors (yet). 4) In M-3, ALL member functions are virtual. Things to note: a) If all member functions are virtual, the run-time cost for typecodes is as small as it can get. That information can be stored in the virtual function table at little cost. b) The (known) lookup algorithms for single inheritance are much simpler and cheaper than those for multiple inheritance. "Is x a T" can be implemented with two comparisons. "TYPECASE" is more complex, but take time O(log # cases). The data structure implementing these things can be updated in the event of dynamically-loaded code (with new and old types in it) in a multi-threaded system, with minimal synchronization requirements (readers must see writes in the same order than writers make them -- NARROW sometimes takes longer in the event that a race is lost, but it never lies). c) TYPECASE is NOT equivalent to "switch". d) TYPECODES are NOT necessarily repeated from run to run. Thus, they are not useful by themselves for saving and restoring pickled data. (What if you relink in a different order? The data should still be good, etc.) We used this information to help implement a "pickling" (storing data to disk) system. The pre-linker also generated "fingerprints" which were in fact large (63-bit?) numbers carefully hashed from the structure of a type, and these served to identify types from run to run. (We cheated, and did not also write out the type strings, so there was a vanishingly small probability of our algorithm being thwarted.) At run-time, the interface went from [type or object] to typecode to type-fingerprint. The typecodes might vary, but the fingerprints did not. To read the data back in, fingerprints were mapped to typecodes which gave access to a (low-level) interface to virtual function tables, object size information, user-defined (*) unmarshalling code, etc. (*) the default marshalling and unmarshalling code was automatically generated by a language-processing tool. It is also possible to perform a slower interpreted unmarshalling based on the structure of the type. Again, this information is stored behind an interface which makes use of TYPECODEs. A second use for typecodes was in implementing a verifiable type case ("NARROW") and TYPECASE. One can argue that TYPECASE is not object-oriented (I understand that argument, and TYPECASE clearly becomes less necessary if you have correctly working multiple inheritance), but NARROW appears to be necessary fairly often in C++ code (perhaps less so when templates are added, but I don't have any experience with that, so I can't be sure). We didn't use typecodes for garbage collection, but that was because (i) it would have been too inefficient and (ii) we never had the time to do that much engineering of the garbage collector. In principle, however, the information to help the GC could be attached to the virtual function table along with the typecode. ---------------------------------------------------------------- In summary, TYPECODEs were fantastically useful, even if their only use was to provide a nice name for a type in the interface to run-time information about types. Please consider the value of an interface to run-time type information that actually hides some of the implementation details. I also think that the language system should support these features, but that is probably a religious argument. Certainly, if nifty features are confined to objects with vptrs the costs can be kept low. David Chase Sun