Xref: utzoo comp.std.c++:670 comp.lang.c++:11980
Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!usc!elroy.jpl.nasa.gov!decwrl!infopiz!lupine!rfg
From: rfg@NCD.COM (Ron Guilmette)
Newsgroups: comp.std.c++,comp.lang.c++
Subject: Re: type/member tags (was Re: asking an object for its type)
Message-ID: <4201@lupine.NCD.COM>
Date: 2 Mar 91 23:01:25 GMT
References: <27C2D973.3C1B@tct.uucp> <27C95D3A.1715@tct.uucp>
Followup-To: comp.std.c++
Organization: Network Computing Devices, Inc., Mt. View, CA
Lines: 116

I am cross-posting this response to comp.lang.c++ since it contains
information about a general problem that concerns many C++ programmers
(and not just those who are worried about the evolving standard).

In article <27C95D3A.1715@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>According to dsouza@optima.cad.mcc.com (Desmond Dsouza):
>>Here are a few examples where you need to know the type of an object:
>
>"Need?"  That's a red flag for me...  :-)
>
>>1. Persistent objects: When reading in one of these from disk, you
>>   need to know what constructor to call. Hence you need to encode in
>>   the persistent image the ClassId of the object.
>
>Presumably, a well-designed class hierarchy will use a virtual
>function to store objects; and a virtual function by definition will
>already know the exact type of the object it is storing.

As Joe Buck pointed out, storing (or transmitting) an object is *not*
a problem which requires any sort of special identification of the type
of an object.  Retrieving (or receiving) a hunk of data which represents
the previous contents of some unknown type of thing is another matter.
It requires us to use some agreed-upon scheme whereby the transmitter
sends a unique code with the data block to indicate to the receiver
what type of C++ object the transmitted data came from.

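To make the asymmetry concrete, here is a minimal sketch of such a
scheme.  It is not the ES-kit code.  The Shape, Circle and Square
classes, the TypeCode values and the whitespace-separated "wire" format
are all hypothetical, invented purely for illustration, and the code
assumes a C++11 (or later) compiler.  Storing goes through a virtual
function and needs no outside help; retrieving must read the agreed-upon
code before it can choose a constructor.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type codes: sender and receiver must agree on these values.
enum TypeCode : std::uint32_t { kCircle = 1, kSquare = 2 };

struct Shape {
    virtual ~Shape() {}
    // Storing is the easy half: the virtual call already knows the exact
    // type, so each class simply writes its own code ahead of its data.
    virtual void store(std::ostream& out) const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    int radius;
    explicit Circle(int r) : radius(r) {}
    void store(std::ostream& out) const override {
        out << static_cast<std::uint32_t>(kCircle) << ' ' << radius << ' ';
    }
    std::string name() const override { return "Circle"; }
};

struct Square : Shape {
    int side;
    explicit Square(int s) : side(s) {}
    void store(std::ostream& out) const override {
        out << static_cast<std::uint32_t>(kSquare) << ' ' << side << ' ';
    }
    std::string name() const override { return "Square"; }
};

// Retrieving is the hard half: no object exists yet, so there is nothing
// to make a virtual call on.  Read the code first, then pick a constructor.
std::unique_ptr<Shape> retrieve(std::istream& in) {
    std::uint32_t code = 0;
    int payload = 0;
    if (!(in >> code >> payload)) return nullptr;  // nothing left to read
    switch (code) {
        case kCircle: return std::unique_ptr<Shape>(new Circle(payload));
        case kSquare: return std::unique_ptr<Shape>(new Square(payload));
        default:      return nullptr;              // unknown type code
    }
}

// Round trip through an in-memory stream standing in for disk or network.
void round_trip_demo() {
    std::stringstream wire;
    Circle(3).store(wire);
    Square(4).store(wire);
    std::unique_ptr<Shape> first = retrieve(wire);
    std::unique_ptr<Shape> second = retrieve(wire);
    assert(first && first->name() == "Circle");
    assert(second && second->name() == "Square");
}
```

Note that the switch in retrieve() is exactly the place where both ends
must share one table of codes; how those codes get assigned is the
question the rest of this article is concerned with.
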
Assuming that both the transmitting program and the receiving program
are written in C++, and assuming that we want the receiving program to
perform (or to fake) the re-construction of the originally transmitted
object, we need to have some set of globally unique "type codes" (which
the transmitter and the receiver must agree upon).

We faced exactly this problem when I was working at MCC on the ES-kit
project.  We had a distributed multiprocessor system within which we
wanted to be able to migrate objects and to send "messages" (i.e. member
function calls) from one processing node to an object residing on a
different processing node.

The solution we came up with was not terribly elegant.  We ended up
hacking the compiler (g++) to get it to provide the actual string
representation of the class name to the OS kernel routine which was
responsible for forwarding messages between nodes.  The kernel then
had to look up the class name string in a table of all class-name
strings known to the system in order to get a globally unique 32-bit
integer "code" for that particular class type.  This code was then
shipped across the mesh as a part of the inter-node "packet"
representing the "message" to the remote object.

A much cleaner solution would have been to enlist the compiler & linker
to assign the globally unique "type codes" up front, prior to run-time.
This would have allowed us to avoid the (very expensive) table lookups
which we did within the kernel for each transmitted message.

Ada implementors are familiar with the problem of providing "globally
unique" identifying codes for things.  They face the same problem when
they go to implement Ada exceptions.  Various aspects of Ada effectively
create a requirement for an internal set of globally unique
identification "codes" for all of the Ada exceptions declared throughout
an entire Ada program.

In all Ada implementations I know of, these "globally unique" codes for
declared Ada exceptions are, in effect, generated at link-time by the
linker.  For each Ada exception declared within an entire Ada program,
the Ada compiler generates one small (word-sized) artificial variable.
When it subsequently needs a unique code for the given exception, it
simply uses the address of the associated "dummy" variable as the
globally unique "code" for the given exception.  Fortunately, the
linker sees to it that all of these "dummy" variables get allocated to
different memory locations (during linking) so that the addresses of
these variables (i.e. the exception "codes") are indeed globally unique.

An identical scheme could be used to assign globally unique integer
codes to each class type within an entire (linked) C++ program.  For
each class type declaration compiled, a C++ compiler (or translator)
could generate a "dummy" variable with a particular (specially mangled)
name.  The address of that dummy variable could then be used as a
globally unique identifier for the class type itself.  For example,
given:

	class C {
		//...
	};

	void *vp;

	void example ()
	{
		vp = typeof (class C);
	}

A C++ translator could easily generate:

	struct C {
		//...
	};

	int __C__typeof_dummy;

	void *vp;

	void __exampleV ()
	{
		vp = (void *) &__C__typeof_dummy;
	}

This approach would work for C++ *translators* only so long as they are
connected to underlying C compilers which allocate uninitialized
variables (such as "__C__typeof_dummy") to "common", so that the copies
of the dummy emitted by every translation unit which sees the class
declaration get merged into a single variable at link-time.  ANSI C does
not allow this practice, but virtually all K&R C compilers and many
"ANSI" C compilers still do it anyway.

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// New motto:  If it ain't broke, try using a bigger hammer.