Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!sdd.hp.com!uakari.primate.wisc.edu!aplcen!uunet!microsoft!jimad
From: jimad@microsoft.UUCP (Jim ADCOCK)
Newsgroups: comp.std.c++
Subject: Re: Packing Across Inheritance Boundari
Message-ID: <56843@microsoft.UUCP>
Date: 22 Aug 90 22:19:49 GMT
References: <6540462@1990Aug3.211414.23872@watmath.wa> <13413@>
Reply-To: jimad@microsoft.UUCP (Jim ADCOCK)
Organization: Microsoft Corp., Redmond WA
Lines: 124

In article <13413@> Chuc@ writes:
|
|>>>>> On 31 Jul 90 23:13:11 GMT, jeremy@cs.ua.oz.au (Jeremy Webber) said:
|Jeremy> The ability for a compiler to re-order the elements of
|Jeremy> structures/unions is a useful optimization.  I would argue that any
|Jeremy> code which writes structures to a data file is non-portable.
|
|True.  Even if the ordering is identical, the alignment may vary.  However,
|alignment is not _likely_ to vary between systems using the same CPU,
|although it still _can_ vary due to compiler differences.

Even under C, the validity of this statement varies greatly depending on
the CPU being used.  Some CPUs dictate a particular alignment strategy.
Other CPUs support multiple alignment strategies.  Some C compilers even
support options for multiple alignment strategies.  A very common tradeoff
is to pack to double-byte boundaries in order to have structures with at
worst one-byte holes, versus packing to quad-byte boundaries in order to
have fast bus access on CPUs with 32-bit buses.  C++ has many more issues
relating to compatibility, which will be discussed below.

|Jeremy> If data is to be portable it should *always* be written element by
|Jeremy> element, preferably in ASCII.
|
|If your program isn't already I/O bound, this is an excellent strategy, and
|I highly recommend it as the default strategy.
|
|Jeremy> The programmer should never assume specific structure ordering,
|Jeremy> except in the case of bit field specifications, which can override
|Jeremy> the compiler's defaults.
|
|Even this can fail, unless all the world is a 32 bit int.  Seriously, many
|programs have to bang hardware which may not pack data in the "optimized"
|form.  Perhaps this (and portability between compilers) is partly why order
|of inheritance became defined under C++ 2.0.

I just went over all the layout specifications in E&S, and I believe this
to be a misstatement -- at least as applied to what is called out in E&S.
The only layout requirement *at all* I can find in E&S is that fields
within a labeled access section must be at increasing addresses.  E&S goes
out of its way to specifically call all other layout issues
"implementation dependent."  Sections 10.1c etc. spend a great deal of
time covering a number of ways objects can be laid out.  The order of
initialization of inherited parents is called out.  The order of layout of
sections inherited from parents is not called out.

|I also suspect reordering members between parents in a derived class such
|that their elements are interleaved, would present a run time nightmare for
|the C++ implementor.  Perhaps one of the implementors out there would care
|to comment.

Not an implementer, but -- I don't think people are seriously suggesting
packing across the inheritance hierarchy -- we're talking about packing up
the inheritance hierarchy.  Thus, if a derived class has two non-virtual
parents, it's not the parents that attempt to share space; rather, it's
the derived class that has the possibility to pack some of its members
into holes left in either of the two parents.  The implementation of this
is not difficult.  It just requires a convention that the compiler not
mess with unused "holes" in a structure.  And yes, this restriction could
lead to slower code, leading to the traditional tradeoff between speed and
space.  A compiler design tradeoff.
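
To make the "holes" idea concrete, here is a minimal sketch.  It is not
taken from E&S or from any particular compiler; the names Parent, Derived,
parent_tail_hole, derived_reuses_parent_hole and check_guaranteed_invariants
are all made up for illustration, and it assumes a present-day compiler that
supplies <cstddef> and <cassert>.  It only reports what a given compiler
chose to do.  Either answer is a legitimate implementation choice.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical parent: on many implementations there is an unused
// hole after 'tag', because the struct is padded out to the
// alignment of 'value'.  Nothing requires that hole to exist.
struct Parent {
    double value;
    char   tag;
};

// Hypothetical derived class with one small member.  A compiler that
// packs "up" the hierarchy could place 'extra' in Parent's trailing
// hole; a compiler that does not will append it after Parent.
struct Derived : Parent {
    char extra;
};

// Number of bytes in Parent that lie beyond its last declared member.
std::size_t parent_tail_hole() {
    return sizeof(Parent) - (offsetof(Parent, tag) + sizeof(char));
}

// True if Derived costs no more space than Parent, which suggests
// the compiler put 'extra' into a hole left in the parent.
bool derived_reuses_parent_hole() {
    return sizeof(Derived) == sizeof(Parent);
}

// Properties that hold regardless of which strategy was chosen.
void check_guaranteed_invariants() {
    assert(sizeof(Derived) >= sizeof(Parent));
    assert(parent_tail_hole() < sizeof(Parent));
}
```

Calling derived_reuses_parent_hole() under two different compilers, or
under one compiler with different packing options, may well give two
different answers, which is the point.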

Again, I think that the one remaining packing order restriction should be
further restricted to the situation where one puts one's structure in an
extern "C" statement, thus turning off name mangling, and explicitly
stating one wants to have historical compatibility.  Thus, if you put
extern "C" { } around a .h file, you maintain compatibility with the C
world, with any prior successful mappings someone has found between a "C"
structure and machine registers, and/or with historical "C" libraries.

A few of the reasons I think that C++ compilers are so unlikely to be
compatible as to make packing order compatibility a moot issue:

    Does a given compiler pack to double-byte or quad-byte boundaries?
    Big-endian or little-endian?
    16-bit or 32-bit ints?
    C or Pascal or register calling protocols?
    "this" passed in a register [which register?] or on the stack?
    Vtptrs or tagged pointers, or tagged objects?
    0, 8, 16, 32, 48, or 64 bit Vtptrs?
    Embedded Objects or references?
    Derived contiguous with parent structures, or references?
    Order of layout of access labeled sections?
    Order of layout of parents?
    Vbase implementation?
    Method call via indirection, ptr fixup, double dispatch, hashed
        dispatch, fat tables, etc?
    16, 32, 48, or 64 bit pointers?
    Segmented or flat pointers?
    Name mangled, if so, what encoding?
    "Compatible" with C linkers, or a custom linker required?
    What libraries provided with the compiler?
    etc, etc, etc.

In summary, there are many more ways to implement a good C++ compiler than
there are vendors.  The only way two compilers are going to be at all
compatible is if vendors for a particular CPU get together and agree on a
standard approach for that CPU.  There is no hope of telling a vendor "the
right way" to implement C++ for a given CPU -- that choice depends on the
CPU and the goals of the vendor.  Likewise, it is unlikely that object
layouts are going to match -- because layout choices depend on the goals
of the compiler.
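
A few of the questions above can be answered for any one compiler by
asking it directly.  The following is only a rough sketch under stated
assumptions: the struct Sample and the functions long_offset,
is_little_endian, int_bits, pointer_bits and sanity_check are invented
names, and the headers are those of a present-day compiler.  It covers only
the alignment, byte order, int width and pointer width items; the calling
convention, vtable and name mangling items cannot be probed this simply.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// Hypothetical struct used to reveal the alignment strategy in effect.
struct Sample {
    char c;
    long l;
};

// Offset of 'l': 2 would suggest double-byte packing, 4 quad-byte
// packing, and so on.  The value is whatever this compiler chose.
std::size_t long_offset() {
    return offsetof(Sample, l);
}

// Inspect the lowest-addressed byte of the integer 1 to find out
// whether the least significant byte is stored first.
bool is_little_endian() {
    unsigned int one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Widths of int and of a data pointer, in bits.
std::size_t int_bits()     { return sizeof(int)   * CHAR_BIT; }
std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// The few things that hold no matter what the vendor decided.
void sanity_check() {
    assert(int_bits() >= 16);
    assert(long_offset() >= 1);
    assert(long_offset() + sizeof(long) <= sizeof(Sample));
}
```

Two compilers that disagree on any one of these values will not agree on
object layout, before inheritance even enters the picture.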

Thus, please leave compiler implementation details out of the language
specification.

--

Imagine how one might implement C++ on the Rekursiv architecture, if you
want to get a different perspective on the language.