Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!ucsd!pacbell.com!tandem!zorch!xanthian
From: xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan)
Newsgroups: comp.std.c++
Subject: Controlling structure layout (Re: Randomly ordered fields !?!?)
Message-ID: <1990Sep6.194543.7685@zorch.SF-Bay.ORG>
Date: 6 Sep 90 19:45:43 GMT
References: <1990Aug28.173553@bert.llnl.gov> <1990Sep1.131041.15411@zorch.SF-Bay.ORG> <1990Sep4.163132@bert.llnl.gov>
Sender: xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan)
Organization: SF Bay Public-Access Unix
Lines: 139

howell@bert.llnl.gov (Louis Howell) writes:
>xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

>Now if there actually were a standard---IEEE,
>ANSI, or whatever---then the compilers should certainly support
>it. Recent comments in this newsgroup show, however, that there
>isn't even a general agreement on what a standard should look like.

[see below]

>>[7.0 million lines of code...]

>This is the only one of your arguments that I can really sympathize
>with. I've never worked directly on a project of anywhere near that
>size. As a test, however, I just timed the compilation of my own
>current project. 4500 lines of C++ compiled from source to
>executable in 219 seconds on a Sun 4. Scaling linearly to 7 million
>lines gives 3.41e5 seconds or about 95 hours of serial computer
>time---large, but doable. Adding in the human time required to
>deal with the inevitable bugs and incompatibilities, it becomes
>clear that switching compilers is a major undertaking that should
>not be undertaken more often than once a year or so.

Yeah, especially since code effort scales more like 1.5th power of
LOC. It took 200 programmers + extra for managers 18 months to port
that code across a compiler update, with lots of automated assistance.

>The alternative, though, dealing with a multitude of different
>modules each compiled under slightly different conditions, sounds
>to me like an even greater nightmare.

Misapprehension.
As is typical for a large business, this was hundreds of
independently running programs accessing common, commonly formatted
data. [The data lived in a shared database, not in shared memory,
though shared memory systems this large do exist in practice, for
example in air traffic control.] It still illustrates the problem with
insisting on porting all of a large software suite at once because of
a compiler change, which was your "solution" to the shared memory
situation, and the solution to which I was trying to take exception.

>You can still recompile and test incrementally if you maintain
>separate test suites for each significant module of the code. If
>the only test is to run a single 7 million line program and see if
>it smokes, your project is doomed from the start. (1/2 :-) )

Yep, every independent program had its own extensive set of
regression tests, thus the ~300 man years of porting effort.

>Again, most users don't work in this type of environment. A
>monolithic code should be written in a very stable language to
>minimize revisions.

Here I must disagree. It is exactly in such huge mulligans of
software that the maintenance cost reductions promised by object
encapsulation offer the greatest rewards. In fact, as commented
elsewhere here recently, it is only with the advent of such truly
awesome piles of software that the frustrations of the software
engineer have called out most loudly for the "silver bullet" that OOP
is trying to provide. This _is_ the target for which we should be
specifying our languages, even if present experience with the new
OOPLs is limited to considerably more modest programs as engineers
"get their feet wet" in OOP.

>Hey, I'm a user too! I do numerical analysis and fluid mechanics.
>What I do want is the best tools available for doing my job. If
>stability were a big concern I'd work in Fortran---C++ is considered
>pretty radical around here. I think the present language is a
>big improvement over alternatives, but it still has a way to go.

But what has made FORTRAN so valuable to the (hard) engineering
profession is exactly that the "dusty decks" still run. I doubt that
the originators of FORTRAN envisioned _at_that_time_ that applications
software written with the first compilers would outlast the century,
but so it has proved. With the perspective of history to assist us, we
know that stability makes for a useful language. We should try to make
all the decisions that matter for the long term utility of the objects
written today as early as feasible, not put them off in the interests
of granting "flexibility" to the compiler writer. Unlike the
mid-'50s, today we have a plethora of highly experienced compiler
writers to guide our projects; we can depend on it that many of the
"best" ways, or ways good enough to compete well with the eventual
best, are already in hand.

This doesn't deny the possibility of progress, or even breakthroughs,
nor suggest that either be prevented. Instead, let's install the
mechanisms now that will let today's objects be the "dusty decks" of
the 2010's, while leaving options to bypass those mechanisms where
some other goal (speed, size, orthogonality) is more important to a
piece of code than stability.

>As a compromise, why don't we add to the language the option of
>specifying every detail of structure layout---placement as well
>as ordering. This will satisfy users who need low-level control
>over structures, without forcing every user to painfully plot
>out every structure. Just don't make it the default; most people
>don't need this capability, and instead should be given the best
>machine code the compiler can generate.

And here at last, I think we agree. Let the compiler writers have a
ball.
Just give me a switch, like the optimization switches now common, to
turn it all off and preserve my own explicit control over structure
layout, whether that is a real need for my application or just lets me
go to bed with a warm fuzzy feeling that I need not expect a 2AM "the
system just upchucked your code" call.

[At last answering the first quoted paragraph:]

But, to have control of structure layout as described in this thread
(across time, files, memory space, and comm lines (or some subset --
not worth arguing over)), there needs to be _now_ an agreed standard
for what the specification of the layout of a structure that I write
_means_, bit by bit, byte by byte. As Ron noted, C "allows" arbitrary
amounts of padding between fields in a structure, but "nobody" does
anything but the sensible single or double word alignment padding.

Let's pick one layout now in use (take the obvious hits if there is a
big endian, little endian conflict), and make it the standard (or,
make the standard such that I can force e.g. "two byte alignment",
"four byte alignment" or whatever, so long as I am consistent about it
among the modules accessing the data; sounds like an object candidate
to me! ;-) ), and publish it for all compiler writers to implement as
a choice allowed for the user who needs this level of control. There
has been enough common practice in C structure layout implementations
to observe and adopt some part of it by now.

Again, this is just the standard for what I mean when _I_ take control
of laying out a structure. If I give that control to the compiler
writer, I'd better make no assumptions at all in my code about the
result, because it is explicitly allowed to be "unstandard", and I
have chosen to write at a high level and delegate those details to the
compiler writer's ingenuity.

Peace?

Kent, the man from xanth.
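
P.S. For concreteness, here is a minimal sketch of what "taking
control" of a layout can look like using only facilities found in
later C++ (C++11 or newer), offered as an illustration and not as
proposed standard wording. Everything named here is hypothetical: the
8-byte Record, its field names, and the encode/decode/check_roundtrip
helpers are invented for the example, and the big-endian wire order is
an arbitrary choice. The compile-time checks pin the in-memory
offsets, so a compiler that lays the structure out differently refuses
to build it; the byte-by-byte encoding covers files and comm lines,
where the two ends may disagree about endianness.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types

// Hypothetical record shared by independently compiled programs.
// Fields are ordered largest-first so natural alignment needs no padding.
struct Record {
    std::uint32_t id;     // bytes 0..3
    std::uint16_t kind;   // bytes 4..5
    std::uint8_t  flags;  // byte 6
    std::uint8_t  spare;  // byte 7: explicit filler, not hidden padding
};

// Pin the layout: the build fails if any offset or the size differs.
static_assert(offsetof(Record, id)    == 0, "id must start at byte 0");
static_assert(offsetof(Record, kind)  == 4, "kind must start at byte 4");
static_assert(offsetof(Record, flags) == 6, "flags must start at byte 6");
static_assert(offsetof(Record, spare) == 7, "spare must start at byte 7");
static_assert(sizeof(Record) == 8, "Record must be exactly 8 bytes");

// Write the record in a fixed byte order (big-endian, chosen arbitrarily)
// so the external form does not depend on the host's endianness.
inline void encode(const Record& r, unsigned char out[8]) {
    out[0] = static_cast<unsigned char>(r.id >> 24);
    out[1] = static_cast<unsigned char>(r.id >> 16);
    out[2] = static_cast<unsigned char>(r.id >> 8);
    out[3] = static_cast<unsigned char>(r.id);
    out[4] = static_cast<unsigned char>(r.kind >> 8);
    out[5] = static_cast<unsigned char>(r.kind);
    out[6] = r.flags;
    out[7] = r.spare;
}

// Read the record back from the same fixed byte order.
inline Record decode(const unsigned char in[8]) {
    Record r;
    r.id = (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8)  |
            static_cast<std::uint32_t>(in[3]);
    r.kind  = static_cast<std::uint16_t>((in[4] << 8) | in[5]);
    r.flags = in[6];
    r.spare = in[7];
    return r;
}

// Encoding and then decoding must reproduce every field.
inline void check_roundtrip(const Record& r) {
    unsigned char buf[8];
    encode(r, buf);
    const Record back = decode(buf);
    assert(back.id == r.id && back.kind == r.kind);
    assert(back.flags == r.flags && back.spare == r.spare);
}
```

The point of the explicit "spare" byte is the one argued above: when
_I_ take control, every byte of the structure is accounted for in the
source, and nothing is left to a compiler's choice of padding.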