Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!ucsd!pacbell.com!lll-winken!bert.llnl.gov!howell
From: howell@bert.llnl.gov (Louis Howell)
Newsgroups: comp.std.c++
Subject: Re: Randomly ordered fields !?!? (Was:
Message-ID: <1990Sep4.163132@bert.llnl.gov>
Date: 4 Sep 90 23:31:32 GMT
References: <1990Aug27.152540@bert.llnl.gov> <1990Aug28.211752.24905@zorch.SF-Bay.ORG> <1990Aug28.173553@bert.llnl.gov> <1990Sep1.131041.15411@zorch.SF-Bay.ORG>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: howell@bert.llnl.gov (Louis Howell)
Organization: Lawrence Livermore National Laboratory
Lines: 170

In article <1990Sep1.131041.15411@zorch.SF-Bay.ORG>,
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
|> howell@bert.llnl.gov (Louis Howell) writes:
|> >In short, you want four types of compatibility: "comm links", "time",
|> >"memory space", and "file storage". First off, "time" and "file
|> >storage" look like the same thing to me,
|>
|> Not so. If a program compiled with compiler A stores data in a file,
|> and a program compiled with compiler B can't extract it, that is one
|> type of compatibility problem to solve, and it can be solved with the
|> compilers at hand.
|>
|> But if a program compiled with compiler A revision 1.0 stores data in
|> a file, and a program compiled with compiler A revision 4.0 cannot
|> extract it, that is a compatibility problem to solve of a different
|> type. Mandating no standard for structure layout forces programmers
|> in both these cases to anticipate problems, unpack the data, and store
|> it in some unstructured format. Tough on the programmer who realizes
|> this only when compiler A revision 4.0 can no longer read the structures
|> written to the file with compiler A revision 1.0; it may not be around
|> any more to allow a program to be compiled to read and rewrite that data.

I don't want to reduce this discussion to finger-pointing and
name-calling, but I think this hypothetical programmer deserved what
he got.
I think it's a useful maxim to NEVER write anything important in a
format that you can't read. This doesn't necessarily mean ASCII---there's
nothing wrong with storing signed or unsigned integers, IEEE format
floats, etc., in binary form, since you can always read the data back
out of the file in these formats. If a programmer whines because he
depended on some nebulous "standard structure format" and got burned,
then I say let him whine.

Now if there actually were a standard---IEEE, ANSI, or whatever---then
the compilers should certainly support it. Recent comments in this
newsgroup show, however, that there isn't even a general agreement on
what a standard should look like. Let's let the state of the art
develop to that point before we start mandating standards.

|> [...]
|> >As for "memory space",
|> >I think it reasonable that every processor in a MIMD machine, whether
|> >shared memory or distributed memory, should use the same compiler.
|>
|> That isn't good enough. I've worked in shops with several million lines
|> of code (about 7.0) in executing software. By mandating _no_ standards
|> for structure layout, you force that _all_ of this code be recompiled with
|> every new release of the compiler, if the paradigm of data sharing is a
|> shared memory environment. Again, by refusing to make one choice, you
|> force several other choices in ways perhaps unacceptable to the compiler
|> user. In this situation, that might well involve several man-years of
|> effort, and it is sure to invoke every bug in the new release of the
|> compiler simultaneously, and would very likely bring operations to a
|> standstill. With no data structure layout standard, you have removed the
|> user's choice to recompile and test incrementally, or else forced him to
|> pack and unpack data even to share it in memory.

This is the only one of your arguments that I can really sympathize
with. I've never worked directly on a project of anywhere near that size.
As a test, however, I just timed the compilation of my own current
project. 4500 lines of C++ compiled from source to executable in 219
seconds on a Sun 4. Scaling linearly to 7 million lines gives 3.41e5
seconds or about 95 hours of serial computer time---large, but doable.
Adding in the human time required to deal with the inevitable bugs and
incompatibilities, it becomes clear that switching compilers is a major
undertaking that should not be undertaken more often than once a year
or so. The alternative, though, dealing with a multitude of different
modules each compiled under slightly different conditions, sounds to me
like an even greater nightmare. Imagine a code that only works when
module A is compiled with version 1.0, module B only works under 2.3,
and so on. Much better to switch compilers very seldom.

If you MUST work that way, though, note that you would not expect the
ordering methods to change with every incremental release. Changes
like that would constitute a major compiler revision, and would happen
only rarely. You can still recompile and test incrementally if you
maintain separate test suites for each significant module of the code.
If the only test is to run a single 7 million line program and see if
it smokes, your project is doomed from the start. (1/2 :-) )

Again, most users don't work in this type of environment. A monolithic
code should be written in a very stable language to minimize revisions.
(Fortran 66 comes to mind. :-) The price is not using the most up to
date tools. C++ just isn't old enough yet to be very stable. If I
suggested changing the meaning of a Fortran format statement, I'd be
hung from the nearest tree, and I'd deserve it, too.

|> [...]
|> >Finally, the issue of communication over comm links strikes me as
|> >very similar to that of file storage. If compatibility is essential,
|> >design the protocol yourself; don't expect the compiler to do it for
|> >you.
|> >Pack exactly the bits you want to send into a string of bytes,
|> >and send that. You wouldn't expect to send structures from a Mac
|> >to a Cray and have them mean anything, so why expect to be able to
|> >send structures from an ATT-compiled program to a GNU-compiled
|> >program? If you want low-level compatibility, write low-level code
|> >to provide it, but don't handicap the compiler writers.
|>
|> Same comments apply. In a widespread worldwide network of communicating
|> hardware, lack of a standard removes the option to send structures intact.
|> One choice (let compiler writers have free reign for their ingenuity in
|> packing structures for size/speed) removes another choice (let programmers
|> have free reign for their ingenuity in accomplishing speedy and effective
|> communications). Somebody loses in each case, and I see the losses on
|> the user side to far outweigh in cost and importance the losses on the
|> compiler vendor side.

I think Stephen Spackman's suggestion of standardizing the stream
protocol, but not the internal storage management, is the proper way
to go here.

|> Then again, I write application code, not compilers, which could
|> conceivably taint my ability to make an unbiased call in this case. ;-)

Hey, I'm a user too! I do numerical analysis and fluid mechanics.
What I do want is the best tools available for doing my job. If
stability were a big concern I'd work in Fortran---C++ is considered
pretty radical around here. I think the present language is a big
improvement over alternatives, but it still has a way to go. If we
clamp down on the INTERNAL details of the compiler now, we just shut
the door on possible future improvements, and the action will move on
to the next language (D, C+=2, or whatever). C++ just isn't old
enough yet for us to put it out to pasture.

As a compromise, why don't we add to the language the option of
specifying every detail of structure layout---placement as well as
ordering.
This will satisfy users who need low-level control over structures,
without forcing every user to painfully plot out every structure.
Just don't make it the default; most people don't need this capability,
and instead should be given the best machine code the compiler can
generate.

Louis Howell

#include
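
A minimal sketch of the "pack exactly the bits you want to send into a
string of bytes" advice discussed above, for files or comm links. Nothing
here comes from the thread itself: the record `Sample`, its fields, the
big-endian byte order, and the helper names `put_be`, `get_be`, `pack`,
and `unpack` are all hypothetical choices made for illustration, and the
code assumes `double` is a 64-bit IEEE float. The point is only that the
byte layout is fixed by the program, not by how a given compiler orders
or pads the structure.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record; the compiler may order and pad it however it likes.
struct Sample {
    std::uint32_t id;
    std::int16_t  level;
    double        value;   // assumed to be a 64-bit IEEE float
};

// Size of the external form: 4 + 2 + 8 bytes, no padding.
constexpr std::size_t kWireSize = 14;

// Write the low n bytes of v, most significant byte first.
inline void put_be(std::uint64_t v, std::size_t n, unsigned char* out) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<unsigned char>(v >> (8 * (n - 1 - i)));
}

// Read n bytes, most significant byte first.
inline std::uint64_t get_be(const unsigned char* in, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v = (v << 8) | in[i];
    return v;
}

// Field by field into a byte string whose layout we chose ourselves.
std::array<unsigned char, kWireSize> pack(const Sample& s) {
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    std::array<unsigned char, kWireSize> buf{};
    std::uint64_t bits = 0;
    std::memcpy(&bits, &s.value, sizeof bits);   // raw bit pattern of the float
    put_be(s.id, 4, buf.data());
    put_be(static_cast<std::uint16_t>(s.level), 2, buf.data() + 4);
    put_be(bits, 8, buf.data() + 6);
    return buf;
}

// The inverse: rebuild the record from the byte string.
Sample unpack(const std::array<unsigned char, kWireSize>& buf) {
    Sample s{};
    s.id = static_cast<std::uint32_t>(get_be(buf.data(), 4));
    const std::uint16_t u = static_cast<std::uint16_t>(get_be(buf.data() + 4, 2));
    std::memcpy(&s.level, &u, sizeof u);         // reinterpret as signed 16-bit
    const std::uint64_t bits = get_be(buf.data() + 6, 8);
    std::memcpy(&s.value, &bits, sizeof bits);
    return s;
}

// Usage: a record survives the trip through its byte form.
void self_check() {
    const Sample s{0x01020304u, -2, 1.5};
    const Sample r = unpack(pack(s));
    assert(r.id == s.id && r.level == s.level && r.value == s.value);
}
```

Because every field is written at an offset the program picks, the same
14 bytes come out whichever way a compiler lays out `Sample` in memory,
which is the property the file-storage and comm-link arguments turn on.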