Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!haven!adm!cmcl2!kramden.acf.nyu.edu!brnstnd From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: comp.lang.misc Subject: Re: Whether cc after f2c can optimize arrays as well as f77 can Message-ID: <5853:Nov720:22:4890@kramden.acf.nyu.edu> Date: 7 Nov 90 20:22:48 GMT References: <2779:Nov608:20:1990@kramden.acf.nyu.edu> <5085@lanl.gov> Organization: IR Lines: 191 In article <5085@lanl.gov> jlg@lanl.gov (Jim Giles) writes: > It would actually never occur to me to post something out of a private > email conversation onto the net without the other person's permission. That is an amusing accusation: I posted something *I* wrote, with no more than paraphrases from you. But let's get to technical details. Here is a very precise description of how a C compiler can detect the sort of non-aliasing that a Fortranish program has. The compiler sees a prototype foo(a1;a2;...;an;v1;v2;...;vn;...) where a1 through an are all the same pointer type, v1 through vn are all void pointers or any other pointer parameters that may by the language definition be aliased with the a's. It also sees global arguments of the same or void pointer type, g1 through gn. As you may verify, the compiler can make a complete list of all such variables, provided that there is a prototype in scope. For simplicity let's rename all these variables x1 through xn. Whenever the compiler sees a function, it makes a list of possibly aliased variables as above. It makes one such list for each pointer type, but we (and it) can consider just one at a time. Via some fixed mechanism that does not depend on any further information, it selects a set of partitions of the x's. For instance, it might choose the following partitions out of four variables, x1 through x4: { {x1}, {x2}, {x3}, {x4} } { {x1,x2,x3,x4} } { {x1,x2}, {x3,x4} } { {x1,x3}, {x2,x4} } { {x1,x4}, {x2,x3} } It must include the partition that has every element in a single set. (The ``Fortran Case'' is where it includes the every-element-in-one-set partition and the every-element-in-separate-set partition, and no others. I claim that this is sufficient to detect the nonaliasing in a program that has been converted from Fortran.) When the compiler is compiling the function: It produces one version for each partition. In the version for a particular partition, it may assume that all pointer references through variables separated by the partition are unaliased. In the simplest case, where all variables are in the same set, this gains it absolutely nothing. In the Fortran Case, when it is compiling the second version, it can assume that all these variables are unaliased, just as in Fortran. (The compiler might name the results sort of the way C++ does.) When the compiler is compiling a call to the function: It now uses the partitions as requirements, with the same meaning for aliasing as above. If it thinks all the x's might be aliased, it has to generate code that calls the simplest version. The crucial point is that in a correct translation of a correct Fortran program, it knows that none of the arguments are aliased, so it can choose code that calls the all-elements-separated version. Jim, please say *something* that shows you understand the above description. Try the fully worked-out example below if you want a concrete case to stare at. > From article <2779:Nov608:20:1990@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > > In article <4869@lanl.gov> jlg@lanl.gov (Jim Giles) writes: > > [...] > > : [ file1: int a; int b; main() { ... sub(&a); ... } ] > > : [ file2: extern int a; sub(x) int *x; { ... } ] > > : [ file3: extern int b; sub(x) int *x; { ... } ] > These three separate file descriptions I sent (in a somewhat more > legible form) as an example of the fact that _some_ decision _must_ > be made at load time with respect to aliasing. This is not correct. The compiler makes all the decisions at compile time, as described above. Now for the seventh or eighth time, here's how it applies to the three-file case above: There are two global integer variables, a and b, with which *x might be aliased. When the compiler sees the prototype for file2's sub(), it chooses (for example) the following partitions: { {&a,&b,x} } Note that it knows &a and &b aren't aliased, but there's neither a need nor a way to express that fact in this case. { {&b} {&a,x} } { {&a} {&b,x} } { {&a} {&b} {x} } When it compiles file2, it produces four versions, one for each of the above aliasing assumptions. As it turns out, file2 doesn't have b in scope, so two of the versions will never be used no matter what else goes on, but this is a side issue. When it compiles main(), it says ``Aha! x and &a are aliased here. So I have to use a partition containing {&a,x}, but I don't need one with {&b,x}.'' So it calls the version compiled for the partition { {&b} {&a,x} }, which as I've explained assumes &a and x aliased. In other words, IT CORRECTLY FIGURES OUT AT COMPILE TIME TO USE THE SLOW CODE IN FILE2. In contrast, if it had used the file3 version, it would generate a call to the same partition. But file3 can then see that &b and x are not aliased. In other words, IT CORRECTLY FIGURES OUT AT COMPILE TIME TO USE THE FAST CODE IN FILE3. I am getting really, really, really sick of trying to explain this to you; I hope my excessively detailed and formal approach here does the job. I will give up if you persist in ignoring the truth. > version of 'sub' in file2, then the argument and the global would be > aliased in that context - either slow code or unsafe optimizations > must be accepted. If the version of 'sub' from file3 is loaded, > then this problem doesn't arise and we would prefer the optimized Yes, that is EXACTLY what happens. > This was his proposed solution which he claims solves the problem > without any loader intervention (indeed, without load-time action > of any kind). This scheme requires NO changes to the loader. It will make temporary .o files larger, I admit---a justifiable loss for being able to detect aliasing at compile time without bothering the programmer. Certainly you admit that libraries should have fast and slow versions when aliasing can make a large difference. > Yes, I think you must. At this point, you have 6 object modules. Incorrect. I have three. One for each file, as always. > But, wait a minute!! How did I decide that? I did it by looking the > the relevant code in the caller and the callee (an interprocedural > analysis - I always said there'd be one of those). No, this requires no interprocedural analysis. The ultimate proof of this is that my scheme *can* be used for libraries. (Of course, the compiler can save a lot of code generation if it *does* use interprocedural analysis to figure out which partitions it needs. Perhaps the right partitions could be dynamically added to the system library when user code asks for them. Who knows. The full method does not generate sufficient overhead for this to be a big problem.) > Further, the loader itself could have made the > decision if the compiler passed the right information along (which > is what I always recommended). The compiler *has* passed the right information along. In the particular, all-important-to-you case of Fortran array manipulations, the compiler can detect aliasing *without any help from the programmer*, because all arrays are passed as simple arguments and it's trivial to detect what data flows where. > HOW DOES THE USER _KNOW_ what to assert > with all these declaraions? It requires an interprocedural analysis. Wrong, and wrong. The user does not do ANYTHING, and there is NO interprocedural analysis required. You are becoming a bit repetitive. > > : The most flexible (and arguably the best) solution is to let the > > : programmer assert the lack of aliasing between any variables at any > > : time. Q provides sufficiently general mechanisms for this. > As long as the _LOADER_ checks the consistency of these assertions > across procedure calls, this is indeed adequate. The compiler *must* do this check, again because libraries may not be loaded with programs for a while. You're arguing in a separate thread that the loader should also do this check, but I still don't see your justification. In any case, we both agree that it's a benefit to allow certain classes of assertions to be part of the function prototype. > Feedback from such > _LOADER_ tests would be sufficient to tell the user the interprocedural > information needed (though, through trial and error). Trial and error? What a ridiculous idea. Why do you want to burden the user with any of this? > By the way, as I've pointed out before, the _LACK_ of aliasing > should be the default - the presence of possible aliasing should > require the assertion. The user *NEVER* has to make an assertion to the compiler in order to handle this case. > It makes sense to make the most commonly > desired configuration be the default. You always ask for the > reverse of that. Sir, you appear to be lying. But I've promised myself not to drag you over the coals for your repeated misquotes, misparaphrases, misleading, and apparent outright lies. ---Dan