Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!att!andante!alice!wilber From: wilber@alice.att.com (rew) Newsgroups: comp.lang.c Subject: Re: low level optimization Message-ID: <20270@alice.att.com> Date: 26 Apr 91 04:05:50 GMT References: <21703@lanl.gov> <1991Apr18.124836.5686@watmath.waterloo.edu> <21818@lanl.gov> <3905@inews.intel.com> Reply-To: wilber@homxb.att.com Organization: AT&T, Bell Labs Lines: 88 J. Giles writes: >> [...] but why is it so horrible to you to document >> to the users of your optimized libraries that certain >> arguments to certain functions must be carefully >> non-aliased? [...] > >It's not horrible. You miss my point if you believe that I would >oppose that. You give me a portable way to declare to the compiler >that two (or more) arguments are assumed not to be aliased and I will >rejoice. > >The problem is that what people are recommending is _not_ portable >and is _not_ an explicit declaration (so the end user has to guess >which variables the compiler optimized in this way and which are >safe to alias). I like C. I don't like Fortran. I find the likelihood that people will still be coding in Fortran in the year 2020 to be depressing. Having said that, I must say that in my experience the criticism that J. Giles makes of C, that problems with potential aliasing make it *very* difficult for optimizing compilers to do their job, is a legitimate one. I am working on a large program that does schedule optimization. Speed is critical. Parts of it are written in C, and most is written in C++, which is translated into C by cfront. (BTW, cfront exacerbates the problems with aliasing by introducing large numbers of spurious pointers.) What I find is that Fortran code regularly runs circles around carefully coded C and C++. The problems intensify on RISC machines, which are becoming all the rage, because of their dependence on keeping data in registers for speed. On an IBM RS6000 a little test program I wrote in C runs about 6 times slower when I use pointers to scalars than when I use local scalars directly. One might expect that a factor of 2 could be due to the need to dereference the pointers, the rest is because most of the optimizations have to be turned off as soon as pointers are used. Unfortunately, once you get past the realm of toy test programs storing all your data in local variables is a non-option -- most of the data is lurking in giant arrays and malloced structures, all accessed through pointers. I think that the pro-C camp has been too willing to sweep under the rug the difficulties of making a C compiler deal with aliasing. Sure, global optimization across all 200 .c modules would solve many of the problems, but how many C compilers actually do that? Can you name 3 machines I can buy right now that have such a beast? And when you try to do that sort of thing, you quickly bump up against the limits of your machine. (Can you say, "Virtual memory exhausted"? Can you say, "Two day compilation"?) Cross procedural optimization within a module doesn't buy you as much as you might hope; simply knowing that none of the calls to foo() within foo.c alias the parameters doesn't help if foo() is declared external (the default) because some call to foo() in another module might alias the parameters. And I'd say that less than 10% of all functions in typical C code are declared static. (Even when functions aren't called outside of their modules, programmers rarely bother to declare them static.) So what *can* be done? As has already been pointed out, it is not unreasonable for a compiler to assume that non-char, non-void pointers of different types are not aliased. An option like -this_programmer_should_be_fired should of course be available to turn this assumption off, for things like compiling code taken from the net. I would go further and allow an option called, say, -strict_typedefs that would tell the compiler that in a case like this: typedef int Apple; typedef int Orange; that an int can be an Apple or an Orange (or neither) but an Apple can't be an Orange and an Orange can't be an Apple. Thus in typedef long Cents; typedef long Quantity; Cents calculate_costs(Cents c[], Quantity q[]); the compiler knows that c and q aren't aliased. This would of course have to be an option because there is nothing in the definition of the language that requires that typedefs should behave this way. But what can be done about cases like this? foo(char* p, char* q, char* s); One can of course come up with various schemes involving macros or comments that inform your favorite compiler that the parameters aren't aliased, but the problem is that there are no standards for how this is to be done, so any such information is likely to be understood *only* by your favorite compiler, and no other. Right now, in the Real World (TM), optimized Fortran really does run faster than optimized C. A situation that truly sucks. Bob Wilber wilber@homxb.att.com