Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2 9/18/84; site utcsri.UUCP Path: utzoo!utcsri!greg From: greg@utcsri.UUCP (Gregory Smith) Newsgroups: comp.lang.c Subject: Re: Legal re-arrangement rules. Why they seem inconsistent. Message-ID: <4352@utcsri.UUCP> Date: Fri, 13-Mar-87 12:40:08 EST Article-I.D.: utcsri.4352 Posted: Fri Mar 13 12:40:08 1987 Date-Received: Sat, 14-Mar-87 01:22:23 EST References: <844@wanginst.EDU> <4211@utcsri.UUCP> <609@viper.UUCP> <4315@utcsri.UUCP> <681@jenny.cl.cam.ac.uk> Reply-To: greg@utcsri.UUCP (Gregory Smith) Organization: CSRI, University of Toronto Lines: 139 Keywords: associative commutative operations. Summary: "Author's Response" In article <681@jenny.cl.cam.ac.uk> am@cl.cam.ac.uk (Alan Mycroft) writes: >Summary: The re-assocation rules are at best arbitrary, often not identities >and do not allow rearrangement that many users expect to be allowed. > >In article <4315@utcsri.UUCP> you [Greg] write: >>I want my compiler to be free to change (i+3)*4+6 >>to i*4+18 (more on this later). If I don't want that, there are means >>of avoiding it. >I'm afraid neither ANSI nor K&R sanction this particular transformation. >They allow re-arrangement of associative operators. Nothing is said about >your wish that mathematical 'distributive' laws are applicable. >In particular, were (e.g. i = 0x8000000) i*4 to overflow but (i+3)*4 not >to do and the compiler to signal on signed overflow as the standard permits >then this transformation is INVALID and a compiler which did it not >conforming. Of course, if signed overflow is ignored on a two's complement >machine then this transformation IS again valid, but by appeal to the 'as if' >principle - not the re-association rules. Fine - I don't have an ANSI draft so I was assuming. The V7 PDP-11 compiler did do this sort of thing ( at least, it could transform i*24+j*4 into (i*6+j)*4 which is often better ). I know the 4.2BSD vax compiler doesn't do this, and it generates some very clumsy multi-dim array code as a result. The 'as if' principle is important too - for example, if you have char ch1,ch2,ch3; ch1 = ch2+ch3; then the rules say that ch2 and ch3 are extended to ints before the addition. A compiler may note that only the lower 8 bits of the result are used, and therefore demote the + to a char add and subsequently discard the extensions. You won't find many compilers that do this sort of thing. It can make a very big difference on an 8-bit machine, but is a lot of trouble. > >>It allows constants to be folded to the end (a+2)+(b+9)+4 = (a+b)+15. >1.I disagree in principle that ANSI and K&R ought to allow this transformation >if a & b are floating. Consider a=1e30, b=-1e30. Then the transformation is Agreed. I never meant to imply that any of this should be done for floating point operations. Your example values are interesting though - the transformation is invalid because it makes the result more accurate! >2.Anyway, even if we concede that the spec. will/does allow this, we find that >it can re-arrange ch+(-'a')+'A' but not the more natural ch-'a'+'A'. >(Nothing is said about re-arranging thing with '-' in -- it is not assocative >or commutative). A compiler may change ex-k1 to ex+k2 with a suitable change in the constant k. Consider the PDP11 which has no 'and' operation, but has a 'bit clear' operation (~s)&d. If I write i & = 077, I had better get a single bit-clear operation with the constant ~077. This is the same type of transformation. Again, I don't know whether this is alowed by ANSI but I can't see why it shouldn't be, since it can have no effect on the result (other than the indirect effect of allowing ch-'a'+'A' to be rearranged, which should have no effect on the result other than the confounded overflow). >>Multiple constants in running sums tend to pop up (1) from macro >>expansions (2) in expressions like (p+3)->foo.y[i-1].abc ( after the >>semantics have been applied ). >But there is only one '+' in here so the compiler can't re-arrange it. >(Ansi is unclear whether the '+' introduced by e1[e2] can be re-arranged - >probably not since + on pointers is NOT assocative for type reasons). >Certainly the + introduced by component selection is not re-arrangeable. But these optimizations are normally done after the expression has been converted to a pure calculation, when the compiler has 'forgotten' where things came from. In the example I give, if p is a pointer and ->foo.y is an array, the value of the expression is calculated as *(p + 3 * s1 + k1 + k2 + (i-1) * s2 + k3) ...where s1 is the sizeof(*p), s2 is the size of ->foo.y[0], and k1,k2,k3 are the offsets of foo, y and abc. There is an implied left-to-right associativity of the +'s in the above: (p+3)->foo is at address ((p + 3*s1) + k1), etc. This should be reducible to *(p + i*s2 + kk ) for suitable kk. If you want to leglisate against such an optimization, you better have a darn good reason. It seems apologetic for me to say "Well, compilers forget where things came from". I have shown (I hope) that compilers should be allowed to make changes based on distributivity properties, at least within 'hidden' operations like the last example. Perhaps this should not be allowed for explicit calculations. This would require some kind of internal structure to remember where things came from. Even that is unclear: in the above example, I have explicitly subracted '1' from i, and the net effect in the final form was to reduce kk by the amount s2. Do you want the compiler to forced to subtract 1 from i: (p + (i-1)*s2 + kk') because the subtraction was explicit? I don't think so. Even if you did, that is what the unary + operator is for. Why not reduce the whole expression to a homogenous tree and optimize it 'wholesale' after the semantics have been applied? >>And once you convert arbitrary additions into rearrangable running sums, >>it becomes very attractive to convert things like (i+7)*4 + 2 into >>i*4 + 28 + 2 into i*4 + 30. Again, this comes up a *lot* with array >>operations, and again, if it breaks a program, that program was probably >>doomed anyway. >Again, the spec. is unclear about whether multiple array subscripts >can be folded/re-arranged. They typically involve + and * and as I said >above such re-arrangement is forbidden. > >Now, what you have written is probably current in many compilers (and >probably valid provided they are two's complement and never signal overflow). >But the points I am trying to make are that ANSI/K&R do no allow them and >that the re-arrangement rules seem inherently ill-thought out. > >Alan Mycroft I can't say that K&R forbids this. A strict reading of section 7 in the appendix tells me that they sort of maybe do forbid it. But the version 7 compiler, written by K&R, does these transformations. We can't really read that much into K&R these days, with ANSI around. The V7 and PCC compilers together form a better language definition than K&R when it comes to picking nits. A language defined by its compilers. My attitude is a defensive-programming one: I probably believe these rules to be rather more fuzzy and lenient than they really are. As a result, I will hopefully (ha!:-) never write an expression which will be rearranged in a a harmful way by any half-reasonable compiler, whether strictly conforming or not. At the other extreme, if you believe that there is a correct, unique and obvious way to evaluate any C expression, you are going to be debugging while I'm skiing. I will concede that such an approach shouldn't really be necessary, but I would rather see a compiler generate great code and have lenient rearrangement rules, than generate mediocre code and cater to sloppiness. Can we have both? In C, with rampant aliasing and side-effects, it seems unlikely. If a calculation must be done a certain way, and you as the programmer do not recognize that fact, you have made a mistake in my book. Whether you get away with it or not is another question. On the other hand, if you do recognize it, it is a simple matter to enforce it (and simultaneously flag the fact for those who maintain after you). -- ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg Have vAX, will hack...