Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!cs.utexas.edu!asuvax!stjhmc!f14.n15.z1.fidonet.org!Dave.Harris
From: Dave.Harris@f14.n15.z1.fidonet.org (Dave Harris)
Newsgroups: comp.lang.c
Subject: Can analysis detect undefined expressions?
Message-ID: <14206.285B7688@stjhmc.fidonet.org>
Date: 16 Jun 91 14:18:25 GMT
Sender: ufgate@stjhmc.fidonet.org (newsout1.26)
Organization: FidoNet node 1:15/14 - Nibbles 'n Bits, Orem UT
Lines: 68

>From: ckp@grebyn.com (Checkpoint Technologies)
>
>The subject line is not very good.  Let me explain what I want.
>
>Some expressions in C are undefined, because of expression reordering and
>side effects.  A recent (ongoing?) thread discussed the possible different
>interpretations of if ((i=1) == (i=2)), and there are other examples like
>i = i++ + i++, and so on.
>
>I'd like to know if there has been any attempt to diagnose such undefined
>expressions.  It seems like an exceedingly difficult thing to do,
>especially considering aliasing and possible side effects of functions.

To say the least.  It can PROBABLY be done on local variables using
operators the compiler understands (i.e. the primitives).  It breaks down
as soon as you throw subroutines in: when compiling the subroutine, the
compiler does not know all possible sources of its parameters, and at the
expression containing the call it does not know what the subroutine itself
does to the variables (assuming you pass them by address).

For the easy case, it should not be unthinkably hard to see if the same
address is being modified in more than one sub-expression in a group of
sub-expressions whose order of evaluation is left unspecified.  So in C we
have what... the comparison operators (== <= >= !=), the math operators
(+ - * / % | & etc.) and the comma that separates function arguments (not
the comma operator proper, which is a sequence point).  There might be
more, I don't recall.

An extended example so that I can think clearly here:

    (j = ((i=1) == (i=2))) == (j = ((i=3) == (i=4)))

Presumably, i can end up as 1, 2, 3, or 4.  j should be 0.
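A minimal sketch of that easy case, assuming a toy expression tree.  The
node type, the node kinds and every name below are made up for
illustration and do not come from any real compiler.  It tracks named
variables only, not addresses, so it says nothing about aliasing or
function calls, and it ignores the case where one side merely reads what
the other side modifies:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy expression tree; none of these names come from a real compiler. */
enum kind { LEAF, ASSIGN, UNSEQ_BINOP };  /* UNSEQ_BINOP: ==, +, *, ... */

struct node {
    enum kind kind;
    int var;                 /* ASSIGN only: index of the target variable, 0..15 */
    const struct node *lhs;  /* UNSEQ_BINOP only: left operand */
    const struct node *rhs;  /* ASSIGN: assigned value; UNSEQ_BINOP: right operand */
};

/* Return a bitmask of the variables modified anywhere in the subtree n.
   Set *conflict when one variable is modified twice with no ordering
   between the two modifications. */
unsigned modified(const struct node *n, int *conflict)
{
    unsigned l, r, bit;

    assert(n != NULL);
    switch (n->kind) {
    case ASSIGN:
        bit = 1u << n->var;
        r = modified(n->rhs, conflict);
        if (r & bit)             /* e.g. i = (i = 1) */
            *conflict = 1;
        return r | bit;
    case UNSEQ_BINOP:
        l = modified(n->lhs, conflict);
        r = modified(n->rhs, conflict);
        if (l & r)               /* both sides modify the same variable */
            *conflict = 1;
        return l | r;
    default:                     /* LEAF: a constant or a plain variable read */
        return 0u;
    }
}

/* Convenience wrapper: 1 if the tree contains a conflict, else 0. */
int has_conflict(const struct node *root)
{
    int conflict = 0;

    (void)modified(root, &conflict);
    return conflict;
}

enum { VAR_I = 0, VAR_J = 1 };

/* Constant values do not matter to the check, so one leaf is shared. */
const struct node leaf   = { .kind = LEAF };
const struct node set_i  = { .kind = ASSIGN, .var = VAR_I, .rhs = &leaf };
const struct node set_j  = { .kind = ASSIGN, .var = VAR_J, .rhs = &leaf };
/* (i=1) == (i=2) */
const struct node bad    = { .kind = UNSEQ_BINOP, .lhs = &set_i, .rhs = &set_i };
/* (i=1) == (j=2) */
const struct node ok     = { .kind = UNSEQ_BINOP, .lhs = &set_i, .rhs = &set_j };
/* j = ((i=1) == (i=2)) */
const struct node half   = { .kind = ASSIGN, .var = VAR_J, .rhs = &bad };
/* (j = ((i=1) == (i=2))) == (j = ((i=3) == (i=4))) */
const struct node whole  = { .kind = UNSEQ_BINOP, .lhs = &half, .rhs = &half };
/* i = (i = 1) */
const struct node nested = { .kind = ASSIGN, .var = VAR_I, .rhs = &set_i };
```

With these trees, has_conflict(&whole) and has_conflict(&bad) return 1,
while has_conflict(&ok) returns 0.  A real checker would also have to
treat &&, ||, ?: and the comma operator as ordered, which this sketch
does not model at all.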
The grouping is such that the order i=4, i=2, i=3, i=1 can't happen
without breaking any laws, right?

So the idea here is to scan for the problem operators such as ==,
determine the subexpressions on both sides, then determine whether the
same address (not variable) is modified in both.  Easier said than done,
and something you'd rather not do at compile time, but it should be
doable.

Once again, this would only be good for the simpler cases.  I would hate
to estimate what percentage of programmer mistakes fall under this type
of example; it's been a long time since I've made one this blatant.

On the whole, I would say it is NOT possible to contend with every case.
For example:

    int func(int *x, int *y, int t)
    {
        int *z;

        if (t)
            z = x;      /* z aliases x */
        else
            z = y;
        /* undefined whenever z and x point to the same int */
        t = (*x)++ == (*z)++;
        return t;
    }

This one is much harder to contend with.  Whether it is defined depends
on the value of t being passed in.  And it only gets worse...

Dave Harris.

--
Uucp: ...{gatech,ames,rutgers}!ncar!asuvax!stjhmc!15!14!Dave.Harris
Internet: Dave.Harris@f14.n15.z1.fidonet.org