Path: utzoo!mnetor!uunet!husc6!uwvax!oddjob!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: Bad optimization in PCC?
Message-ID: <11206@mimsy.UUCP>
Date: 25 Apr 88 12:14:36 GMT
References: <793@amnesix.liu.se>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 87
Summary: not ambitious enough

In article <793@amnesix.liu.se> mikpe@amnesix.liu.se (Mikael Pettersson)
writes:
>While hacking a piece of code the other day, I came across what
>looks like a case of bad optimization in PCC-derivated compilers.

[Note that Mikael means `poor', not `incorrect', optimisation]

[Sun version, trimmed:]
>L18:    tstb    a3@
>   (+)  jeq     L19
         ...
>L19:    tstb    a3@
>   (*)  jne     L20
>        movl    a6@(12),a5@
>L20:
>
>It seems to me that a *good* peep-hole optimizer should be able to
>recognize that if the jump at line (+) is taken then the jump at line
>(*) will not be taken.

As long as a3@ is not `volatile', or local equivalent.

>So why didn't they generate something like:
>L18:    tstb    a3@
>   (+)  jeq     L19
         ...
>L19:    movl    a6@(12),a5@
>L20:

The peephole optimisers we have are simply not that ambitious.  The
Vax c2 comes close; it carries ccodes around (sort of) and notices
branches to tests whose result is already known (see my recent bug
fix for branches to tests whose result is discarded).  But it is a
bit scared of tests on memory locations, lest it break code that
depends on variables being volatile:

        register char *cp;
        ...
        while (*cp == 0) /*void*/;      /* wait for signal */

compiles to (Vax)

        L17:    tstb    (r11)
                jeql    Ln

The result is `obvious', so c2 `ought' to change this to

        L17:    tstb    (r11)
        L2000000:jeql   L2000000

which is *really* an infinite loop.  Instead it just leaves memory
references alone.  I do not know why GCC, which *does* have volatile,
does not catch that one.
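
For what it is worth, the busy-wait above can be written so that an
optimiser is not entitled to treat the test as `obvious'.  What follows
is only a sketch under assumptions that are not in the original code: a
compiler that implements `volatile', a POSIX system providing SIGALRM
and alarm(), and the made-up names `flag', `on_alarm' and
`wait_for_signal'.

```c
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical flag written by the signal handler.  The volatile
 * qualifier obliges the compiler to reload it on every test, so the
 * loop below cannot legally be collapsed into a branch-to-self.
 */
static volatile sig_atomic_t flag = 0;

/* Hypothetical handler: record that the signal arrived. */
static void on_alarm(int sig)
{
	(void)sig;
	flag = 1;
}

/*
 * Arm a one-second alarm, then spin until the handler sets the flag.
 * Returns 1 once the signal has been seen, -1 if the handler could
 * not be installed.
 */
int wait_for_signal(void)
{
	flag = 0;
	if (signal(SIGALRM, on_alarm) == SIG_ERR)
		return -1;
	alarm(1);
	while (flag == 0)
		/* void */;	/* wait for signal */
	return (int)flag;
}
```

Without the qualifier, a compiler would be within its rights to load
the flag once and spin forever, which is the transformation c2 is
careful to avoid.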

Incidentally, if you rewrite the original code

>       while(*np && (*np++ == *ep++))
>               /* empty */ ;
>       if(!*np)
>               *envp = val;

as

        do {
                if (*np == 0) {
                        *envp = val;
                        break;
                }
        } while (*np++ == *ep++);

it compiles to (again, Vax)

        L18:    tstb    (r11)
                jneq    L17
                movl    -4(fp),(r9)
                jbr     L16
        L17:    cmpb    (r11)+,(r10)+
                jeql    L18
        L16:

(although why you would care that much about a microsecond in `setenv'
I am not sure...).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu     Path:   uunet!mimsy!chris