Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!usc!cs.utexas.edu!tut.cis.ohio-state.edu!comp.vuw.ac.nz!jonathan From: jonathan@comp.vuw.ac.nz Newsgroups: gnu.gcc.bug Subject: (LONG) Complaint of 1.35 code quality on m68k; gcc 1.35.95 even worse. Message-ID: <8908030401.AA27128@comp.vuw.ac.nz> Date: 3 Aug 89 04:01:23 GMT Sender: daemon@tut.cis.ohio-state.edu Distribution: gnu Organization: GNUs Not Usenet Lines: 619 Disclaimer:: The problem reported here is not strictly a bug, in that the code produced by gcc is correct, though significantly worse than I would have expected. I don't want to fix this myself because I don't fully understand what cse.c does; so mailing to this list is the best way to reach other GCC debuggers and maintainers who might be interested in fixing it. Machine:: hp 300 (mc68030) running 4.3bsd-tahoe (actually, MORE/bsd) Gcc versions:: 1.35, 1.35.95 Description:: Gcc produces horrible code for the expansions of the ORIG_EXPR and FOLDED_EXPR macros of the following program. It seems that gcc 1.35 is poor at optimising compile-time constant expressions involving relational operators when the constant expressions involve relational operators. Re-writing the source code of ORIG_EXPR to FOLDED_EXPR using boolean algebra enables gcc 1.35 to produce better code, at a significant decrease in readability. The code for char_g has nine insns; the code for silly_char_g has twelve, one of which is an unecessary multiply. The code that gcc-1.35.95 produces for the same program is also correct, but has even worse code for char_ge(). Compiler Options:: gcc -O -fstrength-reduce -finline-functions Sample Program:: /* Demonstrate ugly problem with constant propagation on mc68020 in gcc 1.35. 1.35.95 does even worse. Compile with: gcc -v -S -O -ffinline-functions -fstrength-reduce Compilation Options: -DTEST_LOGIC: verify the by-hand ``simplification'' of the offending expression. Notice the stupid loads and compares in char_le, char_ge. Notice the even worse code generated for silly_cmp_{l,g,le,ge}. Gcc does not propagate constants introduced by function inlining into relational expressions in which they appear; though it has notice the constants are constant and loaded them into registers!!! Re-arranging the expression as a series of conditional expressions helps, but not much; gcc will still load two constants into a register and compare them. Silly, silly. This example is extracted from a real program I care about: profiling shows it occurs in the inner loop of my lisp programs. */ /* three way compare: Compare two characters, returning one of -1, 0, or 1 for less than, equal, or greater than. */ extern const int char_cmp(); /* The original macro for the (logical) expression ((s*x) < -1), where: s is known to be either 1 or -1 t is known to be either 1 or 0 x could be any of 1,0,-1 (the code works for any value of x) */ #define ORIG_EXPR(s,t,x) (((s)*(x)) < (t)) /* This expression, believe it or not, evaluates to exactly the same thing, provided the constraints on s and t hold. GCC's constant folding seems to do better with this version, but it is still nothing to write home about. */ #define FOLDED_EXPR(s,t,x) \ ({ int _x = (x); int _ans= \ ((t) ? ( (s==1) ? ((_x) < 1) : ( (-(_x)) < 1)) \ : ( (s==1) ? ((_x) < 0) : ( (-(_x)) < 0))); \ _ans; }) static inline char_cmp1 (const int s, const int t) { return (s*char_cmp() < t); } static inline char_cmp2 (const int s, const int t) { return (FOLDED_EXPR(s, t, char_cmp())); } #ifdef TEST_LOGIC /*** test driver ***/ /* Keep the linker happy. This will never be called. */ char_cmp () { return 1; } #define TEST_THEM(s, t, x) \ printf ("orig_expr (%2d %2d %2d)= %d\n", s,t,x, ORIG_EXPR(s, t, x)); \ printf ("folded_expr(%2d %2d %2d)= %d\n", s,t,x, FOLDED_EXPR(s, t, x));\ printf ((ORIG_EXPR(s,t,x)==FOLDED_EXPR(s,t,x)) ? "\n": "\t\t**BAD!***\n") /* Build truth table to prove the two expressions are equivalent. */ main () { int i; for (i =-1 ;i < 2; i++) { TEST_THEM (1, 1, i); TEST_THEM (1, 0, i); TEST_THEM (-1, 0, i); TEST_THEM (-1, 1, i); } } #else !TEST_LOGIC /*** demonstrate the inefficient code gcc 1.35 generates. ***/ /* These functions use the original definition of character compare. The code produced is terribly inefficient: gcc produces code to do the multiply and compare, even though s and t are constants. */ silly_cmp_l () { return char_cmp1 (1, 1); } silly_cmp_g () { return char_cmp1 (-1, 1); } silly_char_le() { return char_cmp1( 1, 0); } silly_char_ge() { return char_cmp1(-1, 0); } /* Replacing the multiply and compare with an (ugly, nested) ternary expression does a little better; but the constants do not get propagated into the inner ternary expression. */ char_l () { return char_cmp2 (1, 1); } char_g () { return char_cmp2 (-1, 1); } char_le() { return char_cmp2( 1, 0); } char_ge() { return char_cmp2(-1, 0); } #endif !TEST_LOGIC /* end of sample program */ Assembler Output (from 1.35):: #NO_APP gcc_compiled.: .text .even .globl _silly_char_l _silly_char_l: link a6,#0 movel d2,sp@- moveq #1,d2 jbsr _char_cmp cmpl d0,d2 sgt d0 andl d2,d0 movel a6@(-4),d2 unlk a6 rts .even .globl _silly_char_g _silly_char_g: link a6,#0 moveml #0x3000,sp@- moveq #-1,d2 moveq #1,d3 jbsr _char_cmp mulsl d0,d2 # why multiply by -1??? cmpl d2,d3 sgt d0 andl d3,d0 moveml a6@(-8),#0xc unlk a6 rts .even .globl _silly_char_le _silly_char_le: link a6,#0 jbsr _char_cmp tstl d0 slt d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _silly_char_ge _silly_char_ge: link a6,#0 movel d2,sp@- moveq #-1,d2 jbsr _char_cmp mulsl d0,d2 # why multiply by -1?? smi d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 rts .even .globl _char_l _char_l: link a6,#0 jbsr _char_cmp tstl d0 sle d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _char_g _char_g: link a6,#0 jbsr _char_cmp negl d0 tstl d0 sle d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _char_le _char_le: link a6,#0 movel d2,sp@- moveq #1,d2 jbsr _char_cmp cmpl d2,d2 jne L79 tstl d0 slt d0 andl d2,d0 jra L76 L79: negl d0 smi d0 moveq #1,d1 andl d1,d0 L76: movel a6@(-4),d2 unlk a6 rts .even .globl _char_ge _char_ge: link a6,#0 movel d2,sp@- moveq #-1,d2 jbsr _char_cmp moveq #1,d1 cmpl d2,d1 jeq L97 negl d0 L97: tstl d0 slt d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 rts Diffs to 1.35.95 output:: *** scratch-gcc-1.35.s Mon Jul 31 21:12:05 1989 --- scratch-gcc-1.35.95.s Mon Jul 31 21:11:46 1989 *************** *** 108,119 **** jbsr _char_cmp moveq #1,d1 cmpl d2,d1 ! jeq L97 ! negl d0 ! L97: tstl d0 slt d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 --- 108,122 ---- jbsr _char_cmp moveq #1,d1 cmpl d2,d1 ! jne L95 tstl d0 slt d0 + jra L97 + L95: + negl d0 + smi d0 moveq #1,d1 + L97: andl d1,d0 movel a6@(-4),d2 unlk a6 Subject: (LONG) Complaint of 1.35 code quality on m68k; gcc 1.35.95 even worse. -------- Disclaimer:: The problem reported here is not strictly a bug, in that the code produced by gcc is correct, though significantly worse than I would have expected. I don't want to fix this myself because I don't fully understand what cse.c does; so mailing to this list is the best way to reach other GCC debuggers and maintainers who might be interested in fixing it. Machine:: hp 300 (mc68030) running 4.3bsd-tahoe (actually, MORE/bsd) Gcc versions:: 1.35, 1.35.95 Description:: Gcc produces horrible code for the expansions of the ORIG_EXPR and FOLDED_EXPR macros of the following program. It seems that gcc 1.35 is poor at optimising compile-time constant expressions involving relational operators when the constant expressions involve relational operators. Re-writing the source code of ORIG_EXPR to FOLDED_EXPR using boolean algebra enables gcc 1.35 to produce better code, at a significant decrease in readability. The code for char_g has nine insns; the code for silly_char_g has twelve, one of which is an unecessary multiply. The code that gcc-1.35.95 produces for the same program is also correct, but has even worse code for char_ge(). Compiler Options:: gcc -O -fstrength-reduce -finline-functions Sample Program:: /* Demonstrate ugly problem with constant propagation on mc68020 in gcc 1.35. 1.35.95 does even worse. Compile with: gcc -v -S -O -ffinline-functions -fstrength-reduce Compilation Options: -DTEST_LOGIC: verify the by-hand ``simplification'' of the offending expression. Notice the stupid loads and compares in char_le, char_ge. Notice the even worse code generated for silly_cmp_{l,g,le,ge}. Gcc does not propagate constants introduced by function inlining into relational expressions in which they appear; though it has notice the constants are constant and loaded them into registers!!! Re-arranging the expression as a series of conditional expressions helps, but not much; gcc will still load two constants into a register and compare them. Silly, silly. This example is extracted from a real program I care about: profiling shows it occurs in the inner loop of my lisp programs. */ /* three way compare: Compare two characters, returning one of -1, 0, or 1 for less than, equal, or greater than. */ extern const int char_cmp(); /* The original macro for the (logical) expression ((s*x) < -1), where: s is known to be either 1 or -1 t is known to be either 1 or 0 x could be any of 1,0,-1 (the code works for any value of x) */ #define ORIG_EXPR(s,t,x) (((s)*(x)) < (t)) /* This expression, believe it or not, evaluates to exactly the same thing, provided the constraints on s and t hold. GCC's constant folding seems to do better with this version, but it is still nothing to write home about. */ #define FOLDED_EXPR(s,t,x) \ ({ int _x = (x); int _ans= \ ((t) ? ( (s==1) ? ((_x) < 1) : ( (-(_x)) < 1)) \ : ( (s==1) ? ((_x) < 0) : ( (-(_x)) < 0))); \ _ans; }) static inline char_cmp1 (const int s, const int t) { return (s*char_cmp() < t); } static inline char_cmp2 (const int s, const int t) { return (FOLDED_EXPR(s, t, char_cmp())); } #ifdef TEST_LOGIC /*** test driver ***/ /* Keep the linker happy. This will never be called. */ char_cmp () { return 1; } #define TEST_THEM(s, t, x) \ printf ("orig_expr (%2d %2d %2d)= %d\n", s,t,x, ORIG_EXPR(s, t, x)); \ printf ("folded_expr(%2d %2d %2d)= %d\n", s,t,x, FOLDED_EXPR(s, t, x));\ printf ((ORIG_EXPR(s,t,x)==FOLDED_EXPR(s,t,x)) ? "\n": "\t\t**BAD!***\n") /* Build truth table to prove the two expressions are equivalent. */ main () { int i; for (i =-1 ;i < 2; i++) { TEST_THEM (1, 1, i); TEST_THEM (1, 0, i); TEST_THEM (-1, 0, i); TEST_THEM (-1, 1, i); } } #else !TEST_LOGIC /*** demonstrate the inefficient code gcc 1.35 generates. ***/ /* These functions use the original definition of character compare. The code produced is terribly inefficient: gcc produces code to do the multiply and compare, even though s and t are constants. */ silly_cmp_l () { return char_cmp1 (1, 1); } silly_cmp_g () { return char_cmp1 (-1, 1); } silly_char_le() { return char_cmp1( 1, 0); } silly_char_ge() { return char_cmp1(-1, 0); } /* Replacing the multiply and compare with an (ugly, nested) ternary expression does a little better; but the constants do not get propagated into the inner ternary expression. */ char_l () { return char_cmp2 (1, 1); } char_g () { return char_cmp2 (-1, 1); } char_le() { return char_cmp2( 1, 0); } char_ge() { return char_cmp2(-1, 0); } #endif !TEST_LOGIC /* end of sample program */ Assembler Output (from 1.35):: #NO_APP gcc_compiled.: .text .even .globl _silly_char_l _silly_char_l: link a6,#0 movel d2,sp@- moveq #1,d2 jbsr _char_cmp cmpl d0,d2 sgt d0 andl d2,d0 movel a6@(-4),d2 unlk a6 rts .even .globl _silly_char_g _silly_char_g: link a6,#0 moveml #0x3000,sp@- moveq #-1,d2 moveq #1,d3 jbsr _char_cmp mulsl d0,d2 # why multiply by -1??? cmpl d2,d3 sgt d0 andl d3,d0 moveml a6@(-8),#0xc unlk a6 rts .even .globl _silly_char_le _silly_char_le: link a6,#0 jbsr _char_cmp tstl d0 slt d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _silly_char_ge _silly_char_ge: link a6,#0 movel d2,sp@- moveq #-1,d2 jbsr _char_cmp mulsl d0,d2 # why multiply by -1?? smi d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 rts .even .globl _char_l _char_l: link a6,#0 jbsr _char_cmp tstl d0 sle d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _char_g _char_g: link a6,#0 jbsr _char_cmp negl d0 tstl d0 sle d0 moveq #1,d1 andl d1,d0 unlk a6 rts .even .globl _char_le _char_le: link a6,#0 movel d2,sp@- moveq #1,d2 jbsr _char_cmp cmpl d2,d2 jne L79 tstl d0 slt d0 andl d2,d0 jra L76 L79: negl d0 smi d0 moveq #1,d1 andl d1,d0 L76: movel a6@(-4),d2 unlk a6 rts .even .globl _char_ge _char_ge: link a6,#0 movel d2,sp@- moveq #-1,d2 jbsr _char_cmp moveq #1,d1 cmpl d2,d1 jeq L97 negl d0 L97: tstl d0 slt d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 rts Diffs to 1.35.95 output:: *** scratch-gcc-1.35.s Mon Jul 31 21:12:05 1989 --- scratch-gcc-1.35.95.s Mon Jul 31 21:11:46 1989 *************** *** 108,119 **** jbsr _char_cmp moveq #1,d1 cmpl d2,d1 ! jeq L97 ! negl d0 ! L97: tstl d0 slt d0 moveq #1,d1 andl d1,d0 movel a6@(-4),d2 unlk a6 --- 108,122 ---- jbsr _char_cmp moveq #1,d1 cmpl d2,d1 ! jne L95 tstl d0 slt d0 + jra L97 + L95: + negl d0 + smi d0 moveq #1,d1 + L97: andl d1,d0 movel a6@(-4),d2 unlk a6