Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!AMES.ARC.NASA.GOV!atari!apratt
From: atari!apratt@AMES.ARC.NASA.GOV (Allan Pratt)
Newsgroups: gnu.gcc.bug
Subject: Suboptimal optimizations
Message-ID: <8906280007.AA11178@atari.UUCP>
Date: 28 Jun 89 00:07:30 GMT
Sender: daemon@tut.cis.ohio-state.edu
Distribution: gnu
Organization: GNUs Not Usenet
Lines: 46

GCC 1.35 for 68000 with -O uses "cmpw #0,a2" to test an address register
for zeroness, which is 10 clocks, while "movel a2,d0" is only four.  The
destination register is a throwaway; the condition code is what counts.
However, if you've got them to burn, why not?

In addition, clrl is sometimes used to clear a data register, when
"moveql #0" is quicker on a 68000 and a tie on 68020.  I already
mentioned this, and you (RMS) said clrl is more general (can have memory
as an ea), which is true, but not a good answer for an optimizer.  I
guess I'll have to learn 'md' codes to figure out where to add a line to
the 68000 md for this one if you don't want to.

Here is sample source & assembly output for the cmpw vs movel
optimization:

****************************************
main()
{
	long *y;
	while (y) {
		do_something(*y);
	}
}
****************************************
#NO_APP
gcc_compiled.:
.text
	.even
.globl _main
_main:
	link a6,#0
	movel a2,sp@-
	cmpw #0,a2	| the offending line; movel a2,d0 works
	jeq L5
L4:
	movel a2@,sp@-
	jbsr _do_something
	addqw #4,sp
	cmpw #0,a2	| the offending line; movel a2,d0 works
	jne L4
L5:
	movel a6@(-4),a2
	unlk a6
	rts
****************************************
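
For illustration only, the two substitutions described above can be sketched as a text-level peephole over the compiler's assembly output. This is not how GCC does it (the real fix would belong in the 68000 md file), and the function name peephole_line is hypothetical. The sketch assumes, without checking, that d0 is dead at every point where it rewrites a cmpw, which a real optimizer would have to verify. It deliberately leaves memory operands such as "a2@" alone, since those are different instructions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True if c ends a register operand (so "a2" matches but "a2@" does not). */
static int ends_operand(char c)
{
    return c == '\0' || c == ' ' || c == '\t' || c == '\n';
}

/*
 * Hypothetical helper, not part of GCC: rewrite one line of 68000
 * assembly in the MIT syntax used in the listing.
 * Returns 1 if the line was rewritten, 0 if it was copied unchanged.
 */
static int peephole_line(const char *in, char *out, size_t outsz)
{
    const char *p = in;
    int indent;

    assert(in != NULL && out != NULL && outsz > 0);

    while (*p == ' ' || *p == '\t')     /* skip leading whitespace */
        p++;
    indent = (int)(p - in);

    /* cmpw #0,aN -> movel aN,d0
       ASSUMPTION: d0 is a throwaway here; this sketch does not check. */
    if (strncmp(p, "cmpw #0,a", 9) == 0 &&
        p[9] >= '0' && p[9] <= '7' && ends_operand(p[10])) {
        snprintf(out, outsz, "%.*smovel a%c,d0%s", indent, in, p[9], p + 10);
        return 1;
    }

    /* clrl dN -> moveql #0,dN (data registers only; clrl to memory stays) */
    if (strncmp(p, "clrl d", 6) == 0 &&
        p[6] >= '0' && p[6] <= '7' && ends_operand(p[7])) {
        snprintf(out, outsz, "%.*smoveql #0,d%c%s", indent, in, p[6], p + 7);
        return 1;
    }

    snprintf(out, outsz, "%s", in);     /* anything else passes through */
    return 0;
}
```

Fed the two "offending" lines from the listing, this would emit "movel a2,d0" in their place and keep the trailing comment; every other line passes through untouched.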