Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!AMES.ARC.NASA.GOV!atari!apratt
From: atari!apratt@AMES.ARC.NASA.GOV (Allan Pratt)
Newsgroups: gnu.gcc.bug
Subject: Suboptimal optimizations
Message-ID: <8906280007.AA11178@atari.UUCP>
Date: 28 Jun 89 00:07:30 GMT
Sender: daemon@tut.cis.ohio-state.edu
Distribution: gnu
Organization: GNUs Not Usenet
Lines: 46

GCC 1.35 for 68000 with -O uses "cmpw #0,a2" to test an address register
for zero, which takes 10 clocks, while "movel a2,d0" takes only four and
sets the same Z flag.  The destination register is a throwaway (any free
data register will do); the condition codes are what count.  However, if
you've got clocks to burn, why not?
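
To put a number on that, here is a rough sketch in C.  The clock counts
are the ones quoted above; the helper name clocks_saved is made up for
illustration and is not part of GCC or any library.

```c
/* 68000 clock counts as quoted in this post. */
enum {
    CMPW_IMM_AREG_CLOCKS   = 10, /* cmpw #0,a2  */
    MOVEL_AREG_DREG_CLOCKS = 4   /* movel a2,d0 */
};

/* Hypothetical helper: clocks saved when the null test is executed
   n_tests times with movel instead of cmpw. */
long clocks_saved(long n_tests)
{
    return n_tests * (CMPW_IMM_AREG_CLOCKS - MOVEL_AREG_DREG_CLOCKS);
}
```

That is 6 clocks per test, so a loop that tests the pointer once per
iteration gives back 6 clocks every time around.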

In addition, clrl is sometimes used to clear a data register, when
"moveql #0" is quicker on a 68000 and a tie on a 68020.  I already
mentioned this, and you (RMS) said clrl is more general (it can take
memory as an ea), which is true, but not a good answer for an optimizer.
I guess I'll have to learn the 'md' (machine description) syntax to
figure out where to add a line to the 68000 md for this one if you
don't want to.
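
For the clrl case, something along these lines might serve as a test
input.  This is only a sketch: the function name is made up, and I have
not checked what GCC 1.35 emits for it.  The assumption is that -O keeps
"sum" and "i" in data registers, so each "= 0" is a candidate for either
clrl or "moveql #0".

```c
/* Hypothetical test input: two variables that an optimizer would
   plausibly keep in data registers, each initialized to zero. */
long sum_first(long *v, int n)
{
    register long sum = 0;  /* candidate for clrl vs. moveql #0 */
    register int i;

    for (i = 0; i < n; i++) /* i = 0 is a second candidate */
        sum += v[i];
    return sum;
}
```

Compiling it with -O -S and looking at how the two registers get
zeroed would show which instruction was picked.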

Here is sample source and the assembly output for the cmpw vs. movel
case:

****************************************
main()
{
    long *y;

    while (y) {
	do_something(*y);
    }
}
****************************************
#NO_APP
gcc_compiled.:
.text
	.even
.globl _main
_main:
	link	a6,#0
	movel	a2,sp@-
	cmpw	#0,a2		| the offending line; movel a2,d0 works
	jeq	L5
L4:
	movel	a2@,sp@-
	jbsr	_do_something
	addqw	#4,sp
	cmpw	#0,a2		| the offending line; movel a2,d0 works
	jne	L4
L5:
	movel	a6@(-4),a2
	unlk	a6
	rts
****************************************