Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!apple!ames!vsi1!zorch!amiga0!mykes
From: mykes@amiga0.SF-Bay.ORG (Mike Schwartz)
Newsgroups: comp.sys.amiga.programmer
Subject: Re:     Lemmings - a tutorial Part V (last)
Message-ID: <mykes.1028@amiga0.SF-Bay.ORG>
Date: 1 Apr 91 10:14:03 GMT
References: <mykes.0774@amiga0.SF-Bay.ORG> <1991Mar31.003933.1483@mintaka.lcs.mit.edu> <mykes.0926@amiga0.SF-Bay.ORG> <1991Apr1.020748.26863@mintaka.lcs.mit.edu>
Organization: Amiga makes it possible
Lines: 118

In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:

>  Obviously you have no idea of how advanced today's optimizing compilers
>are.  The code you stepped through must have been produced by some
>1970's MetaComco compiler or something. But FYI, most of today's compilers
>can pass arguments in registers, allocate memory without stack, eliminate
>the frame registers, and even do all the non-obvious tricks of sign extension,
>etc. Check out the code I've included at the end compiled by GCC.
>
>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
>
>/* test.c */
>
>char buf[20];
>main()
>{
>  char *d=(char *)&buf;
>  const char *s="This is a test\n";
>  while(*s) { *d++=*s++; }
>}
>
>/* Test.s produced by gcc */
>
>#NO_APP
>gcc_compiled.:
>.text
>LC0:
>	.ascii "This is a test\12\0"
>	.even
>.globl _main
>_main:
>	lea _buf,a1
>	lea LC0,a0
>	tstb a0@
>	jeq L5
>L4:
>	moveb a0@+,a1@+
>	tstb a0@
>	jne L4
>L5:
>	rts
>.comm _buf,20
>

/* Test.s produced by gcc */

#NO_APP
gcc_compiled.:
.text
LC0:
	.ascii "This is a test\12\0"
	.even
.globl _main
_main:					; (cycles)
	lea _buf,a1			; 8
	lea LC0,a0			; 8
	tstb a0@			; 8
	jeq L5				; 8
L4:
	moveb a0@+,a1@+			; 14*12
	tstb a0@			; 14*8
	jne L4				; 13*10+1*8
L5:
	rts				; 16
.comm _buf,20


;/* test.s produced by me */		; (cycles)
	lea	text(pc),a0		; 8
	lea	buf(pc),a1		; 8
.loop	move.b	(a0)+,(a1)+		; 14*12
	bne.s	.loop			; 13*10+1*8
	rts				; 16
text	dc.b	'This is a test',10,0
buf	ds.b	20

NO COMPARISON DUDE!  GCC generates three totally wasted instructions: the
tstb/jeq pair ahead of the loop and, worse, a second tstb inside your loop.
Try an example with nested loops and the wasted clock cycles multiply with
every level of nesting.  Multiply the kind of inefficiency that GCC
demonstrates here by EVERY loop and EVERY function you have, and your
program is slower and bigger than it needs to be.  To be specific, the GCC
routine is 6 words longer, and by the time it is done executing it has
taken 128 more clock cycles than mine (on a 68000).  Your routine takes
466 total clocks to execute; mine takes 338.  I'm just your average 68000
assembly language programmer, but I saved about 27% of the CPU time.  You
might also note that your 'C' source is 7 lines of code, and so is my
assembler code.

Just think: the OS ROM routines are written in 'C', and they were compiled
5 years ago with a lesser compiler than GCC.


>  Also, it's not the language but the algorithm that is responsible for
>how fast a routine runs. Compare a C coded Boyer-Moore string search
>with an Assembly coded brute-force byte by byte search, the C code would
>probably win without optimizations turned on.
>

You should compare apples to apples.  Compare a Boyer-Moore string search
written in assembly language with the same algorithm in 'C' on the Amiga.
Or how about comparing an assembler-coded Boyer-Moore string search against
a 'C'-coded brute-force byte-by-byte search?  You'd complain that's not
fair either.

>
>
>--
>/~\_______________________________________________________________________/~\
>|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
>|~|                                .-. .-.                                |~|
>|_|________________________________| |_| |________________________________|_|

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************