Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!apple!ames!vsi1!zorch!amiga0!mykes
From: mykes@amiga0.SF-Bay.ORG (Mike Schwartz)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: Lemmings - a tutorial Part V (last)
Message-ID:
Date: 1 Apr 91 10:14:03 GMT
References: <1991Mar31.003933.1483@mintaka.lcs.mit.edu> <1991Apr1.020748.26863@mintaka.lcs.mit.edu>
Organization: Amiga makes it possible
Lines: 118

In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>   Obviously you have no idea of how advanced today's optimizing compilers
>are. The code you stepped through must have been produced by some
>1970's MetaComco compiler or something. But FYI, most of today's compilers
>can pass arguments in registers, allocate memory without stack, eliminate
>the frame registers, and even do all the non-obvious tricks of sign extension,
>etc. Check out the code I've included at the end compiled by GCC.
>
>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-register
>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
>
>/* test.c */
>
>char buf[20];
>main()
>{
>  char *d=(char *)&buf;
>  const char *s="This is a test\n";
>  while(*s) { *d++=*s++; }
>}
>
>/* Test.s produced by gcc */
>
>#NO_APP
>gcc_compiled.:
>.text
>LC0:
>        .ascii "This is a test\12\0"
>        .even
>.globl _main
>_main:
>        lea _buf,a1
>        lea LC0,a0
>        tstb a0@
>        jeq L5
>L4:
>        moveb a0@+,a1@+
>        tstb a0@
>        jne L4
>L5:
>        rts
>.comm _buf,20
>

/* Test.s produced by gcc */

#NO_APP
gcc_compiled.:
.text
LC0:
        .ascii "This is a test\12\0"
        .even
.globl _main
_main:                          ; (cycles)
        lea _buf,a1             ; 8
        lea LC0,a0              ; 8
        tstb a0@                ; 8
        jeq L5                  ; 8
L4:
        moveb a0@+,a1@+         ; 14*12
        tstb a0@                ; 14*8
        jne L4                  ; 13*10+1*8
L5:
        rts                     ; 16
.comm _buf,20

;/* test.s produced by me */
                                ; (cycles)
        lea     text(pc),a0     ; 8
        lea     buf(pc),a1      ; 8
.loop   move.b  (a0)+,(a1)+     ; 14*12
        bne.s   .loop           ; 13*10+1*8
        rts                     ; 16
text    dc.b    'This is a test',10,0
buf     ds.b    20

NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of them
is inside your loop.  Try an example with nested loops and your wasted clock
cycles become a geometric progression.  Multiply the kind of inefficiencies
that GCC demonstrates here by EVERY loop and EVERY function you have and your
program is slower and bigger than it needs to be.

To be specific, the GCC routine is 6 words longer and by the time it is done
executing, it will take 128 more clock cycles than mine will (on a 68000).
Your routine takes 466 total clocks to execute, mine takes 338.  I'm just
your average 68000 assembler language programmer, but I saved 28% CPU time.

You might also note that your 'C' source is 7 lines of code and so is my
assembler code.

Just think, the OS ROM routines are written in 'C' and compiled with a lesser
compiler than gcc (5 years ago).

>   Also, it's not the language but the algorithm that is responsible for
>how fast a routine runs. Compare a C coded Boyer-Moore string search
>with an Assembly coded brute-force byte by byte search, the C code would
>probably win without optimizations turned on.
>
You should compare apples to apples.  Compare your Boyer-Moore string search
in assembler language with the one in 'C' on the Amiga.  How about comparing
an assembler coded Boyer-Moore string search against a 'C' coded brute-force
byte by byte search?  You'd complain it's not fair either.

>
>
>--
>/~\_______________________________________________________________________/~\
>|n| rjc@albert.ai.mit.edu    Amiga, the computer for the creative mind.   |n|
>|~|                                .-. .-.                                |~|
>|_|________________________________| |_| |________________________________|_|

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga. And it is only 10 pages!  *
********************************************************