Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!mips!pacbell.com!att!linac!mp.cs.niu.edu!ux1.cso.uiuc.edu!cs326ag
From: cs326ag@ux1.cso.uiuc.edu (Loren J. Rittle)
Newsgroups: comp.sys.amiga.programmer
Subject: Compiler code (was a flame fest)
Message-ID: <1991Apr2.100807.13471@ux1.cso.uiuc.edu>
Date: 2 Apr 91 10:08:07 GMT
Organization: University of Illinois at Urbana
Lines: 133

In article <1991Apr2.002631.22799@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>In article mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>>>char buf[20];
>>>main()
>>>{
>>>  char *d=(char *)&buf;
>>>  const char *s="This is a test\n";
>>>  while(*s) { *d++=*s++; }
>>>}
>>>
>>>/* Test.s produced by gcc */
>>>
>>>#NO_APP
>>>gcc_compiled.:
>>>.text
>>>LC0:
>>>        .ascii "This is a test\12\0"
>>>        .even
>>>.globl _main
>>>_main:
>>>        lea _buf,a1
>>>        lea LC0,a0
>>>        tstb a0@
>>>        jeq L5
>>>L4:
>>>        moveb a0@+,a1@+
>>>        tstb a0@
>>>        jne L4
>>>L5:
>>>        rts
>>>.comm _buf,20
>>
>>
>>;/* test.s produced by me */           ; (cycles)
>>        lea     text(pc),a0             ; 8
>>        lea     buf(pc),a1              ; 8
>BUG! This is not the same algorithm in the C code. Your code would move
>a NULL byte if a NULL string was passed when it should do nothing. I don't know
>why GCC doesn't generate PC relative instructions, but I think it has to do
>with UNIX and perhaps the scatter load/memory partitioning.
>
>>.loop   move.b  (a0)+,(a1)+             ; 14*12
>>        bne.s   .loop                   ; 13*10+1*8
>>        rts                             ; 16
>>text    dc.b    'This is a test',10,0
>>buf     ds.b    20
>>
>>NO COMPARISON DUDE! GCC makes 3 totally wasted instructions, and one of
>>them is inside your loop. Try an example with nested loops and your
>>wasted clock cycles become a geometric progression. Multiply the kind
>>of inefficiencies that GCC demonstrates here by EVERY loop and EVERY
>>function you have and your program is slower and bigger than it needs
>>to be.
>>To be specific, the GCC routine is 6 words longer and by the
>>time it is done executing, it will take 128 more clock cycles than
>>mine will (on a 68000). Your routine takes 466 total clocks to execute,
>>mine takes 338. I'm just your average 68000 assembler language programmer,
>>but I saved 28% CPU time. You might also note that your 'C' source is
>>7 lines of code and so is my assembler code.
>
>  This code is compiled to run on a 68030 @ 50 MHz, the difference in speed
>would be _very_ small. Hey, someone compile this on SAS/C with ALL
>optimizations on. You're just your average assembly language programmer
>yet you introduced a subtle bug into the program that will stomp on the
>first byte of a static global string by attempting to copy a null
>string. In a HUGE program this bug would be so subtle that it may
>take days to find out how a memory location is being sporadically
>trashed. Worse yet, a null string will have garbage following it that could
>span hundreds of bytes.

As requested, here is the SAS/C v5.10a generated code:
Using:
`lc -v -O -b0 test.c'
`omd test.o'

// Same program with GNU coding style enforced...
char buf[20];

void main (void)
{
  char *d = (char *)&buf;
  char *s = "This is a test\n"; // Note I got rid of the const, as
                                // not only is it of questionable
                                // legality under ANSI C (YOU CHANGE THE
                                // VALUE OF s!), but also caused SAS/C
                                // to generate (good, but) strange
                                // (meaning LONG) code.
  while (*s)
    *d++=*s++;
}

Lattice AMIGA 68000-68020 OBJ Module Disassembler V5.04.039
Copyright (c) 1988, 1989 Lattice Inc. All Rights Reserved.
Amiga Object File Loader V1.00
68000 Instruction Set

EXTERNAL DEFINITIONS

_main 0000-00    _buf 0000-02

SECTION 00 "test.c" 00000020 BYTES
       | 0000  48E7 0030                 MOVEM.L   A2-A3,-(A7)
       | 0004  47F9 0000 0000-02         LEA       02.00000000,A3
       | 000A  45F9 0000 0000-01         LEA       01.00000000,A2
       | 0010  6002                      BRA.B     0014
       | 0012  16DA                      MOVE.B    (A2)+,(A3)+
       | 0014  4A12                      TST.B     (A2)
       | 0016  66FA                      BNE.B     0012
       | 0018  4CDF 0C00                 MOVEM.L   (A7)+,A2-A3
       | 001C  4E75                      RTS

SECTION 01 " " 00000010 BYTES
0000 54 68 69 73 20 69 73 20 61 20 74 65 73 74 0A 00 This is a test..

SECTION 02 " " 00000014 BYTES

This compares quite nicely to GCC and the buggy asm code given above :-).

>or two of magnitude improvement. Matt's assembler is perfect proof. If
>Matt coded it in assembly and added some features it would blow away DevPac
>and probably beat ArgAsm.

I have to agree, Matt is a stud!

Loren J. Rittle
--
``NewTek stated that the Toaster *would* *not* be made to directly support the
Mac, at this point Sculley stormed out of the booth...'' --- A scene at the
recent MacExpo. Gee, you wouldn't think that an Apple Exec would be so
worried about one little Amiga device... Loren J. Rittle l-rittle@uiuc.edu
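
P.S. For anyone who wants to see the loop-shape difference argued about above without reading 68000 code, here is a rough C sketch. It is an illustration only, not something posted by anyone in the thread: the function names `copy_test_first` and `copy_then_test` are made up, and the second function is just an approximation in C of what a copy-then-branch loop (the `move.b`/`bne.s` pair) appears to do.

```c
#include <stddef.h>

/* Test-before-copy, the shape of the C source in the thread:
   the byte is examined first, so an empty string copies nothing
   and the terminating 0 is never written.
   Returns the number of bytes written to dst. */
size_t copy_test_first(char *dst, const char *src)
{
    size_t written = 0;

    while (*src) {
        *dst++ = *src++;
        written++;
    }
    return written;
}

/* Copy-then-test (hypothetical C rendering of a move/branch loop):
   the byte is copied first and tested afterwards, so at least one
   byte, the terminating 0, is always written.
   Returns the number of bytes written to dst. */
size_t copy_then_test(char *dst, const char *src)
{
    size_t written = 0;
    char c;

    do {
        c = *src++;
        *dst++ = c;
        written++;
    } while (c != 0);
    return written;
}
```

Called with an empty string, `copy_test_first` returns 0 and leaves the destination untouched, while `copy_then_test` returns 1 and stores a 0 in the first destination byte. With "This is a test\n" the first writes 15 bytes and the second 16, the extra one being the terminator.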