Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!ames!oliveb!pyramid!prls!mips!hansen
From: hansen@mips.UUCP
Newsgroups: comp.arch
Subject: Re: AM29000 Booleans
Message-ID: <381@dumbo.UUCP>
Date: Mon, 11-May-87 17:07:37 EDT
Article-I.D.: dumbo.381
Posted: Mon May 11 17:07:37 1987
Date-Received: Thu, 14-May-87 01:32:09 EDT
References: <1270@aw.sei.cmu.edu> <8012@utzoo.UUCP> <16640@amdcad.AMD.COM>
Lines: 82
Summary: MIPS does it in 4

In article <16640@amdcad.AMD.COM>, tim@amdcad.AMD.COM (Tim Olson) writes:
> int x;
>
> main()
> {
>
>     x = x*57;
> }
>
> Compiles to:
> for a multiply time of 6 cycles.

The MIPS compiler system does the same operation in 4 cycles, by
performing subtracts as well as adds. The same code above compiles
(unoptimized) to:

main:
    0x0:    lw      t6,0(gp)
    0x4:    nop
    0x8:    sll     t7,t6,3
    0xc:    subu    t7,t7,t6
    0x10:   sll     t7,t7,3
    0x14:   addu    t7,t7,t6
    0x18:   jr      ra
    0x1c:   sw      t7,0(gp)

The code above uses the fact that 57 = (8-1)*8 + 1.

For the other AMD example, the MIPS compiler system can do all the
branches directly, without the compares, and places the single-statement
paths directly into the branch delay slots, as was demonstrated in the
hand-tweaked version. In fact, I had to change the code to generate
two different results and combine them, because otherwise the optimizer
would remove the first set of statements (they generate a dead value).
The code vanishes entirely if no value is returned by the routine.

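As a check on the multiply sequence earlier in this article, here is a
minimal C sketch of the decomposition 57 = (8-1)*8 + 1. The names mul57
and check_mul57 are hypothetical and do not come from the compiler
output; unsigned 32-bit arithmetic is assumed so that wraparound matches
the non-trapping addu/subu instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the sll/subu/sll/addu sequence. */
static uint32_t mul57(uint32_t x)
{
    uint32_t t = x << 3;   /* sll  t7,t6,3  : t = 8x            */
    t = t - x;             /* subu t7,t7,t6 : t = 7x  = (8-1)x  */
    t = t << 3;            /* sll  t7,t7,3  : t = 56x           */
    t = t + x;             /* addu t7,t7,t6 : t = 57x           */
    return t;
}

/* Compare against an ordinary multiply, including a wraparound case. */
static void check_mul57(void)
{
    uint32_t x;
    for (x = 0; x < 1000; x++)
        assert(mul57(x) == x * 57u);
    assert(mul57(0xFFFFFFFFu) == 0xFFFFFFFFu * 57u);
}
```

The four arithmetic steps correspond one-to-one to the four instructions
between the load and the store, which is where the 4-cycle figure comes
from.
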
The source code:

int bool(x)
int x;
{
    int y, z;
    if (x) y=0; else y=1;
    if (!x) z=0; else z=1;
    return(y+z);
}

Compiles to:

bool:
    0x0:    beq     a0,zero,0x14
    0x4:    li      v1,1
    0x8:    b       0x14
    0xc:    move    v1,zero
    0x10:   li      v1,1
    0x14:   bne     a0,zero,0x28
    0x18:   li      a0,1
    0x1c:   b       0x28
    0x20:   move    a0,zero
    0x24:   li      a0,1
    0x28:   jr      ra
    0x2c:   addu    v0,v1,a0

It turns out that the movement of some code into branch delay slots
occurs as a peephole optimization after the code has been reorganized
once, so the same instruction appears at addresses 0x4 and 0x10, though
the instruction at address 0x10 is never used. (The same thing occurs
at 0x18 and 0x24.) If a post-pass were employed to clean up the code,
the unconditional branches could also be removed (addresses 0x8 and
0x1c). We don't bother because, in practice, single-instruction branch
paths are rare, and the unused instructions don't cause any serious
problems.

-- 
Craig Hansen
Manager, Architecture Development
MIPS Computer Systems, Inc.
...decwrl!mips!hansen