Path: utzoo!attcan!uunet!husc6!xait!g-rh From: g-rh@XAIT.Xerox.COM (Richard Harter) Newsgroups: comp.lang.c Subject: Re: Efficient Coding Practices Message-ID: <34196@XAIT.Xerox.COM> Date: 3 Oct 88 16:33:50 GMT References: <8809191521.AA17824@ucbvax.Berkeley.EDU> <68995@sun.uucp> <23025@amdcad.AMD.COM> <607@ardent.UUCP> <836@proxftl.UUCP> <34112@XAIT.XEROX.COM> <846@proxftl.UUCP> Reply-To: g-rh@XAIT.Xerox.COM (Richard Harter) Organization: Xerox Corporation, Cambridge, Massachusetts Lines: 76 In article <846@proxftl.UUCP> francis@proxftl.UUCP (Francis H. Yu) writes: >In article <34112@XAIT.XEROX.COM> g-rh@XAIT.Xerox.COM (Richard Harter) writes: >!Let's hand optimize this. Instead of using indices (presumably less >!efficent) we will use pointers. Since we don't want to destroy dst >!and src as pointers we need temporaries. Thus we might write >! tmp1 = dst; >! tmp2 = src; >! for (i=0;iThe better code is > tmp1 = dst; > tmp2 = src; > tmp3 = dst + n; > while (tmp1 != tmp3) { > *tmp1++ = *tmp2++; > } The point of this little exercise was to illustrate that using pointers quite often means extra lines of code and temporaries. However your "better" code is not necessarily better. In many machines it is more efficient to control loops by counting a register down to zero and escaping on zero than it is to exit on a compare. A good compiler will do exactly that with the sample code. If we are hand optimizing, we write register int i; ... tmp1 = dst; tmp2 = src; for (i=n;i;--i) *tmp1++ = *tmp++; Your while loop, in pseudo machine language, runs something like this mv dst r1 # load register 1 with dst mv src r2 # load register 2 with src mv *n,r3 # move n to register 3 ble a2 # n already done, skip loop add r1,r3 # add n to get termination br a1 # loop test at bottom, go there a0: mv *r2++ *r1++ # Move src to dst, incrementing a1: compare r1,r3 # compare term with with dst ptr bne a0 # not eq, go to loop start a2: The corresponding code for a countdown loop is mv dst r1 # load register 1 with dst mv src r2 # load register 2 with src mv n r3 # load register 3 with n ble a2 # n already done, skip loop br a1 # loop test at bottom, go there a0: mv *r2++, *r1++ # Move src to dst, incrementing a1: dec r3 # Decrement count down index bgt a0 # index not negative, do more a2: The loop bodies are the same except that the compare is replaced with a decrement, which may be faster (it is on most machines) and is never slower (as far as I know). Moreover many machines have a decrement and test instruction. For example, the PDP-11 has an SOB instruction which combines the two. Note: It is more efficient to put the loop test at the bottom of the loop and branch there to initiate the loop; it saves a branch instruction. Lesson: If efficiency is a concern, countdown loops are faster than compare value loops. Lesson: If one is optimizing code one has to think about what the machine will have to do in different implementations when comparing them. -- In the fields of Hell where the grass grows high Are the graves of dreams allowed to die. Richard Harter, SMDS Inc.