Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site calgary.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!alberta!calgary!radford
From: radford@calgary.UUCP (Radford Neal)
Newsgroups: net.lang.c
Subject: Unrolling string copy loop
Message-ID: <341@calgary.UUCP>
Date: Mon, 1-Apr-85 16:33:46 EST
Article-I.D.: calgary.341
Posted: Mon Apr 1 16:33:46 1985
Date-Received: Fri, 5-Apr-85 02:08:11 EST
References: <1049@gloria.UUCP> <3505@alice.UUCP>
Organization: University of Calgary, Calgary, Alberta
Lines: 55

> sym.1:
>         movb (r2)+,(r1)+
>         bneq sym.1
> By the way, Colonel, this loop is not improved by unrolling.

WRONG! I timed the following two routines:

# String copy with ordinary loop.

_sc1:   .word 0
        movl 4(ap),r1
        movl 8(ap),r2
1:      movb (r1)+,(r2)+
        bneq 1b
        ret

# String copy with unrolled loop.

_sc2:   .word 0
        movl 4(ap),r1
        movl 8(ap),r2
1:      movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        beql 2f
        movb (r1)+,(r2)+
        bneq 1b
2:      ret

The first takes 120 microseconds to copy a thirty character string.
The second takes only 100 microseconds. Seems that branches not taken
are faster than branches which are taken.

     Radford Neal
     The University of Calgary
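
For anyone who would rather read the two routines in C, they might be rendered roughly as follows. This is only a sketch under a few assumptions: the names sc1 and sc2 are taken from the assembly labels, the (source, destination) argument order follows 4(ap) and 8(ap), and there is no promise that a C compiler emits the same instructions. The timings quoted above apply to the assembly, not to this code.

```c
/* Ordinary loop: copy bytes from src to dst, up to and including
 * the terminating NUL. Mirrors "movb (r1)+,(r2)+ / bneq 1b". */
void sc1(const char *src, char *dst)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Unrolled loop: ten byte copies per pass. The first nine exits
 * correspond to "beql 2f" (branch usually not taken); only the
 * tenth test, "bneq 1b", is a taken branch on each full pass. */
void sc2(const char *src, char *dst)
{
    do {
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
        if ((*dst++ = *src++) == '\0') return;
    } while ((*dst++ = *src++) != '\0');
}
```

Both functions assume dst has room for the string plus its NUL. Calling either as sc2(from, to) with a thirty character string exercises three full passes of the unrolled body before the final NUL is copied on the first test of the fourth pass.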