Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!umcp-cs!chris
From: chris@umcp-cs.UUCP (Chris Torek)
Newsgroups: net.lang.c
Subject: Re: Re: structure alignment question
Message-ID: <3527@umcp-cs.UUCP>
Date: Sun, 21-Sep-86 16:11:24 EDT
Article-I.D.: umcp-cs.3527
Posted: Sun Sep 21 16:11:24 1986
Date-Received: Sun, 21-Sep-86 23:38:54 EDT
References: <101@hcx1.UUCP> <7363@sun.uucp> <696@mips.UUCP> <7447@sun.uucp> <1705@mcc-pp.UUCP>
Reply-To: chris@umcp-cs.UUCP (Chris Torek)
Organization: University of Maryland, Dept. of Computer Sci.
Lines: 74

In article <1705@mcc-pp.UUCP> tiemann@mcc-pp.UUCP (Michael Tiemann) writes:
>... The last 68000 compiler I used aligned strings on WORD boundaries.
>This would cost one byte per string, half the time.  But there was
>a big speed payoff: I could do word operations in my strnlen,
>strncmp, strncpy, and whatever other string processing functions
>I happened to write. ... all this "fast" code actually runs slower
>than a "dumb" byte-copy model [on a Sun-3], because the 68020 faults
>itself to death reading in 32-bit words on odd boundaries, and
>doesn't run at all on a Sun-2 because the 68010 can't read odd words.

(Does the 68020 really fault?  I thought it just did two bus accesses.)

It is not difficult to do copies in word mode iff the strings are
aligned:

| Sun mnemonics

| /*LINTLIBRARY*/
| strcpy(to, from) char *to, *from; { *to = *from; return (to); }

| /*UNTESTED!*/
ENTRY(strcpy)
TO = a0                         | I think this works
FROM = a1
        movl    sp@(4),TO       | to
        movl    sp@(8),FROM     | from

| I forget if this is legal.  If not, copy to d0 first.
        btst    #0,TO           | test for odd destination
        bnes    odd0            | handle odd dst, unknown src
        btst    #0,FROM         | test for odd source
        bnes    hardway         | handle even dst, odd src

| both addresses are even; do a fast strcpy
fastcopy:
        movw    FROM@+,d0       | grab entire word
        movw    d0,d1           | need to test high byte first
        lsrw    #8,d1           | throw out low byte
        beqs    fastend         | if high byte zero, go terminate dst
        movw    d0,TO@+         | copy entire word
        tstb    d0              | and see if we are now done
        bnes    fastcopy        | do more if not
        movl    sp@(4),d0       | set return value
        rts                     | and return

fastend:
        movql   #0,d0
        movb    d0,TO@          | terminate destination string
        movl    sp@(4),d0       | set return value
        rts                     | and return

odd0:
        btst    #0,FROM         | test for odd source
        beqs    hardway         | handle odd dst, even src
        movb    FROM@+,TO@+     | copy one byte to make even
        bnes    fastcopy        | and do rest with fast copy
        movl    sp@(4),d0       | set return value
        rts                     | and return

| one address is even, the other odd, so do it a byte at a time.
hardway:
        movl    TO,d0           | set return value
hardloop:
        movb    FROM@+,TO@+     | copy ...
        bnes    hardloop        | until we copy a null
        rts                     | return

I wonder, though, if this is truly faster.  Should not a movb/bnes
pair run in loop mode?  (Perhaps not; `dbcc' loops do, though, and
one could use a dbra surrounded by a bit of extra logic.)

Machine dependent `fast' code is often CPU dependent as well, and
one must be prepared to modify marked inner loops when moving among
implementations of one architecture.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:   seismo!umcp-cs!chris
CSNet:  chris@umcp-cs           ARPA:   chris@mimsy.umd.edu
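
For readers who do not speak 68000, the control flow of the assembly
above can be sketched in C.  This is an illustration only, not a
translation anyone has benchmarked: the name word_strcpy is made up,
and the two-byte memcpy merely stands in for the movw (a compiler is
free to turn it into a single word load or not).  Like the assembly,
the word loop may read one byte beyond the terminating null, which is
harmless for an aligned word on real hardware but is not strictly
conforming C, so the source buffer is assumed to have a spare byte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name; mirrors the odd0/fastcopy/hardway dispatch. */
char *word_strcpy(char *to, const char *from)
{
    char *ret = to;

    /* One address even, the other odd: copy a byte at a time. */
    if ((((uintptr_t)to ^ (uintptr_t)from) & 1) != 0) {
        while ((*to++ = *from++) != '\0')
            ;
        return ret;
    }

    /* Both odd: copy one byte so that both addresses become even. */
    if (((uintptr_t)to & 1) != 0) {
        if ((*to++ = *from++) == '\0')
            return ret;
    }

    /* Both even: move two bytes per iteration. */
    for (;;) {
        unsigned char w[2];

        memcpy(w, from, 2);     /* stands in for movw FROM@+,d0 */
        if (w[0] == '\0') {     /* first byte null: terminate dst */
            *to = '\0';
            return ret;
        }
        memcpy(to, w, 2);       /* copy the entire word */
        if (w[1] == '\0')       /* second byte null: done */
            return ret;
        from += 2;
        to += 2;
    }
}
```

Testing the bytes in memory order rather than shifting the word keeps
the sketch independent of byte order; the assembly tests the high
byte first because that is the first byte in memory on a 68000.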