Xref: utzoo comp.lang.c:12382 comp.arch:6273
Path: utzoo!yunexus!geac!syntron!jtsv16!uunet!seismo!sundc!pitstop!sun!decwrl!labrea!rutgers!mit-eddie!uw-beaver!uw-june!pardo
From: pardo@june.cs.washington.edu (David Keppel)
Newsgroups: comp.lang.c,comp.arch
Subject: Re: Explanation, please!
Message-ID: <5654@june.cs.washington.edu>
Date: 6 Sep 88 17:25:50 GMT
Article-I.D.: june.5654
References: <638@paris.ics.uci.edu> <566@pcrat.UUCP> <9087@pur-ee.UUCP>
Reply-To: pardo@uw-june.UUCP (David Keppel)
Organization: U of Washington, Computer Science, Seattle
Lines: 34

hankd@pur-ee.UUCP (Hank Dietz) writes:
> if ((p - q) & 3) *byte copy* else *struct copy*

I believe that the VAX "movc" instruction takes arbitrary pointers and
does the following:

* If both are word-aligned, do a word copy (I mean a 4-byte word).
* If both are misaligned but could be aligned with 1, 2, or 3 bytes
  of byte copy at either end, then do a byte copy at either end and
  a word copy down the middle.
* If neither is aligned then ??

Unfortunately, my VAX hardware reference is out of town for a couple
of weeks, so I can't ask him about the neither-aligned case.
Anybody know?

I can imagine that on some machines it is faster to copy words into
registers and repack the words in the registers rather than do a byte
copy, since you could be taking advantage of some hardware gak.

Simple example: machine X has register W1 divided into B4, B5, B6, B7.
To do a copy, align the source pointer (doing byte copies), then read
a word at a time into the W1 register and write it back out by writing
B4, B5, B6, B7 (little-endian).

This is beginning to look suspiciously like the kinds of optimizations
that get done for bit BLTs. Anybody know if this ever really gets
done?

	;-D on  ( Ahh.  Architecture at its finest )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
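
For concreteness, here is a minimal C sketch of the head/middle/tail copy described in this post. It is not the VAX microcode, whose real behavior is exactly the open question. The name sketch_copy is hypothetical, and the sketch assumes 4-byte words and non-overlapping buffers. The fixed-size memcpy calls only stand in for a single 4-byte load and store in portable C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copy n bytes from
   src to dst, assuming the two regions do not overlap. */
static void *sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Dietz's test: if the two addresses differ in their low two bits,
       no number of leading byte copies can align both, so fall back
       to a plain byte copy. */
    if (((uintptr_t)d - (uintptr_t)s) & 3) {
        while (n > 0) { *d++ = *s++; n--; }
        return dst;
    }

    /* Head: at most 3 byte copies until the source (and therefore
       the destination) sits on a 4-byte boundary. */
    while (n > 0 && ((uintptr_t)s & 3)) { *d++ = *s++; n--; }

    /* Middle: one 4-byte word per iteration. */
    while (n >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* stands in for a word load  */
        memcpy(d, &w, sizeof w);   /* stands in for a word store */
        s += 4; d += 4; n -= 4;
    }

    /* Tail: at most 3 leftover bytes. */
    while (n > 0) { *d++ = *s++; n--; }
    return dst;
}
```

Calling sketch_copy(dst + 1, src + 1, 20) on two equally aligned buffers takes the word path, while sketch_copy(dst + 2, src + 1, 13) falls through to the byte loop. The register-repacking idea for the mutually misaligned case is deliberately left out of the sketch.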