Xref: utzoo comp.lang.c:12373 comp.arch:6267
Path: utzoo!utgpu!water!watmath!clyde!att!rutgers!mailrus!cornell!uw-beaver!uw-june!rik
From: rik@june.cs.washington.edu (Rik Littlefield)
Newsgroups: comp.lang.c,comp.arch
Subject: Re: Explanation, please!
Summary: Non-aligned copies done efficiently with word ops.
Message-ID: <5658@june.cs.washington.edu>
Date: 6 Sep 88 19:03:38 GMT
References: <5654@june.cs.washington.edu>
Organization: U of Washington, Computer Science, Seattle
Lines: 21

In article <5654@june.cs.washington.edu>, pardo@june.cs.washington.edu (David Keppel) writes:
>
> I can imagine that on some machines it is faster to copy words into
> register and repack the words in the registers rather than do a byte
> copy, since you could be taking advantage of some hardware gak.
>

On the old CDC 6000-series machines (early RISCs...) that was the *only*
practical way to do it, as well as being blazingly fast. We had copies
that would handle arbitrary *bit* alignments at a cost of around
6 instructions and 2 memory references per 60-bit word, in the middle
of the string. The sequence was basically fetch, shift, mask, mask, OR,
and store, appropriately rearranged to minimize memory delay and
functional unit conflicts, of course. I vaguely remember that this thing
could even be unrolled a couple of times and still fit in the instruction
cache ("stack", in those days) for machines expensive enough to have one.

VAXen I don't know about for sure, but I'd be real surprised if their
microcode didn't do the same thing.

--Rik
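
For anyone who wants to see the fetch, shift, mask, mask, OR, store sequence spelled out, here is a rough C sketch of the inner loop. It is not the CDC code. It assumes 64-bit words instead of 60-bit ones, counts bits from the most significant end of each word, and uses a circular shift so that both masks are visible. The names copy_bits_unaligned and rotl64 are made up for this illustration, and the caller is assumed to supply nwords + 1 readable source words.

```c
#include <stddef.h>
#include <stdint.h>

/* Circular left shift by s bits, 0 <= s <= 63 (hypothetical helper). */
static uint64_t rotl64(uint64_t x, unsigned s)
{
    return (x << s) | (x >> ((64u - s) & 63u));
}

/*
 * Copy nwords 64-bit words to dst, taking the bits from src starting
 * at bit offset s (0..63, counted from the most significant bit of
 * src[0]).  Assumption: src[0] through src[nwords] are all readable.
 */
static void copy_bits_unaligned(uint64_t *dst, const uint64_t *src,
                                size_t nwords, unsigned s)
{
    const uint64_t lomask = (UINT64_C(1) << s) - 1; /* low s bits   */
    const uint64_t himask = ~lomask;                /* the rest     */
    uint64_t carry = rotl64(src[0], s) & himask;    /* prime the loop */

    for (size_t i = 0; i < nwords; i++) {
        uint64_t r = rotl64(src[i + 1], s);  /* fetch, shift          */
        dst[i] = carry | (r & lomask);       /* mask, OR, store       */
        carry = r & himask;                  /* mask, kept for next   */
    }
}
```

Each pass through the loop does one load and one store, which matches the two memory references per word mentioned above. The edges of the string (partial first and last words) are left out of this sketch.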