Xref: utzoo comp.lang.c:12168 comp.arch:6182
Path: utzoo!attcan!uunet!convex!killer!ames!amdahl!pyramid!cbmvax!snark!eric
From: eric@snark.UUCP (Eric S. Raymond)
Newsgroups: comp.lang.c,comp.arch
Subject: Re: Explanation, please!
Summary: Oh *ghod*, wotta beautiful ugly hack!
Message-ID:
Date: 26 Aug 88 17:30:20 GMT
References: <638@paris.ics.uci.edu>
Organization: Smash-the-State Leather and Lingerie Boutique
Lines: 43

(Code below reproduced so that comp.arch people seeing this followup only won't
get terminally frustrated. This is *really neat*, gang...)

In article <638@paris.ics.uci.edu> Douglas C. Schmidt writes:
>
> void send(int *to,int *from, int count) {
>    int n = (count + 7) / 8;
>
>    switch(count % 8) {
>       case 0:  do { *to++ = *from++;
>       case 7:       *to++ = *from++;
>       case 6:       *to++ = *from++;
>       case 5:       *to++ = *from++;
>       case 4:       *to++ = *from++;
>       case 3:       *to++ = *from++;
>       case 2:       *to++ = *from++;
>       case 1:       *to++ = *from++;
>                } while (--n > 0);
>    }
>
> }
>
> Finally, Stroustrup asks the rhetorical question ``why would anyone
> want to write something like this.''  Any guesses?!

Yeah. That's the most hackish way of trying to write a portable optimized
copy routine I've ever seen. I gather the whole point of the shenanigans
is to get all the *from++ -> *to++ instructions in the generated code to be
adjacent.

This only makes sense if the author knows he's got a hardware instruction
pipeline or cache that's no less than 8 and no more than 9 byte-copy
instruction widths long, and stuff executing out of the pipeline is a lot
faster than if the copies are interleaved with control transfers. Dollars to
doughnuts this code was written on a RISC machine.

(Gawrsh. That sounded just like one of the big boys on comp.arch tawkin'. I
think I'll cross-post over there just to see if I get shot down in flames...)

-- 
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
Post: 22 S. Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
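
For anyone who wants to see the "adjacent copies" point concretely, here is a
rough sketch that puts the quoted routine next to the obvious loop. The names
copy_simple and copy_unrolled are made up for illustration and are not from the
original article. The count > 0 check is my own assumption: as quoted, the
routine copies eight elements when count is 0, so the sketch rules that case
out rather than guessing what was intended.

```c
#include <assert.h>

/* Obvious version: one loop test and branch for every element copied. */
void copy_simple(int *to, const int *from, int count)
{
    while (count-- > 0)
        *to++ = *from++;
}

/* Unrolled version in the style of the quoted send() (hypothetical name).
 * The switch jumps into the middle of the do-while to handle the
 * count % 8 leftover elements, then each further pass does eight
 * back-to-back copies with a single loop test and branch at the end. */
void copy_unrolled(int *to, const int *from, int count)
{
    int n;

    assert(count > 0);          /* assumption: count == 0 is not supported */
    n = (count + 7) / 8;        /* number of passes through the loop body */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Both routines copy the same elements. For count = 19 the unrolled one enters at
case 3, does 3 copies, then two full passes of 8, so it takes three loop
branches where the simple loop takes one per element. Whether that is actually
faster depends on the machine and the compiler.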