Path: utzoo!utgpu!watmath!iuvax!purdue!haven!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: faster bcopy using duffs device (source)
Keywords: loop unrolling, optimize, hacks
Message-ID: <19473@mimsy.UUCP>
Date: 8 Sep 89 04:26:37 GMT
References: <5180@portia.Stanford.EDU>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 15

In article <5180@portia.Stanford.EDU> stergios@Jessica.stanford.edu
(stergios marinopoulos) writes:
>I wanted a faster bcopy, so I used duffs device as a basis for it.

bcopy() should be written in assembly (on most processors), put in a
library, and forgotten about, because, for instance, a dbra loop beats
a Duff loop on a 68010, every time.  (And on a 68000, a loop using
moveml is best.  68020s have an I-cache, so a hand-coded `Duffish' loop
is a good bet.  Some VAXen have a special instruction which does a good
job.  [movc3 is done in software on the 610.]  `rep movsb' [or is there
a `movsw'?] is best on an 80x86.  LDIR is best on a Z80.  A Duff-style
loop is probably best on a PDP-11.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
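
For reference, the `Duffish' loop discussed above can be sketched in
portable C roughly as follows.  This is a minimal illustration, not the
code from the quoted article: the name duff_copy, the byte-at-a-time
copy, and the unroll factor of 8 are all assumptions, and unlike a real
bcopy() it makes no attempt to handle overlapping regions.

```c
#include <stddef.h>

/*
 * duff_copy: hypothetical name, chosen for illustration only.
 * Copies len bytes from src to dst (bcopy argument order) using
 * Duff's device, unrolled 8 times.  The regions must not overlap.
 */
void duff_copy(const void *src, void *dst, size_t len)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t n;

    if (len == 0)
        return;                 /* the do-while below copies at least once */

    n = (len + 7) / 8;          /* passes through the loop body, rounded up */

    /* Jump into the middle of the unrolled loop to handle len % 8 first. */
    switch (len % 8) {
    case 0: do { *d++ = *s++;
    case 7:      *d++ = *s++;
    case 6:      *d++ = *s++;
    case 5:      *d++ = *s++;
    case 4:      *d++ = *s++;
    case 3:      *d++ = *s++;
    case 2:      *d++ = *s++;
    case 1:      *d++ = *s++;
            } while (--n > 0);
    }
}
```

The switch enters the loop body partway through, so the first pass
copies the len % 8 leftover bytes (or a full 8 when the remainder is
zero) and every later pass copies 8.  Whether this beats a plain loop
or a library routine depends on the processor, which is the point made
above.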