Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!chad
From: chad@csd4.csd.uwm.edu (D. Chadwick Gibbons)
Newsgroups: comp.lang.c
Subject: memcpy versus assignment
Message-ID: <1657@uwm.edu>
Date: 26 Dec 89 22:41:51 GMT
Sender: news@uwm.edu
Reply-To: chad@csd4.csd.uwm.edu (D. Chadwick Gibbons)
Distribution: na
Organization: University of Wisconsin-Milwaukee
Lines: 99

In several books I've seen that assignment of structures is usually
more efficient than using memcpy(), at least on most modern processors.
I did a few experiments to see if this is true. Using the following
short program, I attempted to extract the machine code produced on
different machines.

        struct bozo {
            int one;
            char two;
            long three;
        } foo, bar;

        main()
        {
            foo = bar;
            (void)memcpy((char *)&foo, (char *)&bar, sizeof(struct bozo));
        }

On an 8086 CPU, the compiler - MSC5.1 (yuck!) - produces the following
code for the assignment when full optimization is on:

        ; foo = bar
        lea     di, WORD PTR[bp-8]      ; foo
        lea     si, WORD PTR[bp-16]     ; bar
        push    ss
        pop     es
        movsw           ; the four movsw statements are more
        movsw           ; space/speed efficient than a
        movsw           ; mov cx,sizeof(foo)/2
        movsw           ; rep movsw combination....

On a VAX using gcc, the following code is produced:

        ; foo = bar;
        subl3   $76,fp,sp
        movab   -64(fp),r1
        movab   -76(fp),r0
        movl    $12,r2
        movblk

The VAX naturally produces the more efficient code, but I would imagine
the 8086 would do just as good a job with larger structures, so that a

        mov     cx, sizeof(struct bozo)/2
        rep     movsw

could be used under appropriate circumstances.

However, this is only half the question. Does the assignment win over
memcpy?

On the 8086, the following code is produced:

        ; (void)memcpy((char *)&foo, (char *)&bar, sizeof(struct bozo));
        lea     ax, WORD PTR[bp-16]     ; foo
        mov     WORD PTR[bp-18], ax
        mov     cx, 8
        lea     di, WORD PTR[bp-8]      ; foo
        lea     si, WORD PTR[bp-16]     ; bar
        mov     ax, ss
        shr     cx, 1
        rep     movsw
        adc     cx, cx
        rep     movsb

The compiler is smart enough to make memcpy an intrinsic function, so
as to avoid a costly call instruction. On the VAX, a call to memcpy (or
in this case bcopy(), which is the same thing) was produced, so I
wasn't able to analyze the code directly. However, using gcc on bcopy.c
produces the following code:

        .globl _bcopy
_bcopy:
        .word   0x0
        movl    4(fp),r4
        movl    8(fp),r3
        movl    12(fp),r2
        tstl    r2
        jeql    L1
        cmpl    r4,r3
        jeql    L1
L2:
        decl    r2
        tstl    r2
        jneq    L2
L4:
        movl    r3,r0
        addl2   $4,r3
        movl    r4,r1
        addl2   $4,r4
        movl    (r1),(r0)
        decl    r2
        tstl    r2
        jneq    L4
        ret

That seems like quite a bit compared to the assignment. However, in
almost all the C code I have seen, comments state something along the
lines of "/* use memcpy for structures larger than int */", which seems
to go against the results shown above.

In _general_, what is the rule for copying one large structure to
another? memcpy vs. assignment? Which is generally better?
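
For anyone who wants to repeat the comparison on another compiler, here
is a minimal sketch that puts each kind of copy in its own function, so
the two show up separately in an assembly listing (for example, the
output of gcc -S). The function names are my own and hypothetical, not
part of any library, and the sketch assumes a compiler that accepts
prototypes. It only checks that both methods copy the same values; it
says nothing about which one is faster.

```c
#include <assert.h>
#include <string.h>

/* Same members as the struct in the test program above. */
struct bozo {
    int  one;
    char two;
    long three;
};

/* Copy by plain structure assignment; the compiler picks the instructions. */
void copy_by_assignment(struct bozo *dst, const struct bozo *src)
{
    *dst = *src;
}

/* Copy by memcpy(); this may be a library call or be expanded inline. */
void copy_by_memcpy(struct bozo *dst, const struct bozo *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Compare member by member; padding bytes are deliberately ignored. */
int same_members(const struct bozo *a, const struct bozo *b)
{
    return a->one == b->one && a->two == b->two && a->three == b->three;
}

/* Returns 1 if both copy methods reproduce the source structure. */
int copies_agree(void)
{
    struct bozo src = { 1, 'x', 123456L };
    struct bozo by_assign = { 0, 0, 0L };
    struct bozo by_memcpy = { 0, 0, 0L };

    copy_by_assignment(&by_assign, &src);
    copy_by_memcpy(&by_memcpy, &src);

    return same_members(&by_assign, &src) && same_members(&by_memcpy, &src);
}
```

A main() that simply calls copies_agree() is enough to run it. The
interesting part is the listing for copy_by_assignment() and
copy_by_memcpy(), which can then be compared side by side at different
optimization levels.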