Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!deimos!uxc!uxc.cso.uiuc.edu!uxg.cso.uiuc.edu!uxe.cso.uiuc.edu!mcdonald
From: mcdonald@uxe.cso.uiuc.edu
Newsgroups: comp.unix.questions
Subject: Re: bcopy
Message-ID: <47800018@uxe.cso.uiuc.edu>
Date: 12 Dec 88 14:52:00 GMT
References: <669@auspex.UUCP>
Lines: 23
Nf-ID: #R:auspex.UUCP:669:uxe.cso.uiuc.edu:47800018:000:1272
Nf-From: uxe.cso.uiuc.edu!mcdonald    Dec 12 08:52:00 1988


>don't let the twerps responsible for the memcpy/memmove debacle blind
>implementers. memcpy and memmove should be the same entry point;
>the test for overlapping regions is only a few instructions and just doesn't
>matter in any practical timing sense. If the bytecount is at all significant,
>initial overhead is irrelevant and if the bytecount is small, then the
>subroutine call overhead is probably 2-3 times more expensive than the
>check.
>	in any case, in a lot of hardware, overlapping doesn't matter;
>what matters is left to right or right to left.
>	does anyone know of any case where the above analysis fails?

memmove and memcpy should NOT be the same entry point. On most
computers they shouldn't HAVE entry points - they should be inline.
On sufficiently CISC CPUs they should be single instructions
if the arguments are constants. If not, they should still be very small.
It seems to me the idea is that memcpy should be the most efficient
possible way to copy SMALL things - maybe even things like structs
containing four bytes, in which case a 32 bit integer move instruction
could be generated (if the alignment is right). When ANSI C becomes
the standard, there will be standard benchmarks that cruelly penalize
compilers with slow memcpy and memmove.
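
To make both sides concrete, here is a rough sketch in C (assuming a
C99 compiler for uintptr_t). The names sketch_memmove, struct four and
copy_four are made up for illustration and are not from any library.
The first routine shows the "few instructions" overlap test that picks
left-to-right or right-to-left copying; the second is the kind of tiny
fixed-size memcpy that a compiler could reasonably turn into a single
32-bit move when alignment allows. Whether a given compiler actually
does so is an assumption, not something shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-at-a-time memmove, for illustration only. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* The overlap test: copying forward is safe unless dst starts
       inside the source region, i.e. s < d < s + n. */
    if ((uintptr_t)d <= (uintptr_t)s ||
        (uintptr_t)d >= (uintptr_t)s + n) {
        while (n--)
            *d++ = *s++;        /* left to right */
    } else {
        d += n;
        s += n;
        while (n--)
            *--d = *--s;        /* right to left */
    }
    return dst;
}

/* A four-byte struct, the "small thing" case. */
struct four {
    unsigned char b[4];
};

/* The size is a compile-time constant, so a compiler is free to
   expand this memcpy inline instead of calling a library routine. */
void copy_four(struct four *dst, const struct four *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Calling sketch_memmove(buf + 2, buf, 4) on a buffer holding "abcdefg"
takes the right-to-left branch and leaves "ababcdg"; a plain
left-to-right loop would have overwritten source bytes before reading
them.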