Xref: utzoo comp.lang.c:12219 comp.arch:6203
Path: utzoo!utgpu!water!watmath!clyde!att!cbnews!lvc
From: lvc@cbnews.ATT.COM (Lawrence V. Cipriani)
Newsgroups: comp.lang.c,comp.arch
Subject: Re: Explanation, please!
Message-ID: <1002@cbnews.ATT.COM>
Date: 30 Aug 88 12:56:52 GMT
References: <653@paris.ICS.UCI.EDU> <2877@ttrdc.UUCP>
Reply-To: lvc@cbnews.ATT.COM (Lawrence V. Cipriani)
Organization: AT&T Bell Laboratories, Columbus
Lines: 16

In article chuck@amdahl.uts.amdahl.com (Charles Simmons) writes:
[discussion of Duff copy deleted]
>I then added a piece to the program to use 'memcpy'. The results?
>Duff beats a simple loop by 10%. 'memcpy' is 9 times faster than
>Duff. So why do people spend so much time avoiding standard subroutines?

Sometimes the standard subroutines are implemented horribly. I was
horrified when I saw that the machine-dependent version of memcpy on
the AT&T 3Bs is nothing but a byte-by-byte transfer written in assembly
language. It is tricky, but doable, to speed this up by roughly a
factor of sizeof(long). In fact, it is already done in the 3B
implementation of the UNIX(tm) operating system, in the copyin (?)
routine. Why wasn't it done in memcpy too? Sigh.

-- 
Larry Cipriani, AT&T Network Systems, Columbus OH,
cbnews!lvc lvc@cbnews.ATT.COM
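
The "factor of sizeof(long)" idea above (move one long per iteration
instead of one byte) can be sketched in C roughly as follows. This is
only an illustration, not the 3B memcpy or copyin code, which the post
does not show. The name word_copy is hypothetical. The sketch assumes
non-overlapping buffers, and it takes the word path only when source
and destination share the same alignment offset, which is part of what
makes the real thing tricky. Accessing byte buffers through long
pointers is also outside what ISO C guarantees under its aliasing
rules; library versions are typically written in assembly or built
with compiler-specific support for that reason.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * word_copy: hypothetical word-at-a-time copy (illustrative only).
 * Assumes the buffers do not overlap, as memcpy does.
 */
void *word_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* The word loop is usable only if both pointers can be
     * aligned at the same time, i.e. they share an offset. */
    if ((uintptr_t)d % sizeof(long) == (uintptr_t)s % sizeof(long)) {
        /* Copy leading bytes until the pointers are long-aligned. */
        while (n > 0 && (uintptr_t)d % sizeof(long) != 0) {
            *d++ = *s++;
            n--;
        }

        /* Copy sizeof(long) bytes per iteration.
         * Caveat: this cast assumes the compiler tolerates
         * accessing the buffers as long (see the note above). */
        {
            long *dw = (long *)d;
            const long *sw = (const long *)s;

            while (n >= sizeof(long)) {
                *dw++ = *sw++;
                n -= sizeof(long);
            }
            d = (unsigned char *)dw;
            s = (const unsigned char *)sw;
        }
    }

    /* Copy the tail, or everything if the offsets differ. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

When the two offsets differ, this version simply falls back to the
byte loop; handling that case at word speed requires shifting and
merging partial words, which is machine-dependent work.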