Path: utzoo!utgpu!water!watmath!clyde!att-cb!att-ih!gargoyle!oddjob!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: strcpy
Keywords: vax-specific
Message-ID: <10903@mimsy.UUCP>
Date: 2 Apr 88 20:52:33 GMT
References: <7712@apple.Apple.Com> <7485@brl-smoke.ARPA> <10731@mimsy.UUCP> <848@cresswell.quintus.UUCP>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 75

>In article <10895@mimsy.UUCP> I mentioned that
>>... The 4.3BSD Vax strcpy() uses the Vax locc and movc3 instructions.

In article <848@cresswell.quintus.UUCP> ok@quintus.UUCP
(Richard A. O'Keefe) writes:
>This may not be such a wonderful idea: according to the DEC manuals,
>some VAX models do not implement the locc instruction.  (The machine
>will trap to some sort of library which emulates the missing instructions.)

This is true.  In particular, the Microvax I and II chips do not.
(Indeed, the uVax I does not even implement movc3 in hardware.)  The
II traps to kernel code that emulates locc.  (And people wonder why
strcpy() and index() are slow there!  I argued for a `getcputype'
syscall just for library optimisation, but no one has done it.)

>Getting this right for strings longer than 2^16-a few characters must be
>a nightmare: both locc and movc3 have a 16-bit length operand.  (This
>has never made sense to me.)

(Since VMS string descriptor lengths are Words rather than Longwords,
obviously no one would ever want strings longer than that.  Right.)

Actually, it is not that bad; in particular, movc3 leaves registers
r1 and r3 pointing to the `next' string, so that you wind up with
something like this:

        # strcpy(dst, src)
                ...
        loop:
                /* src in r1, dst in r3 */
                locc    $0,$65535,src   # find the \0 in src
                bneq    last_block      # if we found it, finish up
                movc3   $65535,src,dst  # otherwise move 64K
                brb     loop            # and keep going
        last_block:
                /* convert to a count and move <65535 bytes */

The code for bcopy/memcpy/memmove that handles overlapping `backwards'
moves, however, is perhaps best described as `amusing':

                /* length in r6, src in r1, dst in r3 */
                addl2   r6,r1           # jump to end of block
                addl2   r6,r3
                movzwl  $65535,r0       # get a handy 64K
                brb     5f
        4:
                subl2   r0,r6           # count 64K moved
        /* here begins the silliness: note how r1 and r3 need adjustment now */
                subl2   r0,r1           # ... from 64K behind where we were
                subl2   r0,r3
                movc3   r0,(r1),(r3)    # the VAX does this back to front
                movzwl  $65535,r0       # but we still have to fix the pointers
        /* ... and again! */
                subl2   r0,r1           # afterward
                subl2   r0,r3
        5:
                cmpl    r6,r0           # 64K?
                bgtr    4b              # more
                subl2   r6,r1           # 64K or less;
                subl2   r6,r3           # adjust the pointers
                movc3   r6,(r1),(r3)    # and move
                movl    4(ap),r0        # always return dst
                ret

In other words, even though the microcode decides to move the string
`back to front' (high addresses to low addresses), and therefore sets
the registers to count down from the top, it very carefully adjusts
them afterward so that they point to the high addresses---exactly what
we do NOT want.  (I suspect the high bits of one of the counting
registers are used to flag the direction, which would give another
reason why the lengths are limited.  Too bad they are not limited to
30 bits, which is as much as you can address in one segment [no, not
iNTEL segments].)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu     Path: uunet!mimsy!chris
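
For readers without a VAX handy, the 64K-at-a-time strcpy loop above
can be sketched in portable C.  This is only an illustration, not the
4.3BSD source: the names CHUNK, scan_nul and chunked_strcpy are made
up for this example, scan_nul stands in for locc, and memcpy stands
in for movc3.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Largest length a 16-bit operand can express (hypothetical name). */
#define CHUNK 65535u

/*
 * Hypothetical stand-in for locc: look for a NUL in at most n bytes.
 * Returns a pointer to the NUL, or NULL if none was seen.
 */
static const char *scan_nul(const char *s, size_t n)
{
    while (n-- > 0) {
        if (*s == '\0')
            return s;
        s++;
    }
    return NULL;
}

/* Copy src to dst in pieces no longer than CHUNK bytes. */
char *chunked_strcpy(char *dst, const char *src)
{
    char *d = dst;
    const char *nul;

    assert(dst != NULL && src != NULL);

    /* No NUL in the next CHUNK bytes: move a full chunk, keep going. */
    while ((nul = scan_nul(src, CHUNK)) == NULL) {
        memcpy(d, src, CHUNK);
        d += CHUNK;
        src += CHUNK;
    }

    /* Last block: convert the NUL's position to a count, NUL included. */
    memcpy(d, src, (size_t)(nul - src) + 1);
    return dst;
}
```

The explicit pointer bumps after each full chunk do by hand what
movc3 does for free by leaving r1 and r3 pointing just past the bytes
it moved.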