Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!ames!jaw
From: jaw@ames.UUCP
Newsgroups: comp.arch,comp.lang.c
Subject: Re: String Processing Instruction
Message-ID: <1001@ames.UUCP>
Date: Thu, 26-Mar-87 21:06:22 EST
Article-I.D.: ames.1001
Posted: Thu Mar 26 21:06:22 1987
Date-Received: Sat, 28-Mar-87 07:17:26 EST
References: <15292@amdcad.UUCP>
Organization: NASA Ames Research Center, Moffett Field, CA
Lines: 32
Xref: utgpu comp.arch:671 comp.lang.c:1345

> [....] Since
> most C programs (especially utilities and other systems programs) do a lot of
> string processing, this one instruction is really worth the small
> implementation cost. It often improves run times by 15% to 20% (just goes to
> show that the impact of processing C language strings has been long-ignored).

just curious: which unix utilities use str(cpy|cmp|len) in their inner loops?
certainly, 'vn' comes to mind as devoting much cpu time to these functions.

is the 15-20% claimed due at all to either

(a) significantly slow byte addressing to begin with (a la cray)?
(b) in-line compilation of the string(3) stuff into the application?

if (a), then improving memory byte access speed in the architecture is a
more general solution, with more payoff overall, than the compare gate hack.
what is the risc chip cost for byte vs. word addressability, anyway?

if (b), then maybe function call speed is the culprit rather than the dearth
of a specialized instruction.

at any rate, for cray unix, buffer copy ops in the kernel were vastly
improved when re-written for words instead of bytes, even more so when
vectorized (the only place in the kernel with vectorization, i think).

of course, table lookup using only 2^16 locations would be a joke
software solution for super-intensive null-char-in-16-bit-smallword compare
code. drastic, but it saves a test the amd chip appears worried about.

personally, i'm a fan of branch folding ...

ames!jaw
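
For readers who want to see what the 2^16-entry table lookup might look like, here is a minimal sketch in C. It is not taken from the post or from any real library: the names has_nul16, init_table, find_nul16 and has_nul32 are made up for illustration. find_nul16 scans a buffer of known length two bytes at a time and uses one table probe per 16-bit word in place of two byte tests. has_nul32 is a different, well-known software technique (a subtract-and-mask test for a zero byte in a 32-bit word), included only as a point of comparison with a hardware byte-compare instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: entry w is 1 if either byte of the 16-bit word w is zero. */
static unsigned char has_nul16[1u << 16];
static int table_ready = 0;

/* Fill all 2^16 entries once. The test is symmetric in the two bytes,
 * so the table gives the same answer on big- and little-endian machines. */
static void init_table(void)
{
    uint32_t w;
    for (w = 0; w < (1u << 16); w++)
        has_nul16[w] = (unsigned char)(((w & 0x00FFu) == 0) || ((w & 0xFF00u) == 0));
    table_ready = 1;
}

/* Return the index of the first zero byte in buf[0..n-1], or n if there is none.
 * Only bytes inside the buffer are read. */
size_t find_nul16(const unsigned char *buf, size_t n)
{
    size_t i = 0;

    assert(buf != NULL);
    if (!table_ready)
        init_table();

    /* One table probe per pair of bytes. */
    for (; i + 1 < n; i += 2) {
        uint16_t w;
        memcpy(&w, buf + i, sizeof w);   /* portable 2-byte load, no alignment assumed */
        if (has_nul16[w])
            return buf[i] == 0 ? i : i + 1;
    }

    /* Odd-length buffers leave one trailing byte to check. */
    if (i < n && buf[i] == 0)
        return i;
    return n;
}

/* Table-free alternative: nonzero if any of the four bytes of w is zero. */
int has_nul32(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}
```

As a usage example, find_nul16((const unsigned char *)"abc", 4) returns 3, the position of the terminating null. Whether a 64 KB table beats two plain byte tests depends heavily on the cache, which may be part of why the post calls it a joke solution.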