Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!lll-lcc!ptsfa!hoptoad!gnu
From: gnu@hoptoad.UUCP
Newsgroups: comp.arch
Subject: Re: String Processing Instruction -- AMD 29000 has *slow* byte access
Message-ID: <1945@hoptoad.uucp>
Date: Mon, 30-Mar-87 07:14:17 EST
Article-I.D.: hoptoad.1945
Posted: Mon Mar 30 07:14:17 1987
Date-Received: Tue, 31-Mar-87 05:59:10 EST
References: <15292@amdcad.UUCP> <1001@ames.UUCP> <15313@amdcad.UUCP>
Organization: Nebula Consultants in San Francisco
Lines: 39

In article <15313@amdcad.UUCP>, bcase@amdcad.UUCP (Brian Case) writes:
> In article <1001@ames.UUCP> jaw@ames.UUCP (James A. Woods) writes:
> > (a) significantly slow byte addressing to begin with (ala cray)?
> > if (a), then improving memory byte access speed in the architecture is a
> > more general solution with more payoff overall than the compare gate hack.
> > what is the risc chip cost for byte vs. word addressibility, anyway?

James Woods hit the nail on the head with this question.  From the
preliminary 29000 description, there are *no* byte instructions.
This means that *cp turns into about 4 instructions: load a word,
shift, mask, etc.  Worse, *cp= turns into many more, since you have to
load the target word, shift a mask to the byte of interest, mask out
the old value, shift the new value, "or" it in, and store the word back.

In other words, the designers of the 29000 did not think at all about
typical Unix code like:

	register char *p, *q;
	while (*p++ = *q++) ;

which takes on the order of 10 instructions PER BYTE.  (I'd be interested
in seeing the generated code for this program.)  The 680x0 does it in
2 instructions, and even the dumb PCC compiler generates them.

I helped to write some of the code to do rasterops on the Sun, and I
remember what kind of code it takes to do bit-aligned copying on an
instruction set that doesn't support bit fields.  Character strings
are to the 29000 what bit fields are to the 68010.
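
The load/shift/mask steps described above can be sketched in C. This is only an illustration, not 29000 code and not compiler output: it assumes a hypothetical memory that can be touched only in 32-bit words, with byte lane 0 taken to be the most significant byte. The names lane_shift, load_byte, store_byte and copy_string are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: memory is an array of 32-bit words; a "byte address" is
   word index * 4 + lane, and lane 0 is the most significant byte. */
static unsigned lane_shift(size_t byte_addr)
{
    return (unsigned)(3 - (byte_addr & 3)) * 8;
}

/* Reading a byte (*cp): load the word, shift, mask. */
static uint8_t load_byte(const uint32_t *mem, size_t byte_addr)
{
    uint32_t word = mem[byte_addr >> 2];                /* load a word  */
    return (uint8_t)((word >> lane_shift(byte_addr))    /* shift        */
                     & 0xFFu);                          /* mask         */
}

/* Writing a byte (*cp = value): read-modify-write of the whole word. */
static void store_byte(uint32_t *mem, size_t byte_addr, uint8_t value)
{
    unsigned shift = lane_shift(byte_addr);
    uint32_t word = mem[byte_addr >> 2];                /* load target word     */
    uint32_t mask = (uint32_t)0xFFu << shift;           /* shift mask into lane */
    word &= ~mask;                                      /* mask out old value   */
    word |= (uint32_t)value << shift;                   /* shift new value, OR  */
    mem[byte_addr >> 2] = word;                         /* store the word back  */
}

/* The while (*p++ = *q++) loop expressed with the helpers above.
   Returns the number of bytes copied, including the terminating NUL. */
static size_t copy_string(uint32_t *mem, size_t dst, size_t src)
{
    size_t n = 0;
    uint8_t c;

    assert(mem != NULL);
    do {
        c = load_byte(mem, src + n);
        store_byte(mem, dst + n, c);
        n++;
    } while (c != 0);
    return n;
}
```

Each copied byte costs one emulated load plus one emulated store, that is, two word reads, one word write, and the shifting and masking in between. The exact instruction count on real hardware depends on the instruction set and the compiler.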
First you see if the operands overlap, then...are they aligned, then...
are they wider than a word, then...wider than two words?, then...
You can make it fast, or you can make it simple...or maybe neither.

I think the lack of byte stores, in particular, and byte addressing,
in general, is the worst bug in the 29000.  Then again, it's better
than an 8088...
-- 
Copyright 1987 John Gilmore; you can redistribute only if your recipients can.
(This is an effort to bend Stargate to work with Usenet, not against it.)
{sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu	       gnu@ingres.berkeley.edu