Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!sri-unix!sri-spam!ames!oliveb!pyramid!prls!mips!earl
From: earl@mips.UUCP
Newsgroups: comp.arch
Subject: Re: 32-bit CPUs ( NEC V70 ) and silly examples
Message-ID: <407@gumby.UUCP>
Date: Wed, 20-May-87 12:19:22 EDT
Article-I.D.: gumby.407
Posted: Wed May 20 12:19:22 1987
Date-Received: Fri, 22-May-87 00:45:59 EDT
References: <3810030@nucsrl.UUCP> <491@necis.UUCP> <3530@spool.WISC.EDU> <3962@cae780.TEK.COM>
Distribution: na
Lines: 25
Keywords: V60, V70, not so silly examples
Summary: but how fast is the instruction?

I agree with Scott Daniels and Ross Alexander that a->b->c and such are
definitely not silly examples. I write such constructs frequently. But
that does not necessarily mean it is a good idea to add an instruction
to implement them.

Perhaps someone with a data sheet can post the cycle count for these
instructions so we can compare. An R2000 will do a load of or a store
to a->b->c in 2 - 4 cycles, depending on how well the load delays are
scheduled (we typically schedule 75% of these, so say 2.5 cycles).
a->b->c->d takes 3 - 6 cycles (say 3.75). I'm assuming a is in a
register, which with the MIPS compiler is a fairly safe assumption.

The ability to schedule the load delays is an excellent reason NOT to
provide such an addressing mode. If you implement the mode, you'll
just find your microcode waiting all the time. If you generate separate
instructions and let the compiler schedule them, then most of the time
you won't wait at all.

Note that I'm assuming that hardware can't take the output of the
cache, do an add to get the new address, perhaps translate it, and feed
it back to the cache in a single cycle. If it took a single cycle, I'd
say the cycle time was artificially slow. The R2000 takes two cycles
to do this, so loads have a delay of one cycle before the result is
usable.
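
For anyone who wants to check the 2.5 and 3.75 figures, here is a rough sketch of the arithmetic in C. It assumes each load in the chain costs one issue cycle plus one stall cycle whenever its delay slot goes unfilled, and that the 75% scheduling rate applies independently to every load. The struct types and the names load_a_b_c and expected_cycles are hypothetical, made up for illustration; they are not anything from the MIPS compiler.

```c
#include <stddef.h>

/* Hypothetical record types, invented for illustration only. */
struct C { int d; };
struct B { struct C *c; };
struct A { struct B *b; };

/*
 * a->b->c written out as the two dependent loads a load/store machine
 * would issue. Each load's result is not usable until one instruction
 * later, so the compiler tries to place independent work in that slot.
 */
struct C *load_a_b_c(const struct A *a)
{
    struct B *t = a->b;   /* load 1: t = a->b */
    return t->c;          /* load 2: result = t->c, depends on load 1 */
}

/*
 * Estimated cycles for a chain of n_loads dependent loads.
 * Assumption: one issue cycle per load, plus one stall cycle for each
 * delay slot the compiler fails to fill (probability 1 - fill_rate).
 */
double expected_cycles(int n_loads, double fill_rate)
{
    return n_loads * (1.0 + (1.0 - fill_rate));
}
```

Under these assumptions, expected_cycles(2, 0.75) gives 2.5 for a->b->c and expected_cycles(3, 0.75) gives 3.75 for a->b->c->d, with fill rates of 1.0 and 0.0 giving the best and worst cases (2 and 4, or 3 and 6).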