Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!nrl-cmf!cmcl2!brl-adm!umd5!uvaarpa!mcnc!decvax!decwrl!hplabs!hpda!hpcupt1!viggy
From: viggy@hpcupt1.HP.COM (Viggy Mokkarala)
Newsgroups: comp.arch
Subject: Re: Re: Auto-shifted registers (and ease of compiler writing)
Message-ID: <6310007@hpcupt1.HP.COM>
Date: 1 Mar 88 17:47:18 GMT
References: <1390@vaxb.calgary.UUCP>
Organization: Hewlett Packard, Cupertino
Lines: 42

radford@calgary.UUCP (Radford Neal) writes:
>In article <6310005@hpcupt1.HP.COM>, viggy@hpcupt1.HP.COM (Viggy Mokkarala) writes:
>> The HP Precision Architecture provides for these kinds of operations by its
>> Shift and Add instructions. There is a pre-shifter before one of the inputs to
>> the CPU. It allows for one of the operands to be pre-shifted by upto 3 bits
>> before an addition happens.

>Does this cost you anything (other than chip area), or does the shift
>overlap another operation? If it does overlap, is there any reason
>not to allow it for all relevant instructions (e.g. and and or)?
>Is there any reason to keep the unshifted add (given a shift of zero
>is possible)?

>    Radford Neal

Sorry about the previous failed posting.

The pre-shifter before one of the CPU inputs is the same shifter that is used
for the "indexed loads" of the HP Precision Arch. The index register (which
can be any of the general registers) may optionally be shifted left by 1, 2,
or 3 bits so that integer addressing to half words, words, or double words is
possible. Therefore, it turns out that the "shift" operation in the shift and
add instructions, which were included as primitives for integer
multiplication, comes for free.

Instructions such as Shift and And, or Shift and Or, are not frequently used
instructions, and weren't considered. There isn't a Shift Zero and Add
instruction (it is simply called the Add instruction, to keep the instruction
names simple I guess :-) ).

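To make the two uses of the pre-shifter concrete, here is a rough C model of a shift-and-add primitive. This is only a sketch: the function names (shift_add, times10, indexed_address) are made up for illustration and are not HP mnemonics, and the 32-bit register width is an assumption. It shows a multiply by a constant built from two shift-and-add steps, and the scaled-index address computation that an indexed load would perform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a shift-and-add primitive: one operand is
   pre-shifted left by 0..3 bits and then added to the other.
   A shift of 0 is just a plain add. */
static uint32_t shift_add(uint32_t x, unsigned shift, uint32_t y)
{
    assert(shift <= 3);        /* the pre-shifter handles at most 3 bits */
    return (x << shift) + y;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* x * 10 in two steps: 5x = (x << 2) + x, then 10x = (5x << 1) + 0. */
static uint32_t times10(uint32_t x)
{
    uint32_t t = shift_add(x, 2, x);
    return shift_add(t, 1, 0);
}

/* Effective address of element `index` when each element is
   (1 << log2_size) bytes: base + (index << log2_size). */
static uint32_t indexed_address(uint32_t base, uint32_t index,
                                unsigned log2_size)
{
    return shift_add(index, log2_size, base);
}
```

For example, times10(7) gives 70, and indexed_address(0x1000, 3, 2) gives 0x100C, the address of the fourth word in an array starting at 0x1000. Both go through the same shift-then-add path, which is the point of sharing the shifter.
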
In the HP 9000/840 (the first HPPA product, built from off-the-shelf TTL
parts), the pre-shift operation takes place at the beginning of the execute
phase (the pipe is 3 stages deep, and the execute phase is the second one).
The pre-shift operation takes 6-7 nsec and happens in parallel with the
immediate generation. All register operands pass through the pre-shifter,
but only some indexed loads and the shift and add instructions are shifted by
more than zero. In this implementation, it required about 8 or 9 packs extra
to do this.

Viggy Mokkarala, Hewlett Packard Co., Cupertino, CA. (hpda!viggy)