Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!usc!samsung!rex!ames!amdcad!light!bvs
From: bvs@light.uucp (Bakul Shah)
Newsgroups: comp.arch
Subject: Re: New instructions for RISCs (was Re: Byte ordering)
Keywords: bitblt
Message-ID: <1990Feb11.202919.2168@light.uucp>
Date: 11 Feb 90 20:29:16 GMT
References: <7345@pdn.paradyne.com> <168@zds-ux.UUCP> <7366@pdn.paradyne.com> <1990Feb10.154033.4271@mentor.com>
Reply-To: bvs@light.UUCP (Bakul Shah)
Organization: Bit Blocks, Inc.
Lines: 58

In article <1990Feb10.154033.4271@mentor.com> franka@mntgfx.UUCP (Frank A. Adrian) writes:
>In article <7366@pdn.paradyne.com> alan@oz.paradyne.com (Alan Lovejoy) writes:
>>BMERGE Rd, Rs, Rm; Rd = (Rd & ~Rm) | (Rs & Rm)
>
>Instead of your proposed SPLICE instruction, I'd recommend a
>BITEXT instruction which (using your notation, but where
>[x:y] means bit indexing) looks like:
>
>BITEXT Rd, Rs, Rdescr; Rd = Rs[Rdescr[11:6] : Rdescr[5:0]] >> Rdescr[5:0]
>
>I believe the AMD 29K had a similar pair of instructions.

BMERGE, BITEXT (bitfield extract), a bitfield extract with
sign-extension and a bitfield insert instn can all be useful, though
the 29K does not have any of these. It does have `extract', which
concatenates its two src operands into one 64 bit string, shifts it
left by some amount (less than 32, stored in a special register) and
sticks the *high* order bits in the dst reg. For example:

       A        B
    12345678 9abcdef0
         |||/////
         6789abcd

    C = ((A // B) << 20)[63:32]

where // is the concatenate operator.

This is quite a versatile instn. By making either A or B all zeroes
you can extract the top or bottom N bits. If A & B are the same
register, you have a rotate by N instn. By using successive registers
in a sequence of extracts you can move an arbitrarily long bitstring.
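
To make the uses listed above concrete, here is a small Python model of
the extract operation as it is described in this post: concatenate,
shift left by a count below 32, keep bits [63:32]. It is only a sketch
of that description, not a cycle-accurate or complete model of the
29K, and the helper names (rotate_left, top_bits,
low_bits_left_justified) are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def extract(a, b, fc):
    """Return bits [63:32] of ((a // b) << fc), with // meaning concatenation."""
    if not 0 <= fc < 32:
        raise ValueError("shift count must be in 0..31")
    wide = ((a & MASK32) << 32) | (b & MASK32)  # a // b as one 64 bit string
    return ((wide << fc) >> 32) & MASK32        # high word after the shift


def rotate_left(x, n):
    # Same value as both sources: bits shifted out of the top re-enter below.
    return extract(x, x, n)


def top_bits(x, n):
    # A = 0: the top n bits of x end up right-justified in the result.
    return extract(0, x, n)


def low_bits_left_justified(x, n):
    # B = 0 (hypothetical helper): the bottom n bits of x end up
    # left-justified, using a count of 32 - n, so n must be in 1..32.
    return extract(x, 0, 32 - n)


if __name__ == "__main__":
    a, b = 0x12345678, 0x9ABCDEF0
    for count in (12, 20):
        print(count, hex(extract(a, b, count)))
```

With the operand values from the example above, this model gives
0x6789abcd for a count of 20 and 0x456789ab for a count of 12.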
For example, to shift a 96 bit string left by 12 bits with successive
extracts, you do the following:

    mtsrim  FC, 12              ; funnel shift count = 12
    ; lr10 = 0, lr11,lr12,lr13 = the 96 bit string
    extract lr10, lr10, lr11
    extract lr11, lr11, lr12
    extract lr12, lr12, lr13
    ; lr10//lr11//lr12 = ((lr10//lr11//lr12//lr13) << 12)[127:32]

This last use is very handy for moving bitmaps around (large bitmaps
can be moved at about 3 cycles / 32 bits, assuming memory can sustain
single cycle access in page/static-column mode).

But extract is not ideal for bitfield operations (extracting /
inserting middle bits of a word will take about 4 instns). Though, I
do think it is possible to do
bitfield{extract,extract-with-sign-extend,insert} in a single cycle --
the 29K already does something similar for bytes.

>BTW, does anyone have an instruction format which uses the
>source register specifiers as an immediate small constant
>instead of a register specifier?

The 29K does.

-- Bakul Shah
..!{ames,sun,ucbvax,uunet}!amdcad!light!bvs
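
The three-extract sequence for the 96 bit string can be checked with a
short Python sketch. It assumes the extract semantics given in this
post (high 32 bits of the shifted 64 bit concatenation); the function
shift_words_left and the sample word values are hypothetical, chosen
only to exercise the idea.

```python
MASK32 = 0xFFFFFFFF


def extract(a, b, fc):
    """Bits [63:32] of ((a // b) << fc), for a shift count fc in 0..31."""
    if not 0 <= fc < 32:
        raise ValueError("shift count must be in 0..31")
    wide = ((a & MASK32) << 32) | (b & MASK32)
    return ((wide << fc) >> 32) & MASK32


def shift_words_left(words, fc):
    """Apply one extract per adjacent pair of 32 bit words.

    words is most-significant word first, like lr10, lr11, lr12, lr13.
    The result has one word fewer and holds the high-order bits of the
    whole string shifted left by fc.
    """
    return [extract(hi, lo, fc) for hi, lo in zip(words, words[1:])]


if __name__ == "__main__":
    # lr10 = 0, then three arbitrary sample words as the 96 bit string.
    regs = [0x00000000, 0x12345678, 0x9ABCDEF0, 0x0FEDCBA9]
    print([hex(w) for w in shift_words_left(regs, 12)])
```

Each output word depends only on its own input word and the next one,
which is why the assembly version can overwrite lr10, lr11 and lr12 in
place as long as it works from the most significant register down.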