Path: utzoo!attcan!uunet!husc6!uwvax!oddjob!ncar!ames!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!garth!smryan
From: smryan@garth.UUCP (Steven Ryan)
Newsgroups: comp.arch
Subject: Re: getting rid of branches
Message-ID: <853@garth.UUCP>
Date: 1 Jul 88 21:01:12 GMT
References: <1941@pt.cs.cmu.edu> <3208@ubc-cs.UUCP> <1986@pt.cs.cmu.edu> <91odrecXKL1010YEOek@amdahl.uts.amdahl.com> <12258@mimsy.UUCP>
Reply-To: smryan@garth.UUCP (Steven Ryan)
Organization: INTERGRAPH (APD) -- Palo Alto, CA
Lines: 22

> How do you move old
>code (`dusty decks') onto a parallel processor?  One way is to slice up
>the program into independent pieces that can be combined again later.
> . . .
>you run two or three loops:
>
>        1                     2                     3
>
>  for i in [0..n-1)     for i in [0..n-1)     for i in [0..n-1)
>     A := compute          A := compute          A := compute
>     B[i] := fn1(A)        B[i] := fn2(A)        combine[i] :=
>                                                    test(A)
>  rof                   rof                   rof

On a 205, if compute, test, fn1, and fn2 are all vectorisable, this entire
construct can be hand-vectorised by something like

        compute'    -> promoted A'
        test(A')    -> bit vector
        fn1'(A')    -> B[i] if bit[i] set   (else discard and ignore faults)
        fn2'(A')    -> B[i] if bit[i] clear (else discard and ignore faults)

CFT is supposed to provide a vectorisable conditional expression. FTN200 code
to vectorise ifs may or may not be completed and released.