Path: utzoo!attcan!uunet!lll-winken!ncis!helios.ee.lbl.gov!pasteur!ucbvax!agate!bionet!csd4.milw.wisc.edu!mailrus!ncar!gatech!hubcap!sestrich From: sestrich@Alliant.COM (Joe Sestrich) Newsgroups: comp.parallel Subject: Re: clever programming tricks. Message-ID: <4149@hubcap.UUCP> Date: 19 Jan 89 21:32:56 GMT Sender: fpst@hubcap.UUCP Lines: 59 Approved: parallel@hubcap.clemson.edu We've seen two algorithms for doing a population count: Use a neat trick to clear one bit at a time: A and (A-1) clears the bottom bit of A. This takes at least 4 instructions per loop-- 0 -> C LP: C + 1 -> C A - 1 -> T1 A and T1 -> T2 Jump to LP if T2 is not zero The number of cycles is dependant on the data, however, so assuming a uniform, independant probability of each bit being set, the time is [SUM i=1,32 (33-i)/2**i] * 4 = 31*4 + 1 (init cycle) = 125 The next idea is to divide the bits in half, and add them up until there is only one sum. This looks as follows: A AND 55555555 -> T1 A AND AAAAAAAA -> T2 Shift T2 right one bit -> T2 T1 + T2 -> A A AND 33333333 -> T1 A AND CCCCCCCC -> T2 Shift T2 right one bit -> T2 T1 + T2 -> A There are a total of FIVE similar copies of this for the complete algorithm (The original poster said 4). This is a common way of implementing the function in hardware, since the shifts and ands are only wires, and half of the adders need only be half adders. the total time is 5 * 4 = 20 units. Pretty good improvement! A third method is to group the bits, and use a table lookup to do the counting: 0 -> C LP: A AND M -> I T(I) + C -> C Shift A right by N bits -> A Jump to LP if A is not zero Where M is a mask with N bits set, and T is an array initialized as follows: 0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4 .... that is T(I) = the number of bits in I. If we pick n = 11, T has 2048 elements, and we have to go around the loop, at most 3 times, giving a time of 3*4 + 1 = 13. In fact, by unrolling the loop, we can get down to a time of 8. If indexing isn't free, the times would be 16 or 11.... Why the hell did I waste all this time on this silliness? It must be fun..... -- -- Joe Sestrich UUCP: ...!linus!alliant!sestrich Phone: (617) 486-1241 Mail: Alliant Computer Systems, One Monarch Drive, Littleton, MA 01460