# Count Bits with Value 1

From Software Engineers Wiki

Write a function that counts number of bits with bit value 1.

## Contents |

## Answer #1

int count_bit_1(unsigned int n) { int count; for (count = 0; n != 0; ++count) n &= n - 1; return count; }

## Answer #2

Use pre-calculated table. Using table per each 4 bits.

int count_bit_1_by_array_16(unsigned int n) { static int s_bit_count_16[] = { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 }; int count = 0; while (n) { count += s_bit_count_16[n & 0x0f]; n >>= 4; } return count; }

Or we can use table for 8 bits.

static int s_bit_count_256[256]; void count_bit_1_by_array_256_init(void) { unsigned int i; for (i = 0; i < 256; ++i) s_bit_count_256[i] = count_bit_1(i); } int count_bit_1_by_array_256(unsigned int n) { int count = 0; while (n) { count += s_bit_count_256[n & 0xff]; n >>= 8; } return count; }

For 0xffffffff iterations with incrementing samples,

- count_bit_1() : 81.1 seconds
- count_bit_1_by_array_16() : 28.2 seconds
- count_bit_1_by_array_256() : 14.1 seconds

It may benefit reducing branch, and make the code straight forward.

int count_bit_1_by_array_256(unsigned int n) { uint8_t *p8 = (uint8_t *) &n; int count; count = s_bit_count_256[*p8++]; count += s_bit_count_256[*p8++]; count += s_bit_count_256[*p8++]; count += s_bit_count_256[*p8]; return count; }

Under same condition, this took 7.7 seconds. GCC provides builtin function __builtin_popcount() for this purpose, but the function is not well optimized, and took about 30 seconds.

Latest x86 architecture provides POPCNT instruction for this purpose.

## Answer #3 : parallel counting

Parallel counting is pretty efficient algorithm without using a table.

int count_bit_1_parallel(unsigned int n) { n = n - ((n >> 1) & 0x55555555); n = (n & 0x33333333) + ((n >> 2) & 0x33333333); n = n + (n >> 4); n &= 0x0f0f0f0f; return (n * 0x01010101) >> 24; }