Conditionally fill word (for limited set of input values)

Article:	Wojciech Muła
Added on:	2014-10-01
Updated on:	2018-03-11

"Limited" means a value where at most one bit is set. I.e. values are zero and all powers of two. Specification:

if x is zero then result is also zero,
if x is power of two result is word full of zeros.

Naive implementation:

uint32_t fill_word_naive(uint32_t x) {

    return x ? 0xffffffff : 0x00000000;
}

GCC 4.8.2 produces following code:

89 c2        mov    %eax,%edx
f7 da        neg    %edx
21 c2        and    %eax,%edx
39 c2        cmp    %eax,%edx
0f 94 c0     sete   %al
0f b6 c0     movzbl %al,%eax

Not bad, but this can be done much simplier. We know that a value is zero or have exactly one bit set — first we have to copy this bit to the highest position, and then populate the MSB using arithmetic shift right.

Copying a bit can be done using single operation — arithmetic negation:

x	-x
`00000000`	`00000000`
`00000001`	`ffffffff`
`00000002`	`fffffffe`
`00000004`	`fffffffc`
`00000008`	`fffffff8`
`00000010`	`fffffff0`
`00000020`	`ffffffe0`
`00000040`	`ffffffc0`
`00000080`	`ffffff80`
`00000100`	`ffffff00`
`00000200`	`fffffe00`
`00000400`	`fffffc00`
`00000800`	`fffff800`
`00001000`	`fffff000`
`00002000`	`ffffe000`
`00004000`	`ffffc000`
`00008000`	`ffff8000`
`00010000`	`ffff0000`
`00020000`	`fffe0000`
`00040000`	`fffc0000`
`00080000`	`fff80000`
`00100000`	`fff00000`
`00200000`	`ffe00000`
`00400000`	`ffc00000`
`00800000`	`ff800000`
`01000000`	`ff000000`
`02000000`	`fe000000`
`04000000`	`fc000000`
`08000000`	`f8000000`
`10000000`	`f0000000`
`20000000`	`e0000000`
`40000000`	`c0000000`
`80000000`	`80000000`

Now the procedure could be saved as:

uint32_t fill_word_naive(uint32_t x) {

    return ((int32_t)(-x) >> 31);
}

The compilation result:

f7 d8        neg    %eax
c1 f8 1f     sar    $0x1f,%eax

Just two simple instructions.

2018-03-11: GCC 7.2 still compiles fill_word_naive the same way, but clang 6.0 produces the final, two-instruction sequence.