| Article: | Wojciech Muła | 
|---|---|
| Added on: | 2014-10-01 | 
| Updated on: | 2018-03-11 | 
"Limited" means a value where at most one bit is set. I.e. values are zero and all powers of two. Specification:
Naive implementation:
uint32_t fill_word_naive(uint32_t x) { return x ? 0xffffffff : 0x00000000; }
GCC 4.8.2 produces following code:
89 c2 mov %eax,%edx f7 da neg %edx 21 c2 and %eax,%edx 39 c2 cmp %eax,%edx 0f 94 c0 sete %al 0f b6 c0 movzbl %al,%eax
Not bad, but this can be done much simpler. We know that a value is zero or have exactly one bit set — first we have to copy this bit to the highest position, and then populate the MSB using arithmetic shift right.
Copying a bit can be done using single operation — arithmetic negation:
| x | -x | 
|---|---|
| 00000000 | 00000000 | 
| 00000001 | ffffffff | 
| 00000002 | fffffffe | 
| 00000004 | fffffffc | 
| 00000008 | fffffff8 | 
| 00000010 | fffffff0 | 
| 00000020 | ffffffe0 | 
| 00000040 | ffffffc0 | 
| 00000080 | ffffff80 | 
| 00000100 | ffffff00 | 
| 00000200 | fffffe00 | 
| 00000400 | fffffc00 | 
| 00000800 | fffff800 | 
| 00001000 | fffff000 | 
| 00002000 | ffffe000 | 
| 00004000 | ffffc000 | 
| 00008000 | ffff8000 | 
| 00010000 | ffff0000 | 
| 00020000 | fffe0000 | 
| 00040000 | fffc0000 | 
| 00080000 | fff80000 | 
| 00100000 | fff00000 | 
| 00200000 | ffe00000 | 
| 00400000 | ffc00000 | 
| 00800000 | ff800000 | 
| 01000000 | ff000000 | 
| 02000000 | fe000000 | 
| 04000000 | fc000000 | 
| 08000000 | f8000000 | 
| 10000000 | f0000000 | 
| 20000000 | e0000000 | 
| 40000000 | c0000000 | 
| 80000000 | 80000000 | 
Now the procedure could be saved as:
uint32_t fill_word_naive(uint32_t x) { return ((int32_t)(-x) >> 31); }
The compilation result:
f7 d8 neg %eax c1 f8 1f sar $0x1f,%eax
Just two simple instructions.
2018-03-11: GCC 7.2 still compiles fill_word_naive the same way, but clang 6.0 produces the final, two-instruction sequence.