Article: | Wojciech Muła |
---|---|
Added on: | 2014-10-01 |
Updated on: | 2018-03-11 |
"Limited" means a value where at most one bit is set. I.e. values are zero and all powers of two. Specification:
Naive implementation:
uint32_t fill_word_naive(uint32_t x) { return x ? 0xffffffff : 0x00000000; }
GCC 4.8.2 produces following code:
89 c2 mov %eax,%edx f7 da neg %edx 21 c2 and %eax,%edx 39 c2 cmp %eax,%edx 0f 94 c0 sete %al 0f b6 c0 movzbl %al,%eax
Not bad, but this can be done much simplier. We know that a value is zero or have exactly one bit set — first we have to copy this bit to the highest position, and then populate the MSB using arithmetic shift right.
Copying a bit can be done using single operation — arithmetic negation:
x | -x |
---|---|
00000000 | 00000000 |
00000001 | ffffffff |
00000002 | fffffffe |
00000004 | fffffffc |
00000008 | fffffff8 |
00000010 | fffffff0 |
00000020 | ffffffe0 |
00000040 | ffffffc0 |
00000080 | ffffff80 |
00000100 | ffffff00 |
00000200 | fffffe00 |
00000400 | fffffc00 |
00000800 | fffff800 |
00001000 | fffff000 |
00002000 | ffffe000 |
00004000 | ffffc000 |
00008000 | ffff8000 |
00010000 | ffff0000 |
00020000 | fffe0000 |
00040000 | fffc0000 |
00080000 | fff80000 |
00100000 | fff00000 |
00200000 | ffe00000 |
00400000 | ffc00000 |
00800000 | ff800000 |
01000000 | ff000000 |
02000000 | fe000000 |
04000000 | fc000000 |
08000000 | f8000000 |
10000000 | f0000000 |
20000000 | e0000000 |
40000000 | c0000000 |
80000000 | 80000000 |
Now the procedure could be saved as:
uint32_t fill_word_naive(uint32_t x) { return ((int32_t)(-x) >> 31); }
The compilation result:
f7 d8 neg %eax c1 f8 1f sar $0x1f,%eax
Just two simple instructions.
2018-03-11: GCC 7.2 still compiles fill_word_naive the same way, but clang 6.0 produces the final, two-instruction sequence.