Small win over compiler

Author:	Wojciech Muła
Added on:	2014-09-30

There are some places where a low-level programmer can beat a compiler. Consider this simple code:

#include <stdint.h>

uint32_t bsr(uint32_t x) {
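    // For a 32-bit x, __builtin_clz(x) returns 31 - bsr(x); the result lies
    // in 0..31, so xor with 31 is the same as subtracting it from 31.
    // Note: __builtin_clz is undefined for x == 0.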
    return __builtin_clz(x) ^ 31;
}

uint32_t min1(uint32_t x) {
    if (x != 0) {
        return bsr(x) + 1;
    } else {
        return 1;
    }
}

The function min1 compiles to the following 32-bit x86 code (GCC 4.8 with flag -O3):

min1:
    movl    4(%esp), %edx
    movl    $1, %eax
    testl   %edx, %edx
    je  .L3
    bsrl    %edx, %eax
    addl    $1, %eax
.L3:
    rep ret

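The conditional jump is undesirable, because a mispredicted branch is costly. Setting the lowest bit never changes the position of the highest set bit of a non-zero x, and it maps x = 0 to 1, for which bsr returns 0. So the function can be rewritten as: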

uint32_t min2(uint32_t x) {
    return bsr(x | 1) + 1;
}

The result is this nice branchless code:

min2:
    movl    4(%esp), %eax
    orl $1, %eax
    bsrl    %eax, %eax
    addl    $1, %eax
    ret
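
The two versions can be compared with a small test. This is only a sketch, not part of the original experiment: check_equivalence is a hypothetical helper, and the code assumes GCC or Clang (for __builtin_clz) with a 32-bit unsigned int. It checks zero, every power of two, and the values around them.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit; undefined for x == 0. */
static uint32_t bsr(uint32_t x) {
    return (uint32_t)__builtin_clz(x) ^ 31;
}

/* Branching version from the article. */
static uint32_t min1(uint32_t x) {
    if (x != 0) {
        return bsr(x) + 1;
    } else {
        return 1;
    }
}

/* Branchless version from the article. */
static uint32_t min2(uint32_t x) {
    return bsr(x | 1) + 1;
}

/* Hypothetical helper: asserts that min1 and min2 agree on zero,
   on each power of two p, and on p - 1 and p | 1. */
static void check_equivalence(void) {
    assert(min1(0) == min2(0));
    for (int bit = 0; bit < 32; bit++) {
        uint32_t p = UINT32_C(1) << bit;
        assert(min1(p) == min2(p));
        assert(min1(p - 1) == min2(p - 1));
        assert(min1(p | 1) == min2(p | 1));
    }
}
```

Calling check_equivalence() from main should pass silently. A loop over all 2^32 inputs is also feasible if an exhaustive check is wanted.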

Conclusion: it's worth checking the compiler output. Sometimes.