Mask for zero/non-zero bytes

Author:Wojciech Muła
Added on:2014-03-09

The description of Determine if a word has a zero byte from "Bit Twiddling Hacks" says about haszero(v): "the result is the high bits set where the bytes in v were zero".

Unfortunately this is not true. The high bits are also set for ones followed zeros, i.e. haszero(0xff010100) = 0x00808080. Of course the result is still valid (non-zero if there were any zero byte), but if we want to iterate over all zeros or find the last zero index, this could be a problem.

It's possible to create an exact mask:

uint32_t nonzeromask(const uint32_t v) {
        // MSB are set if any of 7 lowest bits are set
        const uint32_t nonzero_7bit = (v & 0x7f7f7f7f) + 0x7f7f7f7f;

        return (v | nonzero_7bit) & 0x80808080;
}

uint32_t zeromask(const uint32_t v) {
         // negate MSBs
         return (nonzeromask(v)) ^ 0x80808080;
}

Function nonzeromask requires four simple instructions, and zeromask one additional xor.