Using PEXT to convert from binary ASCII to number

Author:	Wojciech Muła
Added on:	2014-10-06

Suppose we have a string containing ASCII zeros and ones, for example "11100100", and we want to interpret this text as a binary number, in this case the value 0xe4.

The new PEXT instruction from BMI2 (Bit Manipulation Instructions 2) is perfect for this task. PEXT (parallel bits extract) gathers the source bits selected by a mask and packs them contiguously into the lowest bits of the result; the remaining upper bits are zero. For example (32-bit arguments):

         MSB                               LSB
         ┌────────┬────────┬────────┬────────┐
src    = │00101010│11101101│00011011│11110000│
         └────────┴────────┴────────┴────────┘
         ┌────────┬────────┬────────┬────────┐
mask   = │10000011│10000001│11110000│00000111│
         └────────┴────────┴────────┴────────┘

         ┌────────┬────────┬────────┬────────┐
result = │00000000│00000000│00000101│10001000│
         └────────┴────────┴────────┴────────┘
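
To make the semantics concrete, here is a minimal portable sketch of a 32-bit parallel extract in plain C. The name pext32_soft is hypothetical and the loop only models what the instruction computes; it is not how the hardware works internally.

```c
#include <stdint.h>

/* Software model of a 32-bit PEXT (hypothetical helper, not the
   hardware instruction). It walks the mask from its lowest set bit
   upward and packs the selected src bits into consecutive low bits
   of the result. */
static uint32_t pext32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result  = 0;
    uint32_t out_bit = 1;   /* next free bit position in result */

    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u);  /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;   /* clear the mask bit just processed */
    }
    return result;
}
```

With the arguments from the diagram, pext32_soft(0x2AED1BF0, 0x8381F007) returns 0x588, which is the bit pattern shown in the result row.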

This is exactly what the conversion needs: since the ASCII code of '0' is 0x30 and that of '1' is 0x31, we only need to extract the lowest bit of each byte (assuming, of course, that the input is known to be valid).
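
If the input cannot be trusted, validity can be checked with the same word-at-a-time approach. The following is only a sketch, and the helper name is_binary_ascii8 is hypothetical. Clearing the lowest bit of every byte maps both 0x30 and 0x31 to 0x30, and no other byte value gives that result.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when all 8 bytes at s are ASCII '0'
   or '1', and 0 otherwise. The caller must provide at least 8
   readable bytes. */
static int is_binary_ascii8(const char *s)
{
    uint64_t word = 0;

    /* Collect the 8 characters into one 64-bit word. */
    for (int i = 0; i < 8; i++)
        word = (word << 8) | (unsigned char)s[i];

    /* After masking out bit 0 of each byte, every byte must be 0x30. */
    return (word & UINT64_C(0xFEFEFEFEFEFEFEFE))
           == UINT64_C(0x3030303030303030);
}
```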

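The example string "11100100", read as a 64-bit word with the first character in the most significant byte, is 0x3131313030313030. (On a little-endian machine such as x86, a plain 64-bit load puts the first character in the least significant byte, so the loaded word has to be byte-swapped first.)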

src  = 0x3131313030313030 // 64-bit word
mask = 0x0101010101010101 // 64-bit word

result = pext(src, mask)

The value of result is 0xe4 = 0b11100100.
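
The whole conversion might look like the sketch below. It is not the code from the repository. The function name parse_binary8 is hypothetical, the input is assumed to be exactly 8 valid characters, and the word is assembled byte by byte so that the first character lands in the most significant byte regardless of the machine's endianness. The intrinsic _pext_u64 from immintrin.h is used only when the compiler targets x86-64 with BMI2 enabled (for example with -mbmi2 on GCC or Clang); otherwise a plain loop produces the same result.

```c
#include <stdint.h>
#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>   /* _pext_u64 */
#endif

/* Hypothetical helper: converts exactly 8 ASCII '0'/'1' characters
   to a number. The input is NOT validated. */
static uint8_t parse_binary8(const char *s)
{
    const uint64_t mask = UINT64_C(0x0101010101010101);
    uint64_t word = 0;

    /* First character goes to the most significant byte, so the
       leftmost digit ends up as the highest extracted bit. */
    for (int i = 0; i < 8; i++)
        word = (word << 8) | (unsigned char)s[i];

#if defined(__BMI2__) && defined(__x86_64__)
    /* One instruction gathers bit 0 of each byte. */
    return (uint8_t)_pext_u64(word, mask);
#else
    /* Portable fallback: take bit 0 of byte i as bit i of the result. */
    (void)mask;
    uint64_t result = 0;
    for (int i = 0; i < 8; i++)
        result |= ((word >> (8 * i)) & 1u) << i;
    return (uint8_t)result;
#endif
}
```

Calling parse_binary8("11100100") returns 0xe4. On x86 an optimizing compiler is likely to turn the assembly loop into a single load followed by a byte swap, although that depends on the compiler and its settings.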

A working example is available on GitHub (see parse_string.c).