Using SSE to convert from hexadecimal ASCII to number

Author: Wojciech Muła
Added on: 2014-10-22

Introduction

The SSE procedure converts 16- or 32-digit inputs, producing 8- or 16-byte results.

To get the bytes of the result in the correct order, the input characters have to be reversed. In SSSE3 this can be done with pshufb, but in earlier versions of SSE it is quite hard. When byte shuffling is not available, the reversal can instead be done on the result word using the bswap instruction.

Converting from ASCII to nibbles

Converting from ASCII to nibbles is the inverse of the algorithm described in another text. The code assumes valid input consisting of the digits 0-9 and the lowercase letters a-f:

uint8_t ASCII2nibble(uint8_t ch) {

    const uint8_t result      = ch - '0';
    const uint8_t correction  = (ch >= 'a') ? 'a' - 10 - '0' : 0;

    return result - correction;
}
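As a point of comparison for the SSE code, a plain scalar parser can be built on this helper. The sketch below is only an illustration: the names ascii_to_nibble and parse_hex_scalar are hypothetical, and the input is assumed to contain at most 16 valid lowercase hex digits.

```c
#include <stddef.h>
#include <stdint.h>

/* Valid only for '0'..'9' and lowercase 'a'..'f' (assumption: no validation). */
static uint8_t ascii_to_nibble(uint8_t ch) {
    const uint8_t result     = ch - '0';
    /* >= so that 'a' itself is corrected and maps to 10 */
    const uint8_t correction = (ch >= 'a') ? 'a' - 10 - '0' : 0;
    return result - correction;
}

/* Hypothetical scalar reference: parses n (at most 16) hex digits. */
static uint64_t parse_hex_scalar(const char *s, size_t n) {
    uint64_t value = 0;
    for (size_t i = 0; i < n; i++) {
        /* each digit contributes 4 bits, most significant digit first */
        value = (value << 4) | ascii_to_nibble((uint8_t)s[i]);
    }
    return value;
}
```

Such a reference is handy mainly for checking the vectorized variants against known inputs.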

In SSE the conditional correction can be expressed as a compare and a bitwise AND:

t1 = pcmpgt(ch, packed_byte('a' - 1))
t2 = t1 & packed_byte('a' - 10 - '0')
result = (ch - packed_byte('0')) - t2
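The pseudocode maps quite directly onto SSE2 intrinsics. The following is a minimal sketch, not the author's implementation; the function names are hypothetical and the input is assumed to be 16 valid lowercase hex digits.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Converts 16 ASCII hex digits ('0'..'9', 'a'..'f') into 16 nibbles (0..15). */
static __m128i ascii_to_nibbles_sse2(__m128i ch) {
    /* t1 = 0xff in every byte where ch >= 'a'; the compare is signed,
       which is fine because valid ASCII input is below 128 */
    const __m128i t1 = _mm_cmpgt_epi8(ch, _mm_set1_epi8('a' - 1));
    /* t2 = correction for letters, 0 for digits */
    const __m128i t2 = _mm_and_si128(t1, _mm_set1_epi8('a' - 10 - '0'));
    /* result = (ch - '0') - correction */
    return _mm_sub_epi8(_mm_sub_epi8(ch, _mm_set1_epi8('0')), t2);
}

/* Hypothetical convenience wrapper: reads 16 characters, writes 16 nibbles. */
static void hex16_to_nibbles(const char *s, uint8_t out[16]) {
    const __m128i ch = _mm_loadu_si128((const __m128i *)s);
    _mm_storeu_si128((__m128i *)out, ascii_to_nibbles_sse2(ch));
}
```

Unaligned loads and stores are used so that the sketch works on arbitrary buffers.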

Packing nibbles — reversed input

  1. input: 16 nibbles, one per byte (bytes are listed from the most significant to the least significant):

    t1 = packed_byte(| 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0a | 0b | 0c | 0d | 0e | 0f |)
    
  2. join nibbles:

    t2 = t1 | (t1 >> 4) // psrlw, the shift operates on 16-bit words
       = packed_byte(| 00 | 01 | 02 | 23 | 04 | 45 | 06 | 67 | 08 | 89 | 0a | ab | 0c | cd | 0e | ef |)
    
  3. mask out the higher byte of each word, as it contains garbage:

    t3 = t2 & packed_word(0x00ff)
       = packed_byte(| .. | 01 | .. | 23 | .. | 45 | .. | 67 | .. | 89 | .. | ab | .. | cd | .. | ef |)
    
  4. convert packed words to packed bytes; the lower 8 bytes of the SSE register hold the result word:

    t4 = packuswb(t3, whatever)
       = packed_qword(| 0123456789abcdef | ???????????????? |)
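The four steps for reversed input could be written with SSE2 intrinsics roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, the nibbles are assumed to be already reversed (element 0 holds the last digit of the string), and the machine is little-endian x86.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* nibbles[0] must hold the LAST digit, nibbles[15] the first (reversed input). */
static uint64_t pack_reversed_nibbles(const uint8_t nibbles[16]) {
    uint64_t result;

    /* step 1: 16 nibbles, one per byte */
    const __m128i t1 = _mm_loadu_si128((const __m128i *)nibbles);
    /* step 2: join nibbles; the low byte of each word gets both digits */
    const __m128i t2 = _mm_or_si128(t1, _mm_srli_epi16(t1, 4));
    /* step 3: clear the higher byte of each word */
    const __m128i t3 = _mm_and_si128(t2, _mm_set1_epi16(0x00ff));
    /* step 4: words -> bytes; the second argument does not matter here */
    const __m128i t4 = _mm_packus_epi16(t3, t3);

    /* the lower 8 bytes of the register are the result word */
    _mm_storel_epi64((__m128i *)&result, t4);
    return result;
}
```

For example, passing the nibbles f, e, d, ..., 1, 0 (the reversed string "0123456789abcdef") should yield 0x0123456789abcdef.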
    

Packing nibbles — reversing result

  1. input: 16 nibbles, one per byte (bytes are listed from the most significant to the least significant):

    t1 = packed_byte(| 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0a | 0b | 0c | 0d | 0e | 0f |)
    
  2. join nibbles:

    t2 = (t1 << 12) | t1 // psllw, the shift operates on 16-bit words
       = packed_byte(| 10 | 01 | 32 | 03 | 54 | 05 | 76 | 07 | 98 | 09 | ba | 0b | dc | 0d | fe | 0f |)
    
  3. move higher bytes to lower, discarding lower bytes:

    t3 = t2 >> 8 // psrlw
       = packed_byte(| .. | 10 | .. | 32 | .. | 54 | .. | 76 | .. | 98 | .. | ba | .. | dc | .. | fe |)
    
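  4. convert packed words to packed bytes; the lower 8 bytes of the SSE register hold the result word, with its bytes still in reversed order:

    t4 = packuswb(t3, whatever)
       = packed_qword(| 1032547698badcfe | ???????????????? |)

  5. reverse the bytes of the result word:

    t5 = bswap(t4)
       = fedcba9876543210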
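A possible SSE2 rendering of this variant is sketched below, with the final byte reversal done on the 64-bit result. The names are hypothetical, bswap64 is a portable stand-in for the bswap instruction, and a little-endian x86 machine is assumed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Portable stand-in for the bswap instruction: reverses the 8 bytes of x. */
static uint64_t bswap64(uint64_t x) {
    x = ((x & 0x00ff00ff00ff00ffULL) << 8)  | ((x >> 8)  & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* nibbles[0] holds the FIRST digit of the string (input is not reversed). */
static uint64_t pack_nibbles_bswap(const uint8_t nibbles[16]) {
    uint64_t packed;

    /* step 1: 16 nibbles, one per byte */
    const __m128i t1 = _mm_loadu_si128((const __m128i *)nibbles);
    /* step 2: join nibbles; the high byte of each word gets both digits */
    const __m128i t2 = _mm_or_si128(_mm_slli_epi16(t1, 12), t1);
    /* step 3: move the high byte down, discarding the low byte */
    const __m128i t3 = _mm_srli_epi16(t2, 8);
    /* step 4: words -> bytes; the second argument does not matter here */
    const __m128i t4 = _mm_packus_epi16(t3, t3);

    _mm_storel_epi64((__m128i *)&packed, t4);
    /* the first two digits sit in the lowest byte, so reverse the byte order */
    return bswap64(packed);
}
```

Compared with the reversed-input variant, this one needs no masking step, at the cost of one byte swap per 64-bit result.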
    

Converting 32-digit strings

The algorithm above converts 16 hex digits, but it is easily extended to 32 hex digits. Steps 1 to 3 have to be applied to both the lower and the higher half of the input; in step 4 the arguments of packuswb are the two outputs of step 3, say t3_lower and t3_higher:

t4 = packuswb(t3_lower, t3_higher)
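Putting the pieces together, a 32-digit conversion might look like the sketch below. It is an illustration only: the function names are hypothetical, the input is assumed to be 32 valid lowercase hex digits, and the variant without input reversal is used, so the 16 output bytes come out in string order (out[0] is built from the first two digits) and no bswap is applied.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* ASCII '0'..'9', 'a'..'f' -> nibbles 0..15 (no input validation). */
static __m128i to_nibbles(__m128i ch) {
    const __m128i t1 = _mm_cmpgt_epi8(ch, _mm_set1_epi8('a' - 1));
    const __m128i t2 = _mm_and_si128(t1, _mm_set1_epi8('a' - 10 - '0'));
    return _mm_sub_epi8(_mm_sub_epi8(ch, _mm_set1_epi8('0')), t2);
}

/* Steps 1 to 3: every 16-bit word ends up holding one byte made of two digits. */
static __m128i join_nibbles(__m128i t1) {
    const __m128i t2 = _mm_or_si128(_mm_slli_epi16(t1, 12), t1);
    return _mm_srli_epi16(t2, 8);
}

/* Converts 32 hex digits into 16 bytes, most significant byte first. */
static void parse_hex32(const char *s, uint8_t out[16]) {
    const __m128i lower  = _mm_loadu_si128((const __m128i *)s);
    const __m128i higher = _mm_loadu_si128((const __m128i *)(s + 16));

    const __m128i t3_lower  = join_nibbles(to_nibbles(lower));
    const __m128i t3_higher = join_nibbles(to_nibbles(higher));

    /* step 4: both halves are packed by a single packuswb */
    const __m128i t4 = _mm_packus_epi16(t3_lower, t3_higher);
    _mm_storeu_si128((__m128i *)out, t4);
}
```

Treating the output as a byte string rather than as two 64-bit integers is a design choice of this sketch; integer results would need the byte reversal discussed earlier.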

Sample code

A sample implementation is available on GitHub (file parse.sse2.c).