# Convert float to int without FPU/SSE

Author: Wojciech Muła 2013-12-27

This short article shows how a normalized floating point value can be safely converted to an integer value without the assistance of the FPU or SSE. Only basic bit and arithmetic operations are used. In the worst case the following operations are performed:

• 6 comparisons,
• 2 subtractions,
• 1 AND,
• 1 OR,
• 2 shifts (one by a variable amount, one by a constant amount).

The floating point value is calculated as $(-1)^{sign} \cdot (1 + fraction) \cdot 2^{exponent}$. The fraction part is in the range [0, 1). For 32-bit values the sign has 1 bit, the exponent has 8 bits, the fraction has 23 bits, and the bias is 127; the exponent field stores exponent + bias as an unsigned number.

The layout of the binary word:

```
+-+--------+-----------------------+
|S|exp+bias|        fraction       |
+-+--------+-----------------------+
31 30    23 22                     0
```
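As an illustration of this layout, the three fields can be pulled out of the raw bits with shifts and masks. This is a minimal sketch, assuming `float` is an IEEE 754 32-bit value; the names `float_fields`, `float_bits` and `split_float` are hypothetical and do not come from the sample program.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;        /* bit 31 */
    uint32_t biased_exp;  /* bits 30..23, holds exponent + bias */
    uint32_t fraction;    /* bits 22..0 */
} float_fields;

/* Copy the bit pattern of a float into an integer (assumes a 32-bit float). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Split the bit pattern into sign, exponent + bias, and fraction. */
float_fields split_float(float f)
{
    const uint32_t bits = float_bits(f);
    float_fields r;
    r.sign       = bits >> 31;
    r.biased_exp = (bits >> 23) & 0xffu;
    r.fraction   = bits & 0x007fffffu;
    return r;
}
```

For example, `split_float(1.0f)` yields sign 0, a stored exponent of 127 (that is, exponent 0 plus the bias) and a fraction of 0.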

Clear the sign and exponent + bias fields, then restore the implicit integer 1 at bit 23 (the 24th bit):

```
+-+--------+-----------------------+
|0|00000001|xxxxxxxxxxxxxxxxxxxxxxx|
+-+--------+-----------------------+
31 30    23 22                     0
```
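This step costs one AND and one OR. A minimal sketch, assuming the input is the raw bit pattern of a normalized 32-bit float; the function name `significand_word` is hypothetical.

```c
#include <stdint.h>

/* bits: raw 32-bit pattern of a normalized 32-bit float */
uint32_t significand_word(uint32_t bits)
{
    /* keep bits 22..0; this clears both the sign and the exponent + bias field */
    const uint32_t fraction = bits & 0x007fffffu;

    /* restore the implicit integer 1 at bit 23 */
    return fraction | 0x00800000u;
}
```

For the bit pattern of 1.0 (`0x3f800000`) the fraction is zero, so the word is exactly `0x00800000`.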

The value of such a 32-bit word, treated as an unsigned integer, is $(1 + fraction) \cdot 2^{23}$, so the magnitude of the floating point value equals this word multiplied by $2^{exponent - 23}$. To calculate the result, the word has to be shifted left or right depending on the value and sign of shift := exponent - 23; only a few cases have to be considered:

• If shift is negative, then the word must be shifted right. The number of significant bits is 24, so if shift ≤ −24 the result is always zero.
• If shift is positive, then the word must be shifted left. Since the destination is a 32-bit signed value, the maximum shift is 31 − 24 = 7 bits; when shift is greater than 7, overflow occurs.
• If −24 < shift < 7 then the number can be safely shifted. When shift = 7 the result has exactly 31 significant bits, so a range check is required: for positive numbers (sign = 0) the maximum value is $2^{31} - 1$ and for negative numbers (sign = 1) the maximum magnitude is $2^{31}$.
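The cases above can be put together into a single routine. The following is a sketch under stated assumptions, not the sample program: the name `float_bits_to_int32` and its signature are hypothetical, the input is the raw bit pattern of an IEEE 754 32-bit float, and the result is truncated toward zero. It makes no attempt to match the operation count given at the start of the article. It also departs from the case list in one place, as an assumption of the sketch: a shift of 8 is still passed on to the range check, so that $-2^{31}$, the only representable value at that shift, is not rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts the raw bits of a 32-bit float to int32_t,
   truncating toward zero. Returns false when the value does not fit
   (this also covers infinities and NaNs, whose exponent field is 255). */
bool float_bits_to_int32(uint32_t bits, int32_t *out)
{
    const bool     negative = (bits >> 31) != 0;
    const int32_t  exponent = (int32_t)((bits >> 23) & 0xffu) - 127;
    const int32_t  shift    = exponent - 23;
    const uint32_t word     = (bits & 0x007fffffu) | 0x00800000u;
    uint32_t magnitude;

    if (shift <= -24) {   /* all 24 significant bits are shifted out */
        *out = 0;
        return true;
    }
    if (shift > 8)        /* assumption: 8 rather than 7, see the note above */
        return false;

    /* both shift counts are in range here: 1..23 to the right, 0..8 to the left */
    magnitude = (shift < 0) ? (word >> -shift) : (word << shift);

    /* range check: at most 2^31 - 1 for positive, 2^31 for negative values */
    if (magnitude > (negative ? 0x80000000u : 0x7fffffffu))
        return false;

    /* negate in 64 bits so that a magnitude of 2^31 is handled without overflow */
    *out = negative ? (int32_t)-(int64_t)magnitude : (int32_t)magnitude;
    return true;
}
```

Zero and denormalized inputs have an exponent field of 0, which gives shift = −150, so they fall into the first case and produce 0.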

A sample program is available.