What Is Fixed-Point Math?

Fixed-point math represents fractional values using integers. A constant scaling factor is chosen and implicitly applied to every value: the scaling factor defines a step size, and the stored integer counts the number of steps, usually measured from zero. Fixed-point math is common in audio and video signal processing, where each sample is an integer, signed or unsigned, with a fixed number of bits. The scaling factor is often chosen so that the representable range is about -1.0 to 1.0 or 0 to 1.0 (the upper limit falls one step short of 1.0), but other scaling factors may be used depending on the application. The following table gives some examples:

Type	Scaling Factor	Range
8-bit unsigned	1/256	0.0 – 0.99609375
16-bit unsigned	1/256	0.0 – 255.99609375
16-bit unsigned	1/65536	0.0 – 0.9999847412109375
16-bit signed	1/32768	-1.0 – 0.999969482421875

In computer programming, powers of 2 are commonly used as scaling factors because they can be applied by bit shifting. Rescaling is needed after multiplying or dividing fixed-point numbers to restore the proper scaling factor, and with a power-of-2 scaling factor that rescaling is a single shift. The following equations illustrate multiplication and division using fixed-point math:

The actual value represented, $x_f$, is the stored integer $x_i$ multiplied by the scaling factor $S$.

$x_f = x_i \cdot S$

For multiplication, the product of the two integers carries a factor of $S^{2}$, so the correctly scaled integer result is obtained by multiplying the integer product by the scaling factor.

$a_f \cdot b_f = c_f = (a_i \cdot S) \cdot (b_i \cdot S)= (a_i \cdot b_i ) \cdot S^{2}$

$c_f = c_i \cdot S \to c_i = (a_i \cdot b_i) \cdot S$

For division, the scaling factors cancel in the integer quotient, so the correctly scaled integer result is obtained by dividing the integer quotient by the scaling factor.

$\frac{a_f}{b_f} = c_f =\frac{a_i \cdot S}{b_i \cdot S} = \frac{a_i}{b_i}$

$c_f = c_i \cdot S \to c_i = \frac{a_i}{b_i \cdot S}$

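In code, these operations require extra precision in the intermediate values: the product of two 16-bit values, or a 16-bit numerator shifted left by 15 bits, needs up to 32 bits. The following code example demonstrates fixed-point multiplication and division in C: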

fixed_point.c

#include <stdio.h>
#include <stdint.h>

/* Signed 16-bit integer, range -1.0 to just under 1.0 */
#define SCALE_SHIFT 15
#define SCALE_FACTOR (1.0 / (double)(1 << SCALE_SHIFT))

int main(void)
{
    int16_t a, b, c;
    int32_t temp; /* 32-bit intermediate for the product and shifted numerator */

    a = 1234; /* 0.03765869140625 */
    b = 8765; /* 0.267486572265625 */

    /* Multiply: the raw product is scaled by S^2, so shift right to rescale */
    temp = (int32_t)a * b;
    c = (int16_t)(temp >> SCALE_SHIFT);
    printf("%d * %d = %d\n", a, b, c);
    printf("%f * %f = %f\n", a * SCALE_FACTOR, b * SCALE_FACTOR,
           c * SCALE_FACTOR);

    /* Divide: scale the numerator up first so the quotient keeps the scaling */
    temp = ((int32_t)a << SCALE_SHIFT) / b;
    c = (int16_t)temp;
    printf("%d / %d = %d\n", a, b, c);
    printf("%f / %f = %f\n", a * SCALE_FACTOR, b * SCALE_FACTOR,
           c * SCALE_FACTOR);

    return 0;
}

Expected Output

1234 * 8765 = 330
0.037659 * 0.267487 = 0.010071
1234 / 8765 = 4613
0.037659 / 0.267487 = 0.140778

Complete Communications Engineering