Applying C - Fixed Point Arithmetic
Written by Harry Fairhead   
Monday, 11 May 2020

If you are working on an IoT project you cannot avoid calculations. Fixed point arithmetic is still occasionally useful, but it hardly figures in most accounts of how to do calculations. This extract is from my book, Applying C For The IoT With Linux.


Integer arithmetic is usually supposed to be good for some things but not for others – which need floating point arithmetic. In fact, for small machines, integer arithmetic might be all you need. In this chapter we look at how to implement fixed point arithmetic, which is just integer arithmetic with a scale factor.

Why Fixed Point?

If you are accustomed to using big computers and modern languages such as Java or Python, you will hardly give a second thought to multiplying or dividing two numbers. The most commonly used format for numeric values is floating point and, despite its problems, it is usually the best to use. The problems are to do with accuracy and how rounding errors accumulate. However, for small machines and low-level programming, floating point is a problem for another reason – it can be slow. What is more, many small machines, the Arduino Uno for example, don't have floating point hardware, and when this is the case you have to resort to a software emulation. The nature of this software emulation depends on the CPU in question, but there are a number of standard floating point libraries.

A modern CPU has a floating point unit (FPU), also referred to as a math coprocessor, that is almost as fast as its integer arithmetic unit. This wasn't always true, and it means that you can no longer assume that fixed point is faster than floating point. Before you make a commitment to using fixed point on a machine that has a modern FPU, you need to benchmark the standard operations of arithmetic. For example, on the Raspberry Pi 3 a floating point multiplication is only 30% slower than an integer multiplication. When working with a CPU that implements floating point in a similar time to integer operations, it is unlikely that fixed point will perform better, as it has additional steps.

Often, however, external devices deliver their data in a form that is closer to fixed point than anything else. In these cases it is often easier to work in fixed point rather than convert to floating point and perhaps back again.

What all this means is that for much arithmetic you don't need floating point, as fixed point will do the job very well. In most cases you don't need a fixed point library; if you do need one, the chances are it will be the machine architecture you are working with that pushes you to a specific choice. Otherwise use libfixmath, which is described later.

Fixed Point is Integer Arithmetic

Fixed point arithmetic is just scaled integer arithmetic. For example, suppose you need to work with values in the range zero to a hundred to two decimal places – a temperature say. You can easily represent such a value as an integer by multiplying by a hundred. That is, a temperature of 57.89C would be represented by:

57.89*100 = 5789

You can see that we are using four decimal digits with the lower two dedicated to representing the fractional part of the value.

Things are a little more complicated than this sounds because of the way arithmetic operations affect the scale factor and indeed the need to remember to include the scale factor at all. Essentially we have removed the need to worry about fractional parts of values by converting everything to an integer.

In the above example we have been using a decimal scale factor, but as we know that a shift is almost the same as multiplying or dividing by powers of 2 and is very fast, it is more efficient, if slightly more difficult conceptually, to use a power of two as a scale factor.

Each left shift multiplies by two and adds another bit to the representation of the fractional part of the value:

Fraction Bits    Scale Factor    Precision
     1                  2        0.5
     2                  4        0.25
     3                  8        0.125
     4                 16        0.0625
     5                 32        0.03125
     6                 64        0.015625
     7                128        0.0078125
     8                256        0.00390625
     9                512        0.001953125
    10               1024        0.0009765625

The usual rule of thumb is that for one decimal place you need at least four fraction bits and for two decimal places at least seven, preferably eight.

Of course, if you are using n bits for the fractional part of the value you only have 32 – n bits, assuming a 32-bit int, to represent the integer part. The table below shows the upper limit of the values that can be represented in a 32-bit int for each number of bits used to represent the fractional part.

Fraction Bits     Unsigned        Signed
     1          2147483648     1073741824
     2          1073741824      536870912
     3           536870912      268435456
     4           268435456      134217728
     5           134217728       67108864
     6            67108864       33554432
     7            33554432       16777216
     8            16777216        8388608
     9             8388608        4194304
    10             4194304        2097152

You can see that with 32-bit integers there is a good range of values at reasonable accuracy.

If you select n bits for the fractional part then you can think of the binary point as being at position n. There is no standard notation for fixed point format, but Qm.n, which means m bits for the non-fractional part and n used for the fractional part, is common. For example Q16.16 would be a 32-bit int with 16 fractional bits.

When you enter data you need to scale it so that the binary point is at position n. As you do calculations it is possible that the position of the binary point will change. To avoid this you have to rescale to keep the binary point fixed, hence fixed point arithmetic.

Notice that you could even position the binary point to the right of the value, giving a scaling that allows very large numbers to be represented. For example, if the binary point is at bit -1 then the scale factor is ½ and a stored value of x actually represents 2x. This is the binary analog of our usual exponent notation, i.e. the stored value x represents x*2^-s, where s is the position of the binary point, which can be negative as well as positive.



Last Updated ( Saturday, 16 May 2020 )