|Fundamental C - Arithmetic and Representation|
|Written by Harry Fairhead|
|Monday, 27 May 2019|
Page 2 of 2
Unsigned Integer Arithmetic
So far we have only considered representing positive integers, i.e. unsigned types. All unsigned integer types are a fixed number of bits in size and this raises the question of what happens when you add one to the largest value that can be represented. For example, an unsigned char with eight bits can represent 0 to 255 or 0x00 to 0xFF. What happens if you try to work out 255+1?
If you know the standard answer, you may never have considered the alternatives. For example, you could define 255+1 to be 255 - that is, once you reach the maximum value you don’t go any higher. This is called saturation or clamped arithmetic and while most general purpose processors don’t use it, some special devices such as DSPs (Digital Signal Processors) do.
What most general processors actually do is to roll over. That is, when you add 1 to 255 the answer is 256 which in binary is:
1 0000 0000
i.e. a 9-bit number. As this is being stored in an 8-bit variable then the natural thing to do is just ignore the ninth bit and let the answer be:
That is, 255+1=0. This is rollover in the sense that a mechanical counter will roll over to the start when it goes beyond its maximum. It is as natural for a digital counter as for a mechanical one and it also turns out to be very useful, as we shall see.
Adding one to the largest representable value is often called overflow because the most significant bit of the result is lost. The term overflow suggests that it is an error but the C99 standard states that, for unsigned arithmetic, rollover is correct and not an error or undefined behavior.
Notice that as things stand we only have positive integers. Clearly this isn’t going to be enough, we need to represent negative values as well. In non-computer arithmetic we simply write a minus sign in front of a value to indicate that it is negative. We can do the same with binary by treating the most significant bit as a “sign bit”, i.e. 0 for a positive number and 1 for a negative number. This is simple to think up, but difficult to make work. Now when you are adding two numbers together, you have to look at the sign bits and work out if you should add or subtract. Even so, sign magnitude as it is called, is still in use. For example, 0011 is 3 in 4-bit sign magnitude and 1011 is -3.
There is a much better way of representing negative values which makes use of the standard integer arithmetic hardware found in nearly all processors. Consider for a moment the problem of finding what -x is. It is a quantity that when added to x gives you zero, i.e. x + -x = 0.
If you think back to unsigned arithmetic, you know that because of rollover you can get zero when you add a positive value to another positive value. For example, for an 8-bit char, we have 255+1=0. There is a sense in which 255 plays the role of -1. Similarly, in 254+2=0, 254 plays the role of -2 and so on. In general (256-x) plays the role of -x because (256-x)+x=256, which is zero in eight bits, i.e. 0x0000.
This representation of negative values is called two's-complement and if the number of bits in use is n then -x is given by 2n-x. You can also see that this divides the total unsigned range into two ranges that represent the positive and the negative numbers. You can easily see this by starting to count at 0 and listing the corresponding two's-complement representation of the negative. For example, in eight bits we have:
+ 0 1 2 3 … 126 127 - 0 255 254 253 … 130 129 128
When you add the number in the first row to the number below it in the second row then the answer, allowing for rollover, is zero. Notice that when we reach 128 this behaves like -128 but there is no number that behaves like +128. That is, the ranges are 0 to 127 and 0 to -128.
In general, in two's-complement an n-bit value can represent -2n-1 to 2n-1-1.
Two’s-complement allows you to regard the most significant bit as a sign bit, even though we didn’t set out to design things this way. However, you can no longer treat the other bits as the magnitude of the number.
When you do two's-complement arithmetic the same hardware is used as for unsigned ints. The only difference is the interpretation of what the bits mean. For example, in 8-bit char:
0xFD + 0x02 = 0xFF
and as unsigned this is:
253 + 2 = 255
but viewed as two's-complement it is:
-3 + 2 = -1
It is how you interpret the bits that matters, not what you do with them.
As well as two's-complement there is also one’s-complement, which isn’t used as much because it makes implementing arithmetic slightly harder. It is similar to two's-complement, but instead of subtracting from 2n you use 2n-1. One’s-complement almost works with standard arithmetic but you have to add an “end around carry” rule. That is, any final carry from the most significant bit has to be added to the least significant bit. For this reason two's-complement is more or less the standard choice as it needs no additional rules.
The importance of one’s-complement is that it makes it easy to get a two's-complement value. The one’s-complement of any given value is simply the bitwise Not ~ of the binary representation. That is, the one’s-complement of x is ~x, see Chapter 12 for bitwise operations. As the one’s-complement is one less than the two's-complement, you can compute the two's-complement by adding one, i.e. -x =~x+1.
Final version in book
The problems of overflow are discussed in much more detail in Applying C For The IoT With Linux.
Representations and printf
Final version in book
Fundamental C: Getting Closer To The Machine
Now available as a paperback and ebook from Amazon.
Also see the companion volume: Applying C
or email your comment to: email@example.com
|Last Updated ( Monday, 27 May 2019 )|