|Fundamental C - Variables|
|Written by Harry Fairhead|
|Monday, 08 April 2019|
The Numeric Data Types
C only has two fundamental data types – integer and floating point. Everything else is constructed using these two. Integer types, as the name suggests, store integer values, i.e. no decimal fractions are stored.
C has a range of integer data types, but how they are implemented depends, just like the basic int, on the architecture of the machine. However, the most commonly encountered sizes are:
The long long type was defined in C99.
The char type is often assumed to be exactly 8 bits, but this is just its most common implementation. In C a char is by definition one byte, but a byte is only guaranteed to have at least 8 bits. The reason it is called char is that it has to be capable of storing the basic character set codes of the machine. Notice that a plain char can be signed or unsigned; again this depends on the machine. If you want to be certain that char is signed you can put the qualifier signed in front of it to give signed char. If you want the values stored in a variable to be unsigned, i.e. just non-negative integers, you can put the qualifier unsigned in front of the type. For example, an unsigned char that is 8 bits wide can store values in the range 0 to 255.
There is one final complication. You can put int after any of the length qualifiers. So a short can be declared as short int, a long as long int and a long long as long long int. This is mostly a matter of preference. Many C programmers prefer the shortest form of a declaration so instead of long long int they would use long long.
Notice that while most implementations of C have these data types, it might not be wise to make use of them if they are not efficiently supported by the hardware. For example, the ARM11 architecture is 32-bit, so a short held in a register occupies as much space as an int, and arithmetic on it can take longer because the result has to be truncated or sign-extended back to 16 bits. It is not necessarily true that smaller is better.
We work with variables all the time and we think we know what a variable is, but how languages implement variables is one of the big differences between them. C takes the most basic approach possible, one that, as you might expect, is close to the way the machine does things.
When you declare a variable the right amount of memory is allocated for it somewhere – exactly where is discussed in Chapter 10. That area of memory has an address and its address is used in all the assembly language instructions that the compiler generates that need to work with that variable. Consider the simple program:
int i;
i=0;
When the compiler processes this, it generates:
! i=0;
main+11: movl $0x0,0x407020
The movl (move long) instruction stores the value $0x0, i.e. 0, in address 0x407020. You can see that the variable is reduced to the address and there is no sign of anything called i in the assembler that is generated. The declaration of i did not generate any assembler. It simply told the compiler that it should allocate some memory and use its address anywhere that i is used in the program.
What happens is that when the compiler encounters a variable declaration it stores the variable's name in an internal table – the symbol table. It then allocates some memory for the variable and stores the address in the symbol table along with the variable name. From this point on, if you use the variable name, the symbol table is used to look the name up and retrieve the address, which is then used in the assembler being generated. When the compiler has finished, the symbol table is deleted – it is not part of your program and hence the names of the variables are not part of your program. The exception is when you make use of a debugger. In this case the compiler passes the symbol table to the debugger, which uses it to convert the addresses used in your program back to the names of the variables. In this way you can carry on in the belief that your program uses named variables.
Although we haven’t met the idea of pointers as yet, it is worth explaining now that this is an example of a constant pointer. You can think of the address as a pointer to an area of storage and in this sense a variable is a constant pointer, i.e. a pointer that never changes. The value of the constant pointer is used within the assembly language program wherever it is needed.
Exact Size Variables
Most of the time you can work with C's strange approach to variable types because you are targeting a particular machine. Sometimes, however, you need a variable with a definite number of bits, whatever the machine.
The solution to the problem of how to work with variables that have a definite number of bits is to use the stdint.h header file. This is a library that was introduced in C99 to provide a set of types that are fixed in size, irrespective of the machine in use. Of course, the implementation might not be the most efficient possible on the machine.
The library introduces new types of the form:
intN_t
uintN_t
for signed and unsigned integers with N bits. The values of N you can normally expect are 8, 16, 32 and 64; strictly, the standard requires these types only on machines that have integers of exactly those widths. The signed types are implemented as two's complement. Note that the implementations have to be exact and no padding bits are allowed.
To use the library you have to add:
#include <stdint.h>
to the other automatically generated includes in your program. So, for example, a variable declared as int8_t is guaranteed to be an 8-bit variable and one declared as uint16_t is guaranteed to be a two-byte unsigned int.
This all works well, but if the machine doesn't support int16_t as a native two-byte int the results could be very slow. There is an alternative which lets you specify either the minimum width that can be used:
int_leastN_t
uint_leastN_t
or the fastest minimum width:
int_fastN_t
uint_fastN_t
So, for example, a variable declared as int_least8_t will be as small as possible, but still greater than or equal to 8 bits. The fast version gives you a variable that is at least the requested width, with the extra condition that it is fast. So, for example, a variable declared as int_fast8_t will be at least 8 bits, but it might be larger if it is faster to use a bigger variable type. Exactly how these are implemented is left up to the compiler writer.
|Last Updated ( Monday, 29 April 2019 )|