Where you store data is as important to the computer as the data itself, yet the importance of the address is often overlooked. In this introduction to the low-level mechanisms of addressing in assembler, it is surprising how easy it is to recognize familiar high-level abstractions.
We know where you live
The two fundamentals of the computer’s universe are data and address.
The data is what you store and the address is where you store it.
Most of the time we concentrate on the data, what is stored and how it is stored, and the address is ignored as a simple number that retrieves the data.
In the real world of practical computers, however, the situation is very different. The data is almost boring, but how you find it requires a complex and exciting range of techniques. It's all about knowing where the data lives; once you have found it, what you do with it is generally fairly simple.
So let’s take a closer look at the way addresses are generated and used.
The whole principle of computer operation is that an address automatically selects a memory location, and data is either stored there or retrieved, equally automatically. In another article we look at the way that the processor, or CPU, interacts with memory to run a program. The most important point, however, is that an address is not a passive label - it actively selects the data. That is, there is a mechanism in place where you present the address and the data is more-or-less instantly retrieved without you having to do anything extra.
Inside the CPU are special areas of storage called “registers”.
Exactly what you call these registers varies from machine to machine, but most machines have something that corresponds to a program counter, or PC register, and some general purpose registers.
The program counter (PC) is a register that holds the address of the next instruction to be obeyed, i.e. to be fetched from memory, decoded and executed.
Other registers are used to store and operate on the data – the A register or Accumulator, for example.
Notice that the A register can be thought of as a “data register” and the PC register can be thought of as an “address” register.
In hardware terms these differences amount to which bus – address or data – the register is connected to. In more advanced machine designs the distinction between address and data registers becomes blurred, and a register may operate as a data register one moment and as an address register the next.
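The roles of the two kinds of register can be sketched with a toy machine. This is only an illustration: the opcodes LOAD_A, ADD_A and HALT, and the memory layout, are invented here and do not correspond to any real processor. The PC is only ever used as an address, while the A register only ever holds data.

```python
# Toy machine: the opcodes below are hypothetical, chosen for illustration.
HALT, LOAD_A, ADD_A = 0, 1, 2

def run(memory):
    """Run the program stored in memory and return the final A register."""
    pc = 0  # program counter: address of the next instruction
    a = 0   # accumulator: holds the data being worked on
    while True:
        opcode = memory[pc]               # fetch: the PC selects the instruction
        if opcode == HALT:
            return a
        operand_address = memory[pc + 1]  # the address stored in the instruction
        pc += 2                           # PC now points at the next instruction
        if opcode == LOAD_A:
            a = memory[operand_address]   # the address selects the data
        elif opcode == ADD_A:
            a += memory[operand_address]
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc - 2}")

# The program occupies addresses 0 to 4; its data lives at addresses 5 and 6.
program = [LOAD_A, 5, ADD_A, 6, HALT, 40, 2]
result = run(program)
print(result)  # prints 42
```

Notice that the program and its data share one memory; only the way a number is used, as an address or as a value, tells them apart.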
Inside the Pentium
To make this discussion a little more realistic, let’s take the case of a real processor, the Intel Pentium. What we say applies to the whole of the x86 family because the basics haven’t changed much since the family was first introduced.
The Pentium has a 32-bit register called EAX which is essentially the A register, or Accumulator. This started life as an 8-bit A register, grew to a 16-bit AX register and then doubled in size to become the EAX 32-bit register. However, in addition to the EAX register the Pentium has more than seven others to keep us occupied – what do they all do?
What’s inside the Pentium
In the Pentium’s assembly language a simple instruction to load the EAX register is written MOV EAX,[address], where address is the 32-bit address that EAX is to be loaded from.
Notice that we are using a convenient mnemonic to write down the machine code. This mnemonic is called “assembler” and it is converted to machine code, i.e. raw binary instruction codes, before it is run by the processor.
That is, the MOV EAX is replaced by the binary op code that the processor interprets as “move into the EAX register the contents of the memory location whose 32-bit address follows”. This conversion is just a matter of looking such things up in a table of instruction codes, but most programmers prefer to use a program, called an assembler, to do the job.
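The table lookup can be sketched in a few lines. This is a minimal illustration, not a real assembler: the table has a single entry and the function name assemble_load_eax is invented here. It assumes 32-bit mode, where the op code for loading EAX from an absolute 32-bit address is the byte A1, followed by the address with its low byte first.

```python
import struct

# One-entry table of instruction codes, keyed by mnemonic (illustrative only).
# 0xA1 is the x86 op code for loading EAX from a 32-bit absolute address.
OPCODE_TABLE = {"MOV EAX,[address]": 0xA1}

def assemble_load_eax(address):
    """Return the machine code bytes for a load of EAX from address."""
    opcode = OPCODE_TABLE["MOV EAX,[address]"]
    # x86 is little-endian, so the low byte of the address is stored first.
    return bytes([opcode]) + struct.pack("<I", address)

code = assemble_load_eax(0x00401000)
print(code.hex())  # prints a100104000
```

The five bytes are the whole instruction: one byte of op code and four bytes of address.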
Now consider the sort of thing that you generally want to do in a program.
Suppose you only want to load the EAX register with the value 01. You could store the value 01 in a suitable memory location and then load it into the EAX register, but this is a roundabout way to do things.
What about making life easier and introducing a new instruction that loads the EAX register with the 32-bit value that is included in the instruction?
Think about this for a moment. If you want to load a register with a number then the dumb way of doing it is to use an address to locate the value in memory. The address is stored as part of the instruction and makes it necessary to use the address to retrieve the value from memory. Wouldn't it be simpler to just put the value that you want to load into the instruction in place of the address?
This isn’t difficult to achieve and in a way it is just a shortened version of the “store in a memory location and then load” method. In this case the memory location is just the one that the instruction was stored in.
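The difference between the two ways of loading a value can be sketched with a toy machine. The opcodes LOAD_DIRECT and LOAD_IMMEDIATE are hypothetical and are not real x86 encodings; the point is only where the value 01 lives in each case.

```python
# Hypothetical opcodes for a toy machine, not real x86 encodings.
LOAD_DIRECT, LOAD_IMMEDIATE = 1, 2

def load_a(memory, pc):
    """Execute the load instruction at address pc and return the new A value."""
    opcode = memory[pc]
    operand = memory[pc + 1]    # fetched as part of the instruction
    if opcode == LOAD_DIRECT:
        return memory[operand]  # operand is an address: one more memory read
    if opcode == LOAD_IMMEDIATE:
        return operand          # operand is the value itself
    raise ValueError(f"unknown opcode {opcode}")

# Direct: the instruction is at addresses 0 and 1, the value 01 at address 2.
direct_program = [LOAD_DIRECT, 2, 1]
# Immediate: the value 01 sits inside the instruction, at address 1.
immediate_program = [LOAD_IMMEDIATE, 1]

direct_value = load_a(direct_program, 0)
immediate_value = load_a(immediate_program, 0)
print(direct_value, immediate_value)  # prints 1 1
```

Both loads leave the same value in the A register, but the second needs no separate memory location and no extra memory read.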
This sort of “wouldn’t it be easier if” approach is how processors slowly but surely become more complicated and more sophisticated. Although everything could be done with just data stored in memory and instructions that contain the address of that data, it turns out to be much more powerful to invent different ways, called addressing modes, of specifying where the data is.
One by one we tend to add “addressing modes” to make the programmer’s life easier.
Time to take a look at some of the standard addressing modes.