Assemblers and Assembly Language

Written by Harry Fairhead

Friday, 03 May 2019

The sort of instructions that most computers recognize are too simple for humans to be bothered with - and so we invented assembly language. Find out how it works and how it started the whole movement to abstract away from the computer's hardware.

We have already tackled what a computer program is, but we completely ignored the question of how a computer program is created in the first place. The natural language of most computers is binary because this is what their hardware is based on.

When you write an instruction in binary, it is more than just an abstract idea that something should be done, which the machine somehow mysteriously "understands" and carries out. The individual bits in the instruction control what the hardware does. One bit might select a particular register, another might set the arithmetic unit to add and yet another might clear a status bit. The binary code for an instruction is the set of "levers" that makes the machine do what you want it to.
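To make the "levers" idea concrete, here is a minimal sketch in Python that pulls an instruction byte apart into the control signals it carries. The 8-bit layout is invented purely for illustration and does not correspond to any real processor; the field names and the decode_toy function are likewise assumptions.

```python
# Hypothetical 8-bit instruction layout (not a real machine):
#   bits 7-6: ALU operation
#   bits 5-4: register select
#   bit 3:    clear the status flag
#   bits 2-0: unused in this sketch
ALU_OPS = ["pass", "add", "subtract", "and"]
REGISTERS = ["R0", "R1", "R2", "R3"]

def decode_toy(instruction: int) -> dict:
    """Split an instruction byte into the control signals it encodes."""
    return {
        "alu": ALU_OPS[(instruction >> 6) & 0b11],
        "register": REGISTERS[(instruction >> 4) & 0b11],
        "clear_status": bool((instruction >> 3) & 1),
    }

print(decode_toy(0b01101000))
print(decode_toy(0b01111000))  # same byte with bit 4 flipped
```

The two bytes differ in a single bit, yet the second one selects a different register, which is why one wrong bit is enough to turn an instruction into a different one.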

You can program a computer in binary if you really want to. However, programming in binary isn’t fun and it isn’t very productive. You have to remember what each bit does and how to put the bits together to make the instructions you want. A single bit in error and the instruction means something else. Binary code is also difficult for the average human to read. Even so, in the early days this is how computers were programmed, often using the banks of switches and lamps on the front panels.

A better way of expressing things was quickly invented.

Mnemonic Codes

The first step away from pure binary or machine code was to make use of symbols for the instructions.

For example, in x86 machine code the instruction to store an eight-bit value, xxxxxxxx, where each x is either a 0 or a 1, in register BL is:

10110011xxxxxxxx

If the data is 01010101, the complete instruction would be:

1011001101010101

This is slightly easier to read if it is written in hexadecimal as B355, which loads 55 hex into the BL register. (Notice that the B at the start of the instruction is just a coincidence and has nothing to do with the name of the BL register.)
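The binary-to-hex step is mechanical: each group of four bits becomes one hex digit. A minimal sketch in Python, where bits_to_hex is a name made up for this illustration:

```python
def bits_to_hex(bits: str) -> str:
    """Convert a string of 0s and 1s to upper-case hex, four bits per digit."""
    if len(bits) % 4 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("expected a multiple of four binary digits")
    return "".join(
        format(int(bits[i:i + 4], 2), "X") for i in range(0, len(bits), 4)
    )

print(bits_to_hex("01010101"))  # the data byte from the example: 55
```

The same function works on a complete 16-bit instruction, since it simply walks along the string four bits at a time.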

Hex may be better than binary but it still isn’t memorable for most normal humans. The solution is to use easy-to-remember symbols or mnemonics.

For example:

MOV BL, 055H

is naturally read as “move 55 hex into register BL”. This representation also hides the fact that all of the different “move” instructions have different binary machine codes. For example, to move a 16-bit value into the BX register you would use:

10111011

followed by the 16 bits you want to load, but as a mnemonic it would be:

MOV BX,01234H

which looks a lot more like the first move instruction.

In other words, the use of mnemonics not only makes things easier to read, it unifies the structure of the machine code by grouping all of the different move instructions as MOV followed by what should be moved from source to destination, i.e.

MOV destination,source

Notice that this is more sophisticated than you might think: what the programmer regards as a single instruction corresponds to a large set of machine code instructions, and which one is used depends on what is being moved and where it is going.
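As a rough illustration of one mnemonic fanning out into several machine codes, the following Python sketch encodes only the "load an immediate value into a register" form of MOV. It relies on the x86 convention that these opcodes have the bit pattern 1011 w rrr, and it assumes 16-bit code with no prefixes. The function name encode_mov_imm is made up for this example, and every other form of MOV (register to register, memory operands and so on) is deliberately left out.

```python
# Register numbers as x86 encodes them in the low three opcode bits.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def encode_mov_imm(register: str, value: int) -> bytes:
    """Encode MOV register, immediate for 8- and 16-bit registers.

    The opcode byte has the form 1011 w rrr: w is 0 for an 8-bit
    register and 1 for a 16-bit one, and rrr is the register number.
    """
    register = register.upper()
    if register in REG8:
        opcode = 0b10110000 | REG8.index(register)
        return bytes([opcode]) + value.to_bytes(1, "little")
    if register in REG16:
        opcode = 0b10111000 | REG16.index(register)
        # 16-bit immediates are stored low byte first.
        return bytes([opcode]) + value.to_bytes(2, "little")
    raise ValueError(f"unsupported register: {register}")

for reg, value in [("AL", 0x55), ("BL", 0x55), ("BX", 0x1234)]:
    code = encode_mov_imm(reg, value)
    print(f"MOV {reg},0{value:X}H ->", " ".join(f"{b:02X}" for b in code))
```

The programmer writes MOV each time, while the sketch picks a different opcode byte for every register, which is exactly the bookkeeping the mnemonic hides.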

Let's Build An Assembler

At first these mnemonics were just that – aids to memory.

The programmer wrote out the program using the mnemonics and then translated the program by hand, using a list of codes, into machine code.

But then some smart programmer had an idea that saved a great deal of work. After all, programmers are basically lazy; it’s the main reason for being a programmer!

Someone thought, “What if I write a program to convert the mnemonics into machine code?”

Obvious really.

All that is needed is a big table of mnemonic to machine code conversions. The result of this idea was the first “assembler” and the first “assembly” language.

The terminology wasn’t quite stable in the early days and you will find that some earlier assembly languages were called “autocode” and many other things. It doesn’t matter what you call it, the assembler idea is just the combination of using mnemonics to represent machine codes and using a program to translate them to machine code before running the program.
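The table-lookup idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: it handles a handful of one-byte x86 instructions that take no operands, the assemble function and its comment syntax (a semicolon starts a comment) are choices made for this example, and a real assembler also deals with operands, labels and addresses.

```python
# One-byte x86 instructions: the "big table" in miniature.
OPCODES = {
    "NOP": 0x90,  # do nothing
    "HLT": 0xF4,  # halt the processor
    "RET": 0xC3,  # return from a near call
    "CLC": 0xF8,  # clear the carry flag
    "STC": 0xF9,  # set the carry flag
}

def assemble(source: str) -> bytes:
    """Translate one mnemonic per line into machine code bytes."""
    program = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.split(";", 1)[0].strip().upper()  # drop comments
        if not text:
            continue  # skip blank lines
        if text not in OPCODES:
            raise ValueError(f"line {line_number}: unknown mnemonic {text!r}")
        program.append(OPCODES[text])
    return bytes(program)

listing = """
    clc        ; clear carry
    nop
    ret
"""
print(assemble(listing).hex())  # f890c3
```

Everything the hand translator used to do with a printed list of codes is now a dictionary lookup, and a mistyped mnemonic is reported with its line number instead of silently becoming the wrong bits.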

This is the start of a development process that goes from this basic use of mnemonics all the way to the sophisticated programming languages we use today.

Last Updated ( Saturday, 04 May 2019 )