Assemblers and Assembly Language
Written by Harry Fairhead   
Friday, 03 May 2019
Article Index
Assemblers and Assembly Language
The Abstraction Explosion
Just Enough Abstraction

The sort of instructions that most computers recognize are too simple for humans to be bothered with - and so we invented assembly language. Find out how it works and how it started the whole movement to abstract away from the computer's hardware.

What Programmers Know



  1. The Computer - What's The Big Idea?*
  2. The Memory Principle - Computer Memory and Pigeonholes*
  3. Principles of Execution - The CPU
  4. The Essence Of Programming
  5. Variables - Scope, Lifetime And More*
  6. Binary Arithmetic
  7. Hexadecimal*
  8. Binary - Negative Numbers*
  9. Floating Point Numbers*
  10. Inside the Computer - Addressing
  11. The Mod Function
  12. Recursion
  13. The Lost Art Of The Storage Mapping Function *
  14. Hashing - The Greatest Idea In Programming
  15. Advanced Hashing
  16. XOR - The Magic Swap*
  17. Programmer's Introduction to XML
  18. From Data To Objects*
  19. What Exactly Is A First Class Function - And Why You Should Care*
  20. Stacks And Trees*
  21. The LIFO Stack - A Gentle Guide*
  22. Data Structures - Trees
  23. Inside Random Numbers
  24. The Monte Carlo Method
  25. Cache Memory And The Caching Principle
  26. Data Compression The Dictionary Way
  27. Dates Are Difficult*
  28. Sequential Storage*
  29. Magic of Merging*
  30. Power of Operators
  31. The Heart Of A Compiler*
  32. The Fundamentals of Pointers
  33. Functional And Dysfunctional Programming*

* Recently revised

We have already tackled what a computer program is but what we completely ignored is the question of how a computer program is created in the first place. The natural language of most computers is binary because this is what their hardware is based on.

When you write an instruction in binary it is more than just an abstract idea that something should be done and some how the machine mysteriously "understands" and does as it is told. The individual bits in the instruction control what the hardware does. One bit might select a particular register, another set the arithmetic unit to add and yet another might clear a status bit. The binary code for an instruction is the set of "levers" that makes the machine do what you want it to.

You can program a computer in binary if you really want to. However programming in binary isn’t fun and it isn’t very productive.You have to remember what each bit does and how to put them together to make the instructions you want. One single bit in error and the instruction means something else. Reading binary code is also a difficult thing for an average human. Even so in the early days this is how computers were programmed - often using the banks of switches and lamps on the front panels.

A better way of expressing things was quickly invented.

Mnemonic Codes

The first step away from pure binary or machine code was to make use of symbols for the instructions.

For example, in x86 machine code the instruction to store an eight-bit value, xxxxxxxx, where x is either a 0 or 1, in register BL is:


For example, if the data is 01010101 the complete command would be:


This is slightly easier to read if it is written in hexadecimal as B055 which loads 55 hex into the BL register. (Notice that the B at the start of the instruction is just a coincidence and it has nothing to do with the name of the BL register.)

Hex may be better than binary but it still isn’t memorable for most normal humans. The solution is to use easy-to-remember symbols or mnemonics.

For example:

MOV BL, 055H

is naturally read as “move 55 hex into register BL”. This representation also hides the fact that all of the different “move” instructions have different binary machine codes. For example, to move the 16-bit value into the BX register you would use:


followed by the 16 bits you wanted to load but as a mnemonic it would be:

 MOV BX,01234H

which looks  lot more like the first move instruction.

In other words, the use of mnemonics not only makes things easier to read, it unifies the structure of the machine code by grouping all of the different move instructions as MOV followed by what should be moved from source to destination, i.e.

MOV destination,source

Notice that this is more sophisticated than you might think in that the instruction that the programmer thinks of as one single instruction corresponds to a very large set of machine code instructions that depend on what is being moved and to where.


Let's Build An Assembler

At first these mnemonics were just that – aids to memory.

The programmer wrote out the program using the mnemonics and then translated the program by hand, using a list of codes, into machine code.

But then some smart programmer had an idea that saved a great deal of work. After all programmers are all basically lazy, it’s the main reason for being a programmer!

Someone thought,

“what if I write a program to convert the mnemonics into machine code”.

Obvious really.

All that is needed is a big table of mnemonic to machine code conversions. The result of this idea was the first “assembler” and the first “assembly” language.

The terminology wasn’t quite stable in the early days and you will find that some earlier assembly languages were called “autocode” and many other things.   It doesn’t matter what you call it, the assembler idea is just the combination of using mnemonics to represent machine codes and using a program to translate them to machine code before running the program.

This is the start of a development process that goes from this basic use of mnemonics to all the way to the sophisticated programming languages we use today.





Last Updated ( Saturday, 04 May 2019 )