Getting Started With .NET IL
Written by Harry Fairhead
Tuesday, 18 February 2014
Do you need to understand IL to write good .NET code? Possibly not, but it makes no sense not to understand IL when it's so easy. Get started with Intermediate Language - now.
The assembly language of .NET
If you already program in almost any .NET language you will know that it isn’t compiled to machine code but to Microsoft Intermediate Language (MSIL), or IL, which is also known as Common Intermediate Language or CIL.
In this sense IL is the assembly language of .NET and like all assembly languages a knowledge of it helps you understand how things work and how to make them work better.
However, this assembly language isn’t quite what you might expect. If you already know a machine assembly language like x86, PowerPC or PIC, then you will be prepared for some of the low-level ideas in IL, but you might well be shocked to discover how “high” this intermediate language is.
Indeed there is the argument that it’s much easier to understand if you already program in, say, C#.
The key features of IL are that it is stack-oriented, object-oriented, strongly typed and closely tied to the Framework.
Let’s take a look at each of these aspects in turn.
Hello World in IL
You would doubtless be disappointed without a “Hello World” example so let’s begin with the very simplest IL program that does something – i.e. displays Hello World in a console.
To do this you first need to have a copy of the IL assembler ILasm.exe. This is included with the .NET Framework and is available for 32-bit and 64-bit machines.
It is also installed along with Visual Studio and the Windows SDK.
You can also use the ILasm.exe included with the Mono .NET download for a range of platforms including Windows, Mac OSX, Linux etc.
Notice that you don’t need Visual Studio or even any of the “Express” development environments installed. Surprisingly, Visual Studio doesn’t actually support IL development.
As long as you have the full .NET Framework installed you will find ILasm in the Framework’s installation directory, in a subfolder named after the version number x.y.z of the Framework.
You need to set up a command window with its path set to include the assembler’s location and with the current directory set to the one that contains the files you want to assemble.
In case you have forgotten DOS this is achieved by starting a command prompt and entering:
CD C:\folder that you are working in
For .NET 4.0 (and 4.5, which upgrades the 4.0 installation) the commands would be:
You can use any text editor that can produce plain ASCII files to create .IL source files - I used Notepad.
The simplest IL program is very simple indeed.
Enter the following lines and save the result as Hello.IL. (If you are using Notepad, remember to surround the file name in double quotes when you save, i.e. “Hello.IL”, otherwise you end up with a file called Hello.IL.TXT.)
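A minimal version of such a program, sketched here to match the walkthrough that follows, might look like this. The assembly name Hello and the class name Hello.Program are illustrative choices rather than anything the assembler requires:

```il
// Hello.IL - a minimal console program (illustrative sketch)

// Reference the Framework library that contains System.Console
.assembly extern mscorlib {}

// Give the program an assembly name and a module name
.assembly Hello {}
.module Hello.exe

// Declare a class that inherits from System.Object
.class Hello.Program extends [mscorlib]System.Object
{
  // A static, CIL managed method
  .method static void Main(string[] args) cil managed
  {
    .entrypoint                 // execution starts here
    ldstr "Hello World"         // push the string onto the stack
    call void [mscorlib]System.Console::WriteLine(string)
    ret                         // return from Main
  }
}
```

The comments are only there for orientation and can be left out when you type the program in.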
To assemble this to an .EXE, run ILasm with the name of the source file as its argument, i.e. ilasm Hello.il.
This should produce a set of messages that look something like the screen dump below:
Figure: A successful assembly
As long as everything has worked you should see a file called Hello.exe in the same directory as Hello.il. If you run this at the command prompt it prints the message as promised.
What is interesting about this simple example is that it illustrates all of the major characteristics of the assembler.
Assembler directives begin with a dot, and the first directive, .assembly extern, informs the assembler that we are going to be using objects and methods within the mscorlib assembly, i.e. the Console class and its WriteLine method.
Notice that already we have objects and the .NET Framework involved in our assembler.
The next two directives simply give the program that we are creating an assembly and module name. Without these the assembler and the runtime don’t really know what to do with our program – declaring it to be an assembly means that it can be run.
The next line declares a class and states that it inherits from Object
We then define a CIL managed static method:
All of this is more evidence of object orientation and use of the Framework.
The entrypoint directive marks where the program should be started from and every runnable program has to have one:
Finally we get to some IL instructions, and it’s all over very quickly!
The ldstr, i.e. LoadString, instruction loads the string “Hello World” onto the stack. The call instruction calls the WriteLine method of the static Console class.
The method picks up its parameters from the stack and what looks like a parameter definition, i.e. (string), is a type definition that says that the top of stack item is to be a string. The void return type means that the method doesn’t leave a return value on the stack. The ret, or Return, completes the method and our program.
As you can see even this simple example demonstrates how stack- and object-oriented the language is and how it uses both typing and the Framework.
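To see how parameters and return values travel via the stack, it can help to contrast the void WriteLine call with a method that does return something. The following is only a sketch: the assembly name ConcatDemo and the choice of String.Concat are assumptions made for illustration, and the comments track what is on the stack after each instruction.

```il
.assembly extern mscorlib {}
.assembly ConcatDemo {}

.class ConcatDemo.Program extends [mscorlib]System.Object
{
  .method static void Main() cil managed
  {
    .entrypoint
    ldstr "Hello "              // stack: "Hello "
    ldstr "World"               // stack: "Hello ", "World"
    // Concat pops both strings and pushes its string result
    call string [mscorlib]System.String::Concat(string, string)
                                // stack: "Hello World"
    // WriteLine pops the string; void means nothing is pushed back
    call void [mscorlib]System.Console::WriteLine(string)
    ret                         // the stack is empty on return
  }
}
```

Here the non-void return type of Concat is what leaves a value on the stack for the next call to consume.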
Of all of the features of IL, the one that high level language programmers tend to find strange is the central role the stack plays.
In this case the stack is a little more sophisticated than a simple block of memory with a pointer. You need to think of it in terms of a strongly typed stack made up of “slots” that hold a complete data type.
When you push data onto and pop data off the stack it always works in terms of a complete data type. Nearly all IL instructions work by popping input data from the stack and pushing their results on the stack.
As an example, let’s add two numbers together.
The first instruction is ldc or LoaD Constant
This pushes the 4-byte integer constant, i.e. an Int32, onto the stack. The ldc part of the instruction tells you what is to happen and the code after the dot tells you the data type - i4 is a four-byte integer in this case. There are ldc instructions for all the standard numeric data types - for example ldc.r8 pushes a float64 on the stack.
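As a rough sketch of the naming pattern, the program below pushes one constant of each of the four common ldc forms and discards it again with pop. The assembly name LoadConstants and the particular values are made up for the example, and the program deliberately prints nothing:

```il
.assembly extern mscorlib {}
.assembly LoadConstants {}

.class LoadConstants.Program extends [mscorlib]System.Object
{
  .method static void Main() cil managed
  {
    .entrypoint
    ldc.i4 1        // push a 4-byte integer (int32)
    pop             // discard it again
    ldc.i8 1        // push an 8-byte integer (int64)
    pop
    ldc.r4 1.5      // push a 4-byte float (float32)
    pop
    ldc.r8 1.5      // push an 8-byte float (float64)
    pop
    ret             // the stack must be empty here
  }
}
```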
To perform an add we need two values on the stack so we need to push another int32 and then give the add command:
The add command takes the top two items on the stack, adds them together and pushes the result back on the top of the stack.
Now we have the result of adding 1 and 2, i.e. 3, on the top of the stack and we can use WriteLine again to display the result and issue a return to complete the job:
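Put together as a complete program, the steps might look like the sketch below. The assembly and class names (AddDemo) are illustrative assumptions, and the comments show the contents of the stack after each instruction:

```il
.assembly extern mscorlib {}
.assembly AddDemo {}

.class AddDemo.Program extends [mscorlib]System.Object
{
  .method static void Main() cil managed
  {
    .entrypoint
    ldc.i4 1        // stack: 1
    ldc.i4 2        // stack: 1, 2
    add             // pops both values, pushes the sum; stack: 3
    // The (int32) signature selects the integer overload of WriteLine
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }
}
```

Assembled and run in the same way as the Hello World program, this prints 3.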
Notice that everything is still strongly typed and the add instruction can discover the type of the two items on the top of the stack and push an appropriate type back on the stack – you can try the same with floating point numbers:
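A floating point version could be sketched as follows. The values 1.5 and 2.25 and the name AddFloat are arbitrary choices for the example. Note that the WriteLine signature has to change to float64 to match what add leaves on the stack:

```il
.assembly extern mscorlib {}
.assembly AddFloat {}

.class AddFloat.Program extends [mscorlib]System.Object
{
  .method static void Main() cil managed
  {
    .entrypoint
    ldc.r8 1.5      // push a float64
    ldc.r8 2.25     // push a second float64
    add             // same instruction as before, now a floating point add
    call void [mscorlib]System.Console::WriteLine(float64)
    ret
  }
}
```

The add instruction itself is unchanged. Only the types of the constants and of the WriteLine parameter differ from the integer version.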
The range of primitive data types available to you is similar to that in C# or VB, with some changes to the names used.
Last Updated (Tuesday, 18 February 2014)