A Simple Virtual Machine
Written by Alexey Lyashko   
Wednesday, 01 February 2012

Virtual Machines have more uses than you might imagine. We have a real example of how a VM can be used to increase the security of your code.


In computing, a virtual machine (VM) is a software implementation of either an existing or a fictional hardware platform.

VMs are generally divided into two classes:

  • system VMs
    VMs capable of running an operating system;

  • process VMs
    roughly speaking, VMs that can only run one executable.

Anyway, if you are just interested in the definition of the term, read the article on Wikipedia.

There are tons of articles dedicated to this matter on the Internet, hundreds of tutorials and explanations. I see no reason to just add another "trivial" article or tutorial to the heap.

Instead, I think it may be more interesting to see the idea in action, to have an example of a real application. One may say that we are surrounded by examples: Java, .NET, etc. This is correct; however, I would like to touch on a slightly different application of this technology: protecting your software/data from being hacked.

Data Protection

Millions of dollars are being spent by software (or content) vendors in an attempt to protect their products from being stolen or used in any other illegal way. There are numerous protection tools and utilities, starting with simple packers/scramblers and ending with complex packages that implement multilevel encryption and virtual machines as well.

You may disagree, but you won't convince me otherwise: an out-of-the-box solution is good only until it gains popularity. There is enough evidence for this statement. In my opinion, no one can protect your software better than you. It only depends on how well protected you want it to be.

Although there are numerous protection methods and techniques, we are going to concentrate on a virtual machine for data coding/decoding. Nothing special, just a trivial XOR method, but, in my opinion, enough to demonstrate the fundamentals.
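
For reference, here is roughly what the XOR method computes when written directly in C rather than for the VM. This is only a sketch: the function name xor_buffer and the repeating-key scheme are assumptions made for illustration, not something taken from the VM code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: XOR every byte of buf with a repeating key.
 * XOR is its own inverse, so calling the function a second time with
 * the same key restores the original data. */
void xor_buffer(unsigned char *buf, size_t len,
                const unsigned char *key, size_t key_len)
{
    assert(buf != NULL && key != NULL && key_len > 0);

    for (size_t i = 0; i < len; ++i)
        buf[i] ^= key[i % key_len];   /* the key wraps around when shorter than the data */
}
```

The same routine serves for both coding and decoding, which is why the method is so convenient for a first VM program.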

Design Your VM

While in real life, hardware design precedes its software counterpart, we can do it in reverse order (it is our own VM, after all). Therefore, we will begin with the pseudo executable file format which will be supported by our VM.

A good idea is to put a header at the beginning of the file.

In order to do so, we have to think about what our file is going to contain. The file could be raw code (remember DOS COM files?), but that would not be interesting enough.

So, let our file be divided into three sections:

  • code section - this section would contain code written in our pseudo assembly language (we'll cover it a bit later);

  • data section - this section would contain all the data needed by our pseudo executable (PE);

  • export section - this section would contain references to all the elements that we want to make visible to the core program.

Let us define the header as a C structure:

typedef struct _VM_HEADER
{
   unsigned int version;        /* Version of our VM. Will be 0x... */
   unsigned int codeOffset;     /* File offset of the code section */
   unsigned int codeSize;       /* Size of the code section in bytes */
   unsigned int dataOffset;     /* File offset of the data section */
   unsigned int dataSize;       /* Size of the data section in bytes */
   unsigned int exportOffset;   /* File offset of the export section */
   unsigned int exportSize;     /* Size of the export section in bytes */
   unsigned int requestedStack; /* Required size of stack in 4-byte blocks */
   unsigned int fileSize;       /* Size of the whole file in bytes */
} VM_HEADER;
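
Before trusting such a header, the host program would probably want to check that every section really lies inside the file. The sketch below shows one way to do that. The names VmHeader, section_in_file and vm_read_header are hypothetical, and the code assumes 4-byte fields stored in the host's own (little-endian) byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fields as the header described above, but with fixed-width
 * types so that every field is exactly 4 bytes (an assumption). */
typedef struct {
    uint32_t version;
    uint32_t codeOffset, codeSize;
    uint32_t dataOffset, dataSize;
    uint32_t exportOffset, exportSize;
    uint32_t requestedStack;   /* in 4-byte blocks */
    uint32_t fileSize;
} VmHeader;

/* Returns 1 if [offset, offset + size) fits in a file of fileSize bytes.
 * Written without additions so that it cannot overflow. */
static int section_in_file(uint32_t offset, uint32_t size, uint32_t fileSize)
{
    return offset <= fileSize && size <= fileSize - offset;
}

/* Hypothetical loader step: copy the header out of a file image and
 * sanity-check it. Returns 0 on success, -1 if the image is malformed. */
int vm_read_header(const unsigned char *image, size_t imageLen, VmHeader *out)
{
    assert(image != NULL && out != NULL);

    if (imageLen < sizeof *out)
        return -1;
    memcpy(out, image, sizeof *out);   /* assumes host byte order matches the file */

    if (out->fileSize != imageLen)
        return -1;
    if (!section_in_file(out->codeOffset, out->codeSize, out->fileSize) ||
        !section_in_file(out->dataOffset, out->dataSize, out->fileSize) ||
        !section_in_file(out->exportOffset, out->exportSize, out->fileSize))
        return -1;
    return 0;
}
```

Once the header passes these checks, requestedStack * 4 gives the number of bytes the host has to allocate for the stack.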

Well, one more thing, and actually the most important one: we need a compiler for our pseudo assembly that would be able to output files of this format.

Fortunately, we do not have to write one (although this may be an interesting task). Tomasz Grysztar has done wonderful work with his Flat Assembler. Although this compiler is intended to compile Intel assembly code, its excellent macro instruction support lets us adapt it to our needs. The skeleton source for our file would look like this:

include 'defs.asm'      ; Definitions of our pseudo
                        ; assembly instructions

; Header =======================

h_version   dd 0x101
h_code      dd _code
h_code_size dd _code_size
h_data      dd _data
h_data_size dd _data_size
h_exp       dd _export
h_exp_size  dd _export_size
h_stack     dd 0x40
h_size      dd size

; Code =========================
_code:
;some pseudo code here

_code_size = $ - _code

; Data =========================
_data:
;some data here

_data_size = $ - _data

; Export =======================
_export:
;export table structures here

_export_size = $ - _export
size = $ - h_version

As simple as that.


The export section deserves special attention. I tried to make it as easy to use as possible. It is divided into two parts:
  1. Array of file offsets of export entries terminated by 0;
  2. Export entries:
    1. File offset of the exported function/variable (4 bytes);
    2. Public name of the exported object (NUL-terminated ASCII string).

In the above example, the export section would look like this:

; Array of file offsets
dd  _f1        ; Offset of '_f1' export entry
dd  0          ; Terminating 0
; List of export entries
_f1:
dd  _function  ; File offset
db  'exported_function_name',0  ; Public name

Save the file as 'something.asm' or whatever name you prefer. Compile it with Fasm.
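
On the host side, resolving an exported name means walking the zero-terminated offset array and comparing the name stored in each entry. The following is a minimal sketch of that lookup, assuming little-endian 4-byte offsets. The names read_u32 and vm_find_export are hypothetical, and returning 0 for "not found" relies on offset 0 always being the header, never an exported object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value at file offset off.
 * Returns 0 (and leaves *v untouched) if it would run past the image. */
static int read_u32(const unsigned char *image, size_t len, size_t off, uint32_t *v)
{
    if (off > len || len - off < 4)
        return 0;
    *v = (uint32_t)image[off]
       | ((uint32_t)image[off + 1] << 8)
       | ((uint32_t)image[off + 2] << 16)
       | ((uint32_t)image[off + 3] << 24);
    return 1;
}

/* Hypothetical lookup: returns the file offset of the exported object
 * called name, or 0 if there is no such export or the table is malformed. */
uint32_t vm_find_export(const unsigned char *image, size_t len,
                        uint32_t exportOffset, const char *name)
{
    assert(image != NULL && name != NULL);

    size_t nameLen = strlen(name) + 1;   /* compare the terminating 0 as well */
    size_t slot = exportOffset;          /* current position in the offset array */
    uint32_t entry, target;

    while (read_u32(image, len, slot, &entry) && entry != 0) {
        if (read_u32(image, len, entry, &target)) {
            size_t nameOff = (size_t)entry + 4;   /* the name follows the 4-byte offset */
            if (nameOff <= len && len - nameOff >= nameLen &&
                memcmp(image + nameOff, name, nameLen) == 0)
                return target;
        }
        slot += 4;
    }
    return 0;
}
```

The value to pass as exportOffset is the one stored in the exportOffset field of the file header.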

Last Updated ( Wednesday, 01 February 2012 )