Applying C - Assembler
Written by Harry Fairhead   
Monday, 11 November 2019

Sometimes the simplest thing to do is to move one level lower and write some assembler. This extract is from my book on using C in an IoT context.


Now available as a paperback or ebook from Amazon.

Applying C For The IoT With Linux

  2. Kernel Mode, User Mode & Syscall
  3. Execution, Permissions & Systemd
    Extract: Running Programs With Systemd
  4. Signals & Exceptions
    Extract: Signals
  5. Integer Arithmetic
    Extract: Basic Arithmetic As Bit Operations
  6. Fixed Point
    Extract: Simple Fixed Point Arithmetic
  7. Floating Point
  8. File Descriptors
    Extract: Simple File Descriptors 
    Extract: Pipes 
  9. The Pseudo-File System
    Extract: The Pseudo File System
    Extract: Memory Mapped Files ***NEW
  10. Graphics
    Extract: framebuffer
  11. Sockets
    Extract: Sockets The Client
    Extract: Socket Server
  12. Threading
    Extract:  Pthreads
    Extract:  Condition Variables
    Extract:  Deadline Scheduling
  13. Cores Atomics & Memory Management
    Extract: Applying C - Cores 
  14. Interrupts & Polling
    Extract: Interrupts & Polling 
  15. Assembler
    Extract: Assembler

Also see the companion book: Fundamental C






This is a book on using C in a POSIX or Linux environment so why should we end with a chapter on assembly language? The reason is that C is close to the machine and this means that your C program isn't that far away from the assembly language the compiler creates. In other words, there is a strong affinity between C and assembler. This book is also about low-level programming as encountered in the IoT, and embedded applications in general, and in this area assembler is sometimes the only way to achieve a result that needs speed or access to hardware that the software makes difficult to get at.

Of course, we have a problem in that if you are going to write assembly language then you need to know how. The good news is that it isn't difficult to learn assembler. The only slight problem is that there are two common dialects for the x86, AT&T and Intel, and there are differences between x86 and x64. There are similar variations for the ARM and other processors.

C Assembler as Text Insertion

The most important thing to remember is that GCC compiles your C program to human-readable assembler. The simplest way to add custom assembly language code into your program would be to simply insert the text into the assembly language file that the compiler creates. You could manually edit the assembler, but you would have to do this following each fresh compile. What the GCC compiler provides is a way for you to tell it what assembly language you want to insert.

The standard way of doing the job in GCC is to use the basic asm statement:

__asm__ (assembler instructions);

The assembler instructions are given as a single string or as a set of adjacent strings, one for each line of assembler, which the compiler concatenates. You have to terminate each line correctly for the assembler in use. For x86 and ARM, the GNU assembler is happy with \n\t, i.e. newline and tab, or with a semicolon. The text that you write is inserted verbatim into the compiler's assembly language output at the point it is encountered, so what you need to write is governed by the assembler. You can generally find out what the conventions are by writing a small C program and compiling it with the -S option to generate an assembler file with extension .s. Using the compiler's assembly language output you can also see what effect your inline assembler is having, and this is the simplest way of finding out how to modify it to make it work.
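As a minimal sketch of the string layout just described, the following function inserts three no-op instructions. The function name emit_nops is hypothetical, and the sketch assumes an x86 or ARM target, where nop is a valid mnemonic and a semicolon separates instructions; on any other target it emits nothing.

```c
/* Hypothetical helper, not from the book: inserts three no-ops so that the
   inserted text is easy to spot in the .s file produced by gcc -S. */
int emit_nops(void) {
#if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
    __asm__(
        "nop\n\t"      /* adjacent string literals are concatenated,      */
        "nop; nop"     /* so the assembler sees a single block of text    */
    );
    return 3;          /* number of nop instructions inserted */
#else
    return 0;          /* unknown target: nothing inserted */
#endif
}
```

Compiling a file containing this function with the -S option and searching the resulting .s file for nop should show the three instructions exactly where the asm statement appears in the function body.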

Basic asm is the only form you can use at the top level, outside of any function, and it has no facilities to let you work with variables in your C code. There may be no facilities, but you can still do the job if you examine the assembler output of the compiler.

For example, most C compilers do not mangle the names of C variables, as C++ does. At worst, a compiler might add an underscore to the start of the name. You can find out what happens in a particular case by declaring a global variable and looking at the assembler generated. GCC on Linux for x86 and ARM doesn't mangle names, so you can simply use C variable names in your assembler. Notice that local variables are stored on the stack or in registers, and their names are not retained in the assembler output. You can access local variables from assembler, but it is more difficult.

You can also place a basic asm block inside a function, including main. Inside a function you can use the extended form of asm, covered in a later section, but if you don't take advantage of its extra features, what you write is just a basic asm block.
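To illustrate top-level use and unmangled names together, here is a minimal sketch that defines a complete function in a basic asm block outside of any C function. The name add_one is hypothetical, and the assembler assumes x86-64 Linux with the System V calling convention (first integer argument in edi, result in eax); any other target falls back to plain C.

```c
int add_one(int x);  /* hypothetical function, defined in the asm below */

#if defined(__x86_64__) && defined(__linux__)
/* Top-level basic asm: the text is copied into the assembler output as-is.
   The label is simply the C name, because names are not mangled here. */
__asm__(
    ".pushsection .text\n\t"
    ".globl add_one\n\t"
    ".type add_one, @function\n"
    "add_one:\n\t"
    "leal 1(%rdi), %eax\n\t"        /* eax = edi + 1 */
    "ret\n\t"
    ".size add_one, .-add_one\n\t"
    ".popsection\n"
);
#else
/* Fallback for other targets so the sketch still builds */
int add_one(int x) { return x + 1; }
#endif
```

Calling add_one(1) from C should return 2. Because the function obeys the calling convention, there is nothing to save or restore, which is one reason a whole function written this way is often easier to get right than a fragment dropped into the middle of compiled code.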


As an example, let's write an assembly language program that adds 1 to a C variable myA and stores the result in myB. Of course, the code depends on what system you are using. For an x64 processor we have the following (the code for a 32-bit processor is slightly different):

#include <stdio.h>
#include <stdlib.h>
int myA = 1;
int myB;
int main(int argc, char** argv) {
    __asm__ (
            "pushq %rax \n\t"
            "pushq %rbx \n\t"
            "movl myA(%rip),%eax \n\t"
            "movl $1, %ebx\n\t"
            "addl %ebx, %eax\n\t"
            "movl %eax,myB(%rip) \n\t"
            "popq %rbx \n\t"
            "popq %rax \n\t"
            );
    printf("%d", myB);
    return (EXIT_SUCCESS);
}


If you know x86 or x64 assembler, this should be easy to understand. Notice that by default, GCC generates AT&T-style assembly language, but you can set an option, -masm=intel, to generate Intel style. The main difference is that AT&T has the source first and the destination second, whereas Intel uses the opposite order. So:

"movl myA(%rip),%eax \n\t"

loads the contents of the C variable myA into the eax register. In x64 compiled assembler, globals are typically addressed relative to the rip register. There is no way you can work this out from first principles; you just have to look at the assembly output and see how it is done in any particular case. After this we store 1 in the ebx register and then add ebx to eax. The final instruction stores eax into the C variable myB. Notice that, to avoid interfering with the rest of the program, we have to save and restore the two registers we use.
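The difference in operand order is easiest to see with the same store written in both dialects. This is a minimal sketch, not the book's code: the names att_dst, intel_dst and store_both are hypothetical, it assumes x86-64 Linux with the compiler left at its default AT&T output (so it must switch back with .att_syntax), and it falls back to plain C on other targets. The variables are declared volatile so that the compiler reads them from memory after the asm has run.

```c
/* Hypothetical globals, volatile so the compiler re-reads them after the asm */
volatile int att_dst = 0;
volatile int intel_dst = 0;

void store_both(void) {
#if defined(__x86_64__) && defined(__linux__)
    /* AT&T order: source first, destination second */
    __asm__("movl $42, att_dst(%rip)");
    /* Intel order: destination first, source second */
    __asm__(
        ".intel_syntax noprefix\n\t"
        "mov dword ptr [rip + intel_dst], 42\n\t"
        ".att_syntax prefix"   /* restore the dialect the compiler emits */
    );
#else
    /* Fallback for other targets so the sketch still builds */
    att_dst = 42;
    intel_dst = 42;
#endif
}
```

After calling store_both(), both variables should hold 42. Storing an immediate value straight to memory uses no registers, so this fragment needs no push and pop to protect the surrounding code.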

The same program is easy enough to create for ARM, but the conventions are different and, again, you need to look at the assembler output to find out how the compiler does things. The same program for the Raspberry Pi's ARM is:

#include <stdio.h>
#include <stdlib.h>
int myA = 1;
int myB;
int main(int argc, char** argv) {
    __asm__ (
            "PUSH {r1,r2,r3,r4} \n\t"
            "ldr r4,=myA \n\t"
            "ldr r2,[r4] \n\t"
            "mov r3,#1 \n\t"
            "add r1,r2,r3 \n\t"
            "ldr r4,=myB \n\t"
            "str r1, [r4] \n\t"
            "POP {r1,r2,r3,r4} \n\t"
            );
    printf("%d", myB);
    return (EXIT_SUCCESS);
}

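It is slightly more complicated because ARM is a load/store architecture: arithmetic works only on registers, so values have to be loaded from memory first and stored back afterwards. PUSH and POP are mnemonics that the assembler converts into the equivalent store-multiple and load-multiple instructions. The instruction: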

"ldr r4,=myA \n\t"

is an assembler pseudo-instruction that loads the register with the address of the variable. The assembler places the address in a nearby literal pool and generates a PC-relative load to fetch it. This is then used in:

"ldr r2,[r4] \n\t"

to load the value at the address in r4 into r2. We then load r3 with 1 and add r2 and r3, storing the result in r1. Finally, the same pseudo-instruction is used to get the address of myB into r4 and then a str instruction is used to store the contents of r1 into the location given by the address in r4. If you are not familiar with ARM assembler, it is worth saying that in str the source and destination are the other way round - presumably so that the register can be specified first.

Last Updated ( Monday, 11 November 2019 )