Applying C - Assembler
Written by Harry Fairhead   
Monday, 11 November 2019

Rotate a Global

Another good example is adding a rotate command to C. For x64 this is fairly easy and to rotate right by two bits you would use:

#include <stdio.h>
#include <stdlib.h>
int myA = 1;
int main(int argc, char** argv) {
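    __asm__ (
            "pushq %rax \n\t"              /* save rax */
            "movl myA(%rip),%eax \n\t"     /* load myA into eax */
            "ror $2,%eax\n\t"              /* rotate right by two bits */
            "movl %eax,myA(%rip) \n\t"     /* store the result back in myA */
            "popq %rax \n\t"               /* restore rax */
            );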
    printf("%X", myA);
    return (EXIT_SUCCESS);
}

Notice that you still have an overhead of pushing and popping the register, getting the value from the variable and putting it back again. If the compiler you are using recognizes the usual idiom for rotate then it is likely to generate faster code.
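The text does not spell out the rotate idiom, so the following is a sketch of one widely used form rather than the only possibility. The function name rotr32 is made up for illustration. GCC and Clang will generally turn this pattern into a single rotate instruction when optimization is enabled, although that depends on the compiler and its settings.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits using only C operators.
   Masking both shift counts keeps them in the range 0..31, so the
   expression stays well defined even when n is 0 or 32. */
static inline uint32_t rotr32(uint32_t x, unsigned int n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Calling rotr32(myA, 2) gives the same result as the assembler version, with no register to save and restore by hand.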

For 32-bit ARM, specifically the Raspberry Pi, the equivalent is:

#include <stdio.h>
#include <stdlib.h>
int myA = 1;
int main(int argc, char** argv) {
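    __asm__ (
            "PUSH {r1,r2} \n\t"       /* save r1 and r2 */
            "ldr r2,=myA \n\t"        /* r2 = address of myA */
            "ldr r1,[r2] \n\t"        /* r1 = value of myA */
            "ror r1,r1,#2 \n\t"       /* rotate right by two bits */
            "str r1, [r2] \n\t"       /* store the result back in myA */
            "POP {r1,r2} \n\t"        /* restore r1 and r2 */
            );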
    printf("%X", myA);
    return (EXIT_SUCCESS);
}

Again, if the compiler recognizes the usual idiom for ror in C, it is likely to generate faster code.

You can carry on in this way, using the assembler output to guide how you should integrate your code with the generated code, but it isn't the best way to do the job. The problem is that it depends on the way the compiler works and this can change. Optimizations can also change the assembler output in ways that can break your code. A better way to do any complicated assembler task is to use the extended inline assembler command.

Extended Asm

Although the asm commands we have been using so far have been within a function, i.e. main, they have no operand lists and so the compiler treats them as basic asm. Extended asm provides some extras we can take advantage of. You can use special symbols to get the compiler to insert variable references, and even select registers, for you. In other words, you don't write finished, complete assembler to be inserted into the generated assembly language, you write an assembler template that the compiler completes for you.

The simplest of the extended forms of asm is:

__asm__( code template
         : output operand list
         : input operand list
       )

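This can only be used inside a function. An asm with no output operands is automatically treated as volatile, but one that has outputs should be marked as volatile if it does more than compute those outputs, as otherwise the compiler might optimize it away when the outputs are not used. The lists following the colons are optional. You can leave out a list by simply writing an empty list.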

You might think that the operand lists would just be names of variables, but extended asm is more sophisticated and hence initially more confusing. It uses a system of "constraints" to help the compiler work out how to deal with the C variable. For example, a constraint of r tells the compiler to use a register for the operand. The compiler keeps a table of which registers are in use so that it can allocate a register to use to store the value without having to save and restore the register.
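As a minimal sketch of the form, assuming GCC or Clang targeting x86 with the default AT&T syntax, the following copies one variable to another. The function name copy_value is made up for illustration, and other targets fall back to plain C so the code still compiles.

```c
#include <stdint.h>

/* Copy src into dst with a single instruction.
   %0 refers to the first operand (dst) and %1 to the second (src).
   The "r" constraint asks the compiler to pick a register for each;
   the "=" in front marks dst as an output. */
static inline uint32_t copy_value(uint32_t src) {
    uint32_t dst;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %1, %0"   /* AT&T order: source first, destination second */
            : "=r"(dst)     /* output operand list */
            : "r"(src));    /* input operand list */
#else
    dst = src;              /* plain C fallback on other targets */
#endif
    return dst;
}
```

No register is named and nothing is pushed or popped: the compiler chooses registers it knows are free.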

The code template can contain any valid assembler, just as in the case of the basic asm, but you can also include tokens that the compiler will replace with registers or references to variables. All of the tokens start with % and if you need to include a % symbol in your assembler simply double it up to %%. Also if you need to include {, } or | in the code then put a % in front.
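The doubling rule can be seen in a small sketch, again assuming GCC or Clang on x86 with AT&T syntax. The function name via_eax is made up, and the "a" constraint used here is an x86-specific one that places the operand in the eax register.

```c
#include <stdint.h>

/* Move a value into eax by naming the register explicitly.
   %1 is a token that the compiler replaces with the input operand;
   %%eax is emitted as the literal register name %eax. */
static inline uint32_t via_eax(uint32_t in) {
    uint32_t out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %1, %%eax"
            : "=a"(out)     /* x86-specific: out is read back from eax */
            : "r"(in));
#else
    out = in;               /* plain C fallback on other targets */
#endif
    return out;
}
```

Writing a single % in front of eax here would be rejected, because the compiler would try to interpret it as an operand token.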

The main use of tokens is to identify the operands defined in the output and input operand lists, and to understand how this works we have to look at the way these lists are constructed.

 

Operands in the input or output lists take the form:

[assembler name] "constraint" (variable name)

where the variable name is the name of the variable in the C program you want to use. The assembler name is the name that will be used in the assembler code for the same variable. The constraint, written as a string in double quotes, specifies how the variable is to be handled.

If you don't specify an assembler name then you have to refer to operands using their position in the list. For example, %0 is the first operand in the list, %1 the second and so on. For readability you should always specify an assembler name and it is a good idea to make it the same as the variable name. At the moment the limit is no more than 30 operands in total.
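To compare the two ways of referring to operands, here is the same operation written twice, as a sketch assuming GCC or Clang on x86 with AT&T syntax. The function names are made up for illustration and other targets use a plain C fallback.

```c
#include <stdint.h>

/* Positional form: %0 is result, %1 is value. */
static inline uint32_t negate_positional(uint32_t value) {
    uint32_t result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %1, %0\n\t"
            "negl %0"
            : "=r"(result)
            : "r"(value));
#else
    result = 0u - value;    /* plain C fallback on other targets */
#endif
    return result;
}

/* Named form: the same template, with names matching the C variables. */
static inline uint32_t negate_named(uint32_t value) {
    uint32_t result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %[value], %[result]\n\t"
            "negl %[result]"
            : [result] "=r"(result)
            : [value] "r"(value));
#else
    result = 0u - value;    /* plain C fallback on other targets */
#endif
    return result;
}
```

In the named form an operand can be added or reordered without renumbering every token in the template.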

The rules for input and output operands are slightly complicated.

Output operands have an initial constraint modifier of =, which means you can only use them as a destination in an instruction. You can use + in place of =, which means they are also input operands and can be used as source in an assembler instruction.

An output operand is automatically transferred to the C variable when the assembler ends. An input operand is automatically transferred from the C variable to a register, or to whatever location its constraint allows, at the start of the assembler. An output operand that has a + modifier is transferred from and to the C variable when the assembler starts and ends.

If you use an output-only operand as a source in an assembler instruction you cannot be sure it will have the same value as the variable in the C program, as the compiler doesn't bother generating an instruction to initialize the operand from the C variable. This means that there are three types of operand:

output only: included in the output operand list with a modifier of =

output and input: included in the output operand list with a modifier of +

input only: included in the input operand list
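The three kinds of operand can be seen in a rotate right by two bits, written here as a sketch assuming GCC or Clang on x86 with AT&T syntax. The function names are made up for illustration and other targets use a plain C fallback.

```c
#include <stdint.h>

/* Output and input ("+r"): value is loaded into a register before the
   instruction runs and copied back to the C variable afterwards. */
static inline uint32_t rotate_right_2(uint32_t value) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__("rorl $2, %[value]"
            : [value] "+r"(value));
#else
    value = (value >> 2) | (value << 30);   /* plain C fallback */
#endif
    return value;
}

/* Output only ("=r") plus input only ("r"): dst starts out holding
   nothing useful, so the template has to fill it from src first. */
static inline uint32_t rotated_copy(uint32_t src) {
    uint32_t dst;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %[src], %[dst]\n\t"
            "rorl $2, %[dst]"
            : [dst] "=r"(dst)
            : [src] "r"(src));
#else
    dst = (src >> 2) | (src << 30);         /* plain C fallback */
#endif
    return dst;
}
```

With the + modifier the compiler generates the load and the store, which is why rotate_right_2 needs only the one instruction.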

As already mentioned, a constraint of r means use a register for the operand. It is the most commonly used constraint, but there are many others including a lot that are machine-specific. For example, m means use a memory location. If you specify rm this means the compiler can use either a register or a memory location according to what is optimal. In general it is better to use as many constraints as allowed to let the compiler work out the best way to do the job. However, it is also possible for the compiler to get it wrong and create an illegal instruction.
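A sketch of the m and rm constraints, assuming GCC or Clang on x86 with AT&T syntax, might look like the following. The function names are made up for illustration and other targets use a plain C fallback.

```c
#include <stdint.h>

/* "+m": the rotate works directly on the variable in memory, so the
   template contains no explicit load or store. */
static inline void rotate_in_memory(uint32_t *value) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__("rorl $2, %[value]"
            : [value] "+m"(*value));
#else
    *value = (*value >> 2) | (*value << 30);   /* plain C fallback */
#endif
}

/* "rm": amount may be supplied in a register or left in memory.
   addl accepts either as its source when the destination is a register. */
static inline uint32_t add_rm(uint32_t sum, uint32_t amount) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__("addl %[amount], %[sum]"
            : [sum] "+r"(sum)
            : [amount] "rm"(amount));
#else
    sum += amount;                             /* plain C fallback */
#endif
    return sum;
}
```

The rm constraint is only safe here because the instruction really does accept both forms for that operand. Offering a choice the instruction does not support is one way to end up with the illegal instruction mentioned above.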



Last Updated ( Monday, 11 November 2019 )