Fundamental C - Expressions
Written by Harry Fairhead
Tuesday, 24 May 2022
This extract, from my book on programming C in an IoT context, explains the fundamental importance of the expression. From simple to sophisticated, you have to master it.
Fundamental C: Getting Closer To The Machine
Now available as a paperback and ebook from Amazon.
Also see the companion volume: Applying C
One of the main workhorses of any computer language is the expression. This is often under-appreciated as we tend to think of the expression as simply a translation of arithmetic to a programming language – it is much more. The expression is a mini-language in its own right and consists of a set of rules for converting a set of individual data entities into a final result.
The most visible part of the set of rules is the operator which governs what is done to the data, but it isn’t the only component of expression evaluation.
The main trouble with operators is that we know the four arithmetic operators so well that we tend to take them for granted and so miss most of the important ideas.
The best-known example of an operator expression is the arithmetic expression, such as:
A=3+B*C
At first sight this looks like a single command to perform arithmetic, but it isn't. In fact it is a small program in its own right. It is composed of smaller commands and it has a flow of control.
This idea, that an expression is a small program, is remarkably obvious if you write in assembler because most assemblers don't support arithmetic expressions and so the expression program has to be written out using standard commands.
Notice that computers in general can’t do arithmetic, or any other operation, directly in a memory location. Instead the contents of the memory have to be moved to a special internal memory location called a register. A register is a memory location that has additional hardware to allow operations such as arithmetic to be performed.
For example, using a single-register machine, the assembler equivalent of A=3+B*C would be, using an easy-to-understand pseudo assembly language:
    LOADREG  B    load the register from memory location B
    MULTREG  C    multiply the register contents by C
    ADDREG   3    add 3 to the register
    STOREREG A    store the contents of the register in A
This is much more clearly a little subroutine to work something out than A=3+B*C is, but they are entirely equivalent.
Any compiler writer will tell you that the most difficult bit of any compiler is the translation of expressions into good code. Well, at any rate it used to be, but now that the theory is reasonably well understood it's more or less a textbook exercise.
What makes operator expressions interesting is that they have a complex set of rules for determining the flow of control through the component parts of the expression.
For example, in the expression A=3+B*C all programmers know that the * operation is done before the + operation but that doesn't correspond to the order that they are written in.
In a simple left-to-right reading of the expression, the + should come first, and you do get different results depending on the order in which the instructions are obeyed. In a program the order of execution is usually from top to bottom and/or from left to right, so expressions really are different. The basic idea is that each operator has a precedence associated with it and the order in which each operation is carried out depends on this precedence.
In the example above * has a higher precedence than + and so it is executed first. Notice that this precedence can also be seen in the assembly language version of the expression where the last part of the expression was evaluated first.
Of course, you can always use parentheses to explicitly control the grouping of the operators - an expression in parentheses will be treated as the single value that it evaluates to - a sub-expression.
One subtle point is that although most languages, including C, evaluate expressions according to precedence rules, most do not guarantee the order of evaluation of sub-expressions. For example, in evaluating an expression that combines the sub-expressions (1+2) and (3+4), the compiler can choose to work out (1+2) or (3+4) first. Of course, in this case it makes no difference to the result, but there are exceptions - specifically when the sub-expressions have side effects, i.e. change the state of the program. There is much more to say about side effects later.
So far this is about as much as most programmers know about expressions, but there is a little more. For example, an operator can operate on different numbers of data values. The common arithmetic operations are dyadic or binary, that is, they operate on two values as in 1+2, but there are also plenty of monadic or unary operators that operate on a single value, such as -2.
In general an operator can be n-adic without any difficulty apart from how to write them as part of expressions. For example, if I invent the triadic operator @ which finds the largest of three values, the only reasonable way to write this is as:
@ value1, value2, value3
and in this guise it looks more like a function than an operator. This is because there is a very close association between functions and operators. Put simply, you can say that an operator is simply a function that has a priority associated with it. For example, rather than the usual arithmetic operators we could easily get by with the functions ADD(a,b), SUB(a,b), MULT(a,b) and DIV(a,b) as long as they had the same priorities assigned to them.
The question of how to write operator expressions neatly has exercised the minds of many a mathematician. The usual notation that suits dyadic operators, i.e. A+B, is called infix notation, but it doesn't generalize to n-adic operators.
Last Updated (Tuesday, 24 May 2022)