Advanced loops
Written by Administrator   
Thursday, 18 March 2010

The for loop

The for loop is PHP's implementation of the enumeration loop - i.e. a loop that is supposed to repeat a given number of times. The big problem with the for loop, and this is true of every language that uses a for loop modeled on the C for loop, is that it can do too much - it is just too general.

The specification for the for loop is amazingly clever, but it was originally designed to make such loops fast and efficient and to stay close to the underlying machine language. Today we have little need to stay close to the machine language, but we are still stuck with this archaic form of the for loop. That said, there are many programmers who have made it their life's work to turn the C-style for into a work of ingenuity if not beauty. They will defend it to the last - erroneously of course.

The PHP for instruction is very simple in construction. It consists of three expressions separated by semicolons:

for(exp1;exp2;exp3){
instructions;
}

This is translated into:

exp1;
while(exp2){
 instructions;
 exp3;
}

That is, exp1 is executed once before the loop starts, exp2 is evaluated at the start of each repeat as the condition of the while, and exp3 is executed once at the end of each repeat.

For example:

for($i=1;$i<=10;$i++){
echo($i);
}

translates to:

$i=1;
while($i<=10){
echo($i);
$i++;
}

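This equivalence holds for any loop body that doesn't use continue, and you can think of the for loop as just being a shorthand for the while loop that does the same job. The one difference is that a continue inside a for still executes exp3 before the condition is tested again, whereas in the while form it would skip it.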
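As a quick check of this translation, the sketch below runs a for loop and its hand-written while counterpart and prints what each one visits. The function names are made up for illustration. It also covers the one case that needs care: a continue inside a for still runs the third expression, so the while version has to do its own increment before continuing.

```php
<?php
// Hypothetical helper: collect the odd values of $i using a for loop.
function oddsWithFor(int $limit): array {
    $seen = [];
    for ($i = 1; $i <= $limit; $i++) {
        if ($i % 2 === 0) {
            continue;           // exp3 ($i++) still runs before the next test
        }
        $seen[] = $i;
    }
    return $seen;
}

// The same loop written as a while: exp1 first, exp2 as the condition,
// exp3 at the end of the body and also before any continue.
function oddsWithWhile(int $limit): array {
    $seen = [];
    $i = 1;
    while ($i <= $limit) {
        if ($i % 2 === 0) {
            $i++;               // without this the loop would never advance
            continue;
        }
        $seen[] = $i;
        $i++;
    }
    return $seen;
}

echo implode(',', oddsWithFor(10)), "\n";   // 1,3,5,7,9
echo implode(',', oddsWithWhile(10)), "\n"; // 1,3,5,7,9
```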

There is nothing wrong with this method of implementing a for loop - there are simpler forms that make it easier for the beginner but PHP was designed with C in mind. The problems start when programmers realise how clever the idea is and decide to use it in ways that take it beyond the simple enumeration loop. We have already seen that:

for(;;){
instructions;
}

is one way to implement an infinite loop, but you could justify this by claiming that an infinite loop is an enumeration loop - it repeats exactly an infinite number of times. However, it is easy to invent versions that are much more complicated and well away from the idea of an enumeration loop. This is made possible by the fact that not only can you leave out expressions in the for but you can also write multiple expressions separated by commas.

For example:

for($i=1,$j=2;$i<=5;
print $j,$j=$j+$i,$i++){
}

In principle you can always move the instructions that would normally be in the body of the loop into the last expression. Interestingly, there are exceptions to this rule. For example, you can't use echo because it is a statement that returns no value and so can't appear inside an expression, but you can use print because, although it is also a language construct rather than a true function, it returns a value (always 1) and so can be used in an expression. This sort of idiom is mostly bad programming. It may be clever, but it doesn't achieve anything in terms of clear expression of intent.
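To make the comparison concrete, here is a sketch that runs the compact loop above next to a version with the work moved back into the body. The function names are invented for the example, and output buffering is used only so that the printed text can be returned as a string.

```php
<?php
// The compact form from the text: all the work sits in the third expression.
function compactLoop(): string {
    ob_start();                 // capture what print writes
    for ($i = 1, $j = 2; $i <= 5; print $j, $j = $j + $i, $i++) {
    }
    return ob_get_clean();
}

// The same calculation with the work moved back into the loop body.
function plainLoop(): string {
    $out = '';
    $j = 2;
    for ($i = 1; $i <= 5; $i++) {
        $out .= $j;             // what the print in the compact form emits
        $j = $j + $i;
    }
    return $out;
}

echo compactLoop(), "\n"; // 235812
echo plainLoop(), "\n";   // 235812
```

Both produce the same text, but only the second makes it obvious which variable counts the repeats and which one is the running result.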

Another very common pattern encountered in both PHP and C is the use of a predicate function, i.e. a function that returns true or false within a for loop to terminate the loop. This in turn generalises to the use of functions for all three expressions. For example:

for(init();test();next()){
instructions;
}

where the init function performs initialisation before the loop starts, the test function returns false when the loop should end and the next function moves the loop on to the next repeat in whatever way is appropriate. Whether or not this is good coding depends very much on what the alternative to this approach would be. If the use of functions makes the intention clear and represents an enumeration loop as an advanced for loop then it is desirable.
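PHP's own array-pointer functions give one concrete instance of this pattern: reset() plays the part of init, a check on key() plays the part of test, and the built-in next() moves on. Because next() is already defined by PHP, your own step function would need a different name. A minimal sketch, with joinWithPointer as a made-up name:

```php
<?php
// Walk an array with its internal pointer, using a function call
// for each of the three for expressions.
function joinWithPointer(array $items): string {
    $parts = [];
    for (reset($items); key($items) !== null; next($items)) {
        // key() and current() read the element the pointer is on
        $parts[] = key($items) . '=' . current($items);
    }
    return implode(',', $parts);
}

echo joinWithPointer(['a' => 1, 'b' => 2, 'c' => 3]), "\n"; // a=1,b=2,c=3
```

In everyday code foreach does this job more clearly. The sketch is only meant to show the init, test and next roles in a loop that is still recognisably an enumeration.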

Loop efficiency

Finally, it is worth saying that there are many issues of efficiency associated with loops. If you do something once then generally speaking it doesn't affect the running time of a program. If you do something a thousand or more times, how long each repetition takes matters. In other words, when you are trying to make your PHP programs work faster, it's the loops you should spend time looking at.

In the case of the for loop it is generally said that it is important to remember that the second and third expressions are executed each time the loop repeats. This is indeed good advice but it is just a special case of "everything in the loop body gets repeated". You should always attempt to move instructions that don't need to be repeated out of the loop and try to avoid re-evaluating expressions within the loop that always have the same value. For example:

for($i=1;$i<=$total+$tax/20;$i++){
$rate=$tax/$total/3.145;
$days=$i*10/$rate;
}

This loop, which is complete nonsense, is unnecessarily inefficient. Each time through the loop the second expression in the for is evaluated:

$i<=$total+$tax/20;

but the calculation on the right is the same each time through the loop as $total and $tax do not change within the loop. A much better way of doing the same job is to calculate the value that depends on $total and $tax just once before the loop starts:

$max=$total+$tax/20;
for($i=1;$i<=$max;$i++){
$rate=$tax/$total/3.145;
 $days=$i*10/$rate;
}

Now only $i<=$max is repeated, not the entire calculation. The only problem with this approach is finding a suitable name for the variable used to store the temporary result.
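One way to see how often the bound really is evaluated is to count the evaluations. In the sketch below $getLimit is a hypothetical stand-in for any expression used as the loop bound, and countLimitCalls is a name invented for the example.

```php
<?php
// Returns [evaluations with the bound in the condition, evaluations with it hoisted].
function countLimitCalls(int $limit): array {
    $calls = 0;
    $getLimit = function () use (&$calls, $limit): int {
        $calls++;               // record every evaluation of the bound
        return $limit;
    };

    // Bound inside the condition: evaluated at every test,
    // including the final test that ends the loop.
    for ($i = 1; $i <= $getLimit(); $i++) {
    }
    $inCondition = $calls;

    // Bound computed once before the loop starts.
    $calls = 0;
    $max = $getLimit();
    for ($i = 1; $i <= $max; $i++) {
    }
    $hoisted = $calls;

    return [$inCondition, $hoisted];
}

echo implode(',', countLimitCalls(10)), "\n"; // 11,1
```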

If you look at the loop again you can see that $rate is computed each time through the loop and it always works out to the same value because $tax and $total don't change in the loop. So as well as applying the "move things out of the loop" rule to the expressions in the for instruction, we also have to apply it to the rest of the loop:

$rate=$tax/$total/3.145;
$max=$total+$tax/20;
for($i=1;$i<=$max;$i++){
 $days=$i*10/$rate;
}

There is one more tiny optimisation that can be made, but most programmers would probably leave the loop as written unless they were really serious about saving a few milliseconds. The instruction in the loop computes $i*10/$rate, which amounts to multiplying $i by 10/$rate. Instead of doing this division each time through the loop, why not move it out of the loop and do it just once:

$rate=$tax/$total/3.145;
$max=$total+$tax/20;
$tenRate=10/$rate;
for($i=1;$i<=$max;$i++){
 $days=$i*$tenRate;
}

Now this is probably about as far as you want to go with optimisation and even this might be a step or two too far because the resulting loop is now far less clear in what it is calculating due to the use of strange temporary variables. You don't have to optimise every loop.
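When you rearrange a calculation like this it is worth checking that the result hasn't changed. The sketch below wraps the first and last versions of the loop in functions and compares what they compute. The function names and the sample values for $total and $tax are assumptions made for the example, and the results are compared with a small tolerance because reordering floating point arithmetic can alter the last few digits.

```php
<?php
// The loop as first written, collecting each value of $days.
function daysOriginal(float $total, float $tax): array {
    $days = [];
    for ($i = 1; $i <= $total + $tax / 20; $i++) {
        $rate = $tax / $total / 3.145;
        $days[] = $i * 10 / $rate;
    }
    return $days;
}

// The fully hoisted version from the text.
function daysHoisted(float $total, float $tax): array {
    $rate = $tax / $total / 3.145;
    $max = $total + $tax / 20;
    $tenRate = 10 / $rate;
    $days = [];
    for ($i = 1; $i <= $max; $i++) {
        $days[] = $i * $tenRate;
    }
    return $days;
}

// Largest absolute difference between two result lists of equal length.
function maxDifference(array $a, array $b): float {
    $worst = 0.0;
    foreach ($a as $k => $value) {
        $worst = max($worst, abs($value - $b[$k]));
    }
    return $worst;
}

$a = daysOriginal(200.0, 40.0);   // sample inputs, chosen only for illustration
$b = daysHoisted(200.0, 40.0);
var_dump(count($a) === count($b));        // same number of repeats
var_dump(maxDifference($a, $b) < 1e-6);   // same values to within rounding
```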

After the basic rule of moving everything you can out of the loop, there are very few general loop optimisation methods. The only really general one is "loop unrolling", where you simply write the loop out explicitly or make the work done at each repeat a bigger chunk of what is to be done. For example, if you have a function which works with some files you might write:

for($i=1;$i<=1000;$i++){
 processfile($i);
}

If this is working too slowly you could try:

for($i=1;$i<=1000;$i+=5){
 processfile($i);
 processfile($i+1);
 processfile($i+2);
 processfile($i+3);
 processfile($i+4);
}

This steps through the files five at a time, but notice that in most cases the only gain comes from reducing the overhead involved in restarting the loop. The unrolled version of the loop restarts one fifth as often as the original, 200 times rather than 1000. Usually these gains aren't worth the effort, and if you really want to make a loop go faster you are generally in search of a clever alternative algorithm.
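The unrolled loop above relies on 1000 being a multiple of five. When the count might not divide evenly, a common approach is to follow the unrolled loop with a short clean-up loop for the leftovers. A sketch, in which processAllUnrolled is a made-up name and a closure that records its argument stands in for processfile:

```php
<?php
// Process items 1..$count five at a time, then handle any remainder.
function processAllUnrolled(int $count): array {
    $log = [];
    // Hypothetical stand-in for processfile(): it only records the number.
    $process = function (int $n) use (&$log): void {
        $log[] = $n;
    };

    // Main loop: runs only while a full group of five remains.
    for ($i = 1; $i + 4 <= $count; $i += 5) {
        $process($i);
        $process($i + 1);
        $process($i + 2);
        $process($i + 3);
        $process($i + 4);
    }

    // Clean-up loop: the zero to four items left over.
    for (; $i <= $count; $i++) {
        $process($i);
    }

    return $log;
}

echo implode(',', processAllUnrolled(12)), "\n"; // 1,2,3,4,5,6,7,8,9,10,11,12
```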

Last Updated ( Monday, 22 March 2010 )