Advanced loops

Written by Administrator

Thursday, 18 March 2010


Multiple exits

Things get confusing when you start to create loops with multiple exit points. This happens naturally enough when you are first introduced to the break statement. PHP has no dedicated infinite loop statement, so rather than using break to move the single exit condition into the body of the loop, programmers often use it to add another exit to a while, do-while or for loop.

For example:

while(conditionA){
 instructions1;
 if(conditionB)break;
 instructions2;
}

Now we have two exit points and two exit conditions. The loop can end because of conditionA or conditionB and in either case control passes to the instruction following the loop.

This is the problem with multiple exit points. When control reaches the instruction following the loop you have no idea which condition caused the loop to end. If the rest of the processing doesn't depend on how the loop ended everything is fine, but in most cases you will have to perform at least one of the tests again.
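As a rough illustration of having to repeat a test, here is a minimal sketch of the pattern above with concrete conditions filled in. The data, the limit and the variable names are invented for the example: conditionA is "there are more values to read" and conditionB is "the running total has passed the limit".

```php
<?php
// Hypothetical data: stop when the list runs out or the total passes a limit.
$values = [3, 8, 5, 9, 2];
$limit  = 15;
$total  = 0;
$i      = 0;

while ($i < count($values)) {      // conditionA: more values to read
    $total += $values[$i];         // instructions1
    if ($total > $limit) break;    // conditionB: the second exit
    $i++;                          // instructions2
}

// Control arrives here from either exit, so the test has to be repeated.
if ($total > $limit) {
    echo "stopped early at index $i, total $total";
} else {
    echo "used every value, total $total";
}
```

With this data the loop leaves through the break on the third value, but nothing after the closing brace records that fact, which is why the comparison with $limit appears twice.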

There is also the objection that if loops have multiple exit points this makes it much harder to read and understand a program. If you see a simple while loop you know that the condition clearly indicated at the start is the only one that will bring the loop to a close. However, if the style of PHP in use encourages multiple exit points you have to scan through perhaps a long loop and make a list of the exit conditions and where they happen.

This said there are times when multiple exit points are natural and might make the loop simpler. Let's look at the most common example.

The search loop

The most common occurrence of multiple exit points in loops is when you are searching for something in some data. In practice this can become very complicated depending on the nature of the data but for simplicity let's search a string for the occurrence of the first asterisk. Of course you might search, but you also might not find and it is this that makes the multiple exit point appropriate. You want to search the entire string so you want a loop that will examine each of its characters in turn - and this implies a for loop:

for($i=0;$i<strlen($a);$i++){
  ...
}

This loops for $i equal to zero to one less than the length of the string - given by strlen($a). The reason why we want it to be one less than the string length is that the characters that make up a string are numbered from zero. The loop will execute until the end of the string is reached - but wait! If we find the character we are looking for we don't want to carry on looping. Clearly the right thing to do is break out of the loop as soon as we find the target character:

for($i=0;$i<strlen($a);$i++){
  if($a[$i]=="*")break;
}

This works well but, and often it is a big but, when the loop ends it could be that we have found the asterisk or it could be that the loop has just scanned the entire string and given up. Two exit points mean that when we reach the end of the loop we can't be sure why.
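To make the two outcomes concrete, the following sketch wraps the search loop in a function and reports where the loop index ends up. The function name finalIndex and the sample strings are made up for this illustration.

```php
<?php
// Hypothetical helper: returns the value of $i when the search loop ends.
function finalIndex(string $a): int
{
    for ($i = 0; $i < strlen($a); $i++) {
        if ($a[$i] == "*") break;   // exit 1: asterisk found
    }
    return $i;                      // reached after either exit
}

echo finalIndex("ab*c"), "\n";   // loop left through the break
echo finalIndex("abc"), "\n";    // loop left because the string ran out
```

In the first call the index stops at the position of the asterisk. In the second it stops at the length of the string, one past the last character. Either way control arrives at the same return statement.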

To see why this really matters, try adding code after the loop to display a message saying that the asterisk has been found. You will discover that you basically have to test the string again. It is generally considered bad practice to test the value of the loop index to see if it has exceeded the limit, i.e. the length of the string, and many programming languages explicitly state that you can't rely on the value of the index at the end of a for loop - but PHP isn't one of them. If no asterisk is found $i ends up equal to strlen($a), which is one past the last character, so the repeated test has to check that $i is still inside the string before reading $a[$i]. The solution is to write something like:

for($i=0;$i<strlen($a);$i++){
 if($a[$i]=="*")break;
}
if($i<strlen($a) && $a[$i]=="*")
 echo("found the asterisk");

Even so it still seems wasteful to have to test for something twice. The alternative is rather worse. It is very tempting to combine the test for the exit condition with some processing as in:

for($i=0;$i<strlen($a);$i++){
 if($a[$i]=="*"){
  echo("found the asterisk");
  break;
 }
}

This is arguably very poor style. It saves a little repetition but confuses the loop with what happens after the loop is over. It is far from clear when you look at the loop that the instructions within the if statement are only executed once and represent what happens when the loop is over.

At all times the aim is to keep loops simple. Use while, do-while and for, and only add additional exit points when doing so makes the resulting code easier to understand.
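One way to keep a single exit point for the asterisk search, offered here as a sketch rather than as the approach the article recommends, is to fold the second test into the loop condition. The sample string and the variable $n are assumptions made for the example.

```php
<?php
$a = "find the * here";   // hypothetical sample string
$n = strlen($a);
$i = 0;

// Single exit: the loop stops when the string runs out or an asterisk is seen.
// The && short-circuits, so $a[$i] is never read once $i reaches $n.
while ($i < $n && $a[$i] != "*") {
    $i++;
}

if ($i < $n) {
    echo "found the asterisk at position $i";
} else {
    echo "no asterisk";
}
```

There is still a test after the loop, but the whole exit condition is now visible in one place at the top. For this particular job PHP's built-in strpos function does the search for you and returns false when nothing is found.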

Nesting

Writing a program is a matter of putting together control structures such as if, switch, while, do-while and for to make a range of paths through the program according to what the program should do. There are two distinct ways to put control structures together. You can place them one after another, i.e. concatenate them. This is simplicity itself and there is nothing easier than writing an if followed by a loop followed by another if. The second way is called nesting, where one control structure is placed entirely within another. For example, you can place an if statement inside a loop - see the example in the previous section. Notice that for nesting to work one structure must be completely contained within the other.

If you think of a control structure as a pair of brackets indicating where the structure starts and where it ends e.g. ( ) might represent the start and stop of an if statement then concatenating control statements corresponds to writing strings like ()()()()()() and so on which are comparatively simple - boring even. Nesting on the other hand is when you include one set of brackets within another as in: (()). You can quickly make nested sequences of brackets which look complicated e.g.

(((())()(())))

and as the sequence grows it becomes increasingly difficult to check that the nesting is correct, and making sense of it becomes harder. This is the reason why programmers are usually told to limit the nesting of control structures. You might wonder how to achieve this and the solution relies on user defined functions to encapsulate complexity - but this is another story.
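As a small taste of that idea, the sketch below moves an inner search loop into a function so that the calling code shows only one loop and one if. The function name hasAsterisk and the sample data are invented for the illustration.

```php
<?php
// Hypothetical helper: the inner loop lives here, out of sight of the caller.
function hasAsterisk(string $s): bool
{
    $found = false;
    for ($i = 0; $i < strlen($s) && !$found; $i++) {
        $found = ($s[$i] == "*");
    }
    return $found;
}

$lines = ["plain", "with * star", "none"];   // hypothetical data

// Only two levels of nesting are visible here: a loop and an if.
for ($k = 0; $k < count($lines); $k++) {
    if (hasAsterisk($lines[$k])) {
        echo $lines[$k], "\n";
    }
}
```

Written out in one place, the same logic would need a loop inside a loop with an if inside that.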

The most commonly encountered nested control structure is the nested for loop. For example, consider what the following pair of loops does:

for($i=1;$i<=5;$i++){
  for($j=1;$j<=3;$j++){
    echo($i ." ". $j ."<br/>");
  }
}

Notice the way that the for loops are indented in an attempt to make the nesting clear. What happens if you run the program is that the outer loop repeats for values of $i from 1 to 5. Each time the outer loop repeats the inner loop is obeyed and this repeats for values of $j from 1 to 3. What this means is that the instruction in the inner loop, i.e. the echo in this case, is obeyed for values of $i and $j that run

1,1
1,2
1,3
2,1
2,2
2,3
and so on

Nested loops generate all the combinations of their indices. If you nest three loops you get all combinations of three indices and so on. What if you don't know how many indices you need before the program starts? For example, try to write a PHP program that can generate all combinations of the numbers 1 to 10 taken N at a time - where N is specified when the program runs. The simple answer is that you can't easily do it using nested loops - even though you can do it if you know what N is before the program runs. To create a variable number of nested loops you need another advanced technique - recursion.
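A minimal sketch of how recursion can stand in for a variable number of nested loops is given below. The function name nestedLoops and its parameters are hypothetical, and it reproduces exactly what N nested for loops would print: every ordered set of N index values, with repeats allowed. The example uses a range of 1 to 3 rather than 1 to 10 to keep the output short.

```php
<?php
// Hypothetical helper: behaves like $n nested loops, each running from 1 to $max.
function nestedLoops(int $n, int $max, array $prefix = []): array
{
    if ($n == 0) {
        // Innermost level reached: one complete set of index values.
        return [implode(",", $prefix)];
    }
    $rows = [];
    for ($i = 1; $i <= $max; $i++) {
        // Each pass plays the role of one iteration of the loop at this depth.
        $next = $prefix;
        $next[] = $i;
        $rows = array_merge($rows, nestedLoops($n - 1, $max, $next));
    }
    return $rows;
}

$N = 2;   // in a real program this would be supplied at run time
foreach (nestedLoops($N, 3) as $row) {
    echo $row, "<br/>";
}
```

Each call contains a single for loop and the depth of the recursion supplies the nesting, so N can be chosen while the program runs. The number of rows grows as the range size raised to the power N, so large values of N quickly become impractical.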


Last Updated ( Monday, 22 March 2010 )
