Page 3 of 5
The answer to these questions depends on whether or not the pivot is an actual value in the array. It is usual to pick a pivot value that is in the array and this leads to the classical version of the algorithm that uses the pivot as a “sentinel” which stops both scans falling off the end of the array.
Simple just make sure that neither the left or the right pointer can pass the pivot value.
So the algorithm for the left pointer L is
And for the right pointer it is
As long as pivot is in the list of values we can be sure that both loops end because in this case for some L, A(L)=pivot and for some R A(R)=pivot and hence both loops come to rest on the pivot.
The scan and swap
So both loops scan along until they find a value in the wrong place and then they swap the values over.
Two questions - what happens when one of the scans hits the pivot and how do the scans end?
The important thing is that for the pivot to be handled correctly the scan pointers must not be updated before the loops are started again. That is if you try
Then the update will be correct for all values that are swapped except when one of them is the pivot.
The reason is that for all values except the pivot the swap places them in the correct position never to be moved – but the pivot isn’t often moved to the correct location in one go.
For example, if the L pointer reaches the pivot it will be swapped with whatever value the R pointer has reached and will be swapped. This gives a value at the L pointer which is less than the pivot – which is what is required - but at the R pointer the new value isn’t greater than the pivot (it is the pivot!) and so may have to be moved. The R pointer has to stay put on the pivot while the L pointer searches to see if it can find any values larger than the pivot to swap with it.
Notice that in this version of the scan we are actually partitioning the array into three parts – values less than the pivot, the pivot and values larger than the pivot.
The pointers meet at the pivot and divid the array into three parts
This three-part division allows us to guarantee that the next stage always has a smaller array to sort. If the array was divided into two parts – values smaller than or equal to the pivot and values larger than the pivot this would not always be the case.
Imagine picking the largest value in the array for the pivot! In this case the left-hand portion would be the whole array and the right hand portion would be null. You’d simply end up calling the QS algorithm on the entire array again.
In the three-part division, selecting the maximum or minimum as the pivot, which can happen by chance, still reduces the array to be sorted by one because the pivot itself works its way to the top or bottom of the array and doesn’t play any further role in the sorting. That is, when the pointers meet and L=R the array to be sorted is always L-1 and R+1 because the pivot value is now stored at data(L)=data(R) and never has to be moved again.
Subtle isn’t it!
Great and it all works – well in most cases.
This is the classical solution to the scan and swap algorithm but it doesn’t always work. It only works if there is exactly one pivot value in the array. If there are two or more pivot values then the routine will simply hang in the infinite loop swapping the equal pivot values backwards and forwards.
I have seen this version of the algorithm presented without any comment concerning when it works or its validity. Be warned there are versions of the QS algorithm out there that not only assume that the pivot is in the list of values they also assume that the pivot is in the list exactly once.
So what is the solution?