|The Genetic Algorithm|
|Written by Mike James|
|Wednesday, 16 March 2011|
Page 4 of 4
That's all there is to it apart from some cosmetic details. We need a method to display the results in a multiline textbox:
void printPopulationStats(int gen,
and we need a "main program" to make it all happen in a button's click event handler:
private void button1_Click(object sender,
If you try it out you will see the current best fitness rating and the total fitness of the population.
One of the things that worries people when they first use the genetic algorithm is that it often converges onto a solution that isn't the best seen so far. This sub-optimal performance is part of the characteristics of the genetic algorithm, remember it is about breeding good but not necessarily the best solutions to a problem. In order to increase the probability of the optimal solution making it to future generations you should increase the size of the population. If you change the size of the initial and breeding population to 100 or even 1000 you will discover that it takes longer to settle down but suboptimal solutions are less common.
After you have played with the program for a while you might feel inclined to try to improve it. At the simplest level you can improve the form of the display of the progress of the generations - there is certainly scope for graphics.
At a more theoretical level you can change things so that a crossover operation has a specific probability of occurring, i.e. some genes go through to the next generation unchanged. You can add a mutation operator that changes individual bits in a string at random.
A more difficult task is to introduce a scaling of the fitness values so that they are closer together in early generations and more widely spread in later generations. The reason for the need for such scaling becomes apparent if you examine the variation of fitness values in practice. At first they tend to be widely spaced and so exceptional individuals quickly come to dominate in future generations. This means that in later generations there is little difference in fitness values and so convergence is slow. Scaling the fitness values mainly has the effect of avoiding overly rapid convergence to an incorrect solution.
If you know anything about genetics you might be wondering why we have restricted the genetic algorithm to haploid i.e. single stranded chromosomes when most of natures complex organisms are diploid i.e use double stranded chromosomes?
Diploid reproduction has more possibilities because the genes occur in pairs and one will be dominant and the other recessive. A recessive gene only affects the system if it is paired with another recessive gene. Part of the reason for not using a Diploid based genetic algorithm is that we are not entirely sure why nature chooses to use it. It seems to be something to do with protecting genes that were once useful from the current evolutionary pressure that would otherwise destroy them. In other words Diploid genes provide a sort of genetic memory of what worked in times past.
This means that if you are considering a stationary problem there is possibly no advantage to be had over a haploid based algorithm. However, if you are trying to solve the blind knapsack problem, which suffered periodic shifts in its values, then a Diploid algorithm might be better.
As you can tell there is a lot of work yet to be done in the field of genetic algorithms. In particular the use of reordering operators to alter the coding of a problem to make the algorithm more efficient looks promising.
Whatever you make of the genetic algorithm at least you now understand the important of including the term suggested by the cartoon in ALL of your fitness functions!
More cartoon fun at xkcd a webcomic of romance,sarcasm, math, and language
|Last Updated ( Wednesday, 24 April 2013 )|