Introduction To The Genetic Algorithm
Written by Mike James   
Thursday, 18 February 2021
Article Index
Introduction To The Genetic Algorithm
Fitness
Some Examples
Learning to walk

Fitness of a population

First we need a population of units. In this case these are just integer values in the range 0 to 64. That is each unit carries with it a single integer. 

We start off with a population of units with values chosen at random.

The fitness of each unit is simply the value of x squared that it produces i.e. its value squared.

The probability that any particular x value will be selected for breeding is given by its fitness divided by the total fitness of the population. In genetic algorithm applications it is common that the survival rate it scaled by the population's total fitness. 

For example, suppose we have an initial population of just three units with values {3,5,9}

The raw fitness of each unit is simply the square of their value:

{3*3,5*5,9*9}

or

{9,25,81}

 

Then the total fitness is

3*3+5*5+9*9=9+25+81=115

and hence the probability of each unit being selected for breeding is 9/155=0.08, 25/115=0.23, 81/115=0.70 respectively i.e.

{0.08,0.23,0.70}

You can already see that the larger numerical values have a higher probability of breeding which seems to correspond to the common sense idea that offspring should inherit from units that are closer to the maximum. 

Genes and crossover

The next problem is what to use as the genes and how to mix them to create new units.

The simplest idea for a gene in this case is a binary representation of the x value itself.

That is the three genes corresponding to the units in {3,5,9} are {[00011], [00101] ,[01001]}.

In the genetic algorithm, genes often take the form of binary strings. One reason for this is that it makes breeding easier. For example, in this case breeding can be implemented by simply selecting two units at random according to their fitness. Next given two gene strings we select a “cross over” point in each string at random and swap the corresponding portions of the strings.

You can think of this as cutting each gene into two fragments and joining the fragments, one from each gene, back together.

fig1

Gene crossover – a natural biological phenomena?

 

For example if the two last two units are chosen then the genes are [00101] and [01001]. If the random cross over point is 3 (from the right) then the new genes passed on to the pair to the offspring are:

Original Cut fragments New genes
[00|101] [00] [101] [01|101]
[01|001] [01] [001] [00|001]

You can see that the 00 and 01 at the start of each gene swap places to give the new gene. In this case the new members of the population correspond to x=13 and x=1 i.e. one much fitter offspring and one spectacularly unfit offspring.

This is how genetic algorithms work in general and not every offspring of “fit” units is necessarily “fit”.

That really is all there is to it.

Banner

Running the algorithm

Now you simply run the algorithm replacing each pair that you choose to breed by their two offspring and keep monitoring the average and maximum fitness of the population.

If you actually try this out you will find that the population average fitness really does slowly converge to the optimum value of x=64. Of course this isn’t too surprising as with only 64 possible types of unit almost any search method should find the fittest values in a few tries. What is more impressive is that similar problems with much larger search spaces also converge.

The Schema Theorem

There is a theorem, “The Schema Theorem”, which gives some sort of confidence that a genetic algorithm will actually improve the population fitness but there is no proof that it will find the best solution.

In fact there is a great deal of evidence that the genetic algorithm doesn’t produce the best solution in general but the most “robust”. A robust solution is one that isn’t sensitive to small changes in its specification. That is, a robust solution is a practical solution in that you can manufacture it and it will work well even if it isn’t produced exactly to specification.

I suppose the important question is whether or not the genetic algorithm is better than other forms of randomly searching for better units.

This is a difficult question to answer but genetic algorithms do have advantages.

Apart from being simple to implement they also make use of a parallel search of the parameter space. They also have the strange property of increasing “good” bit patterns in the genes at an exponential rate. That is, if particular patterns of bits in the gene are associated with a high level of fitness then the genes in the population very quickly acquire these patterns.

The “crossover” method of breeding is designed to make sure that this happens. In our example of maximizing the square of the value the pattern or “schema” [1****], where the asterisk means a zero or a one, is associated with a high degree of fitness. As the genetic algorithm runs the percentage of units with the schema [1****] increases rapidly. The next best schema is [*1***] and this increases in number rapidly and so on until all of the population has [11111] for their genes!

Going further

After this basic introduction to the genetic algorithm you might well be tempted to go and find out more.

If you do then be prepared to be swamped by biological jargon and countless variations on the basic theme. You will, for example, discover variations on how the genes are combined in reproduction. Many genetic algorithms treat genes as bit patterns that wrap round to make circles and reproduction occurs by selecting two cut points and swapping the smallest part of a pair of “gene rings”.

 

fig2

Circular genes make crossover easier

 

You will also find methods that attempt to make the genetic algorithm more true to the real workings of genetics – recessive genes, diploidy and dominance – in most cases these are reasonable but mostly unnecessary. The fact of the matter is that the basic genetic algorithm works well enough and anything you do to tinker with it makes little difference. What matters is a good coding of the units’ characteristics as genes, the size of population and how many generations you are prepared to let it run to. It is probably worth saying that genetic algorithms have proved useful in fields as widely separated as breeding robots, electronic circuits, optimising routes, breeding investment strategies and so on.

 

fig3

A real genetic algorithm program – WinGA-in action



Last Updated ( Thursday, 18 February 2021 )