The Genetic Algorithm
Written by Mike James
Saturday, 23 July 2022
Article Index
The Genetic Algorithm
Knapsack problem and genes
Evaluating fitness
Cosmetic details and improvements

## The blind knapsack problem

The knapsack problem is very easy to explain. You are presented with n objects each weighing a given, usually different ,amount and have to select a group of objects that gets closest to, but without exceeding, the total weight limit for the knapsack.

In other words the knapsack problem is about efficiently packing a knapsack so that it can just still be carried. These days the knapsack problem might be better called the luggage allowance problem!

If you know the weight of each item and the weight limit then there are fairly standard methods of selecting the items to take. The blind knapsack problem is more  difficult because you are not told the weights or the weight limit just how efficient any particular packing of the knapsack is.

You can pick a group of objects and some all-seeing observer will tell you how close to the limit you are. In the light of this information you can reject objects that you have selected and select new ones in an effort to improve your packing. Of course this isn't as easy as it sounds because you don't know the weights of any of the items!

## A gene for packing

The blind knapsack problem lends itself to a genetic algorithm solution because it is very simple to construct a gene that corresponds to a particular packing of the knapsack.

If each object is represented by a bit in a bit string, 1 for packed and 0 for not packed then the genetic algorithm can be applied to a population of such strings.

For example, the string 10100 means pack the fifth and third object and leave all the others out (counting the bits from right to left).

Once you have a suitable representation for the problem the next requirement is for a measure of fitness of each member of the population. In the case of the blind knapsack problem this is simply the weight of the packing which can be conveniently expressed as a percentage of the weight limit.

So for example if the fitness of 10100 is 50% this means that 50% of the weight allowance is unused. The only problem is that this measure of fitness doesn't take into account packings that go over the weight limit. For example, the packing 11111, i.e. take everything, might use 870% of the weight allowance but this is not a suitable measure of its fitness because it is well over the limit.

The simplest solution to this problem is to zero the fitness of any packing over the limit. In other words any genes for overweight packings will never make it to the next generation!

In more general problems it might be necessary to reduce the measurement of fitness of genes that lead to illegal results by an amount proportional to the degree that they break the rules (e.g. how much over the weight limit they are). The reason is that if the problem is difficult then it might be hard to find a legal result, let alone a best legal result. In this case any information about how well you are doing even, if the results are not actually acceptable, can be used to select for genes that work their way toward a legal result.

## The genetic algorithm applied

The stages of the genetic algorithm are:

1. generate a population of genes at random
2. evaluate the fitness of each member of the population
3. select members from the population with a probability proportional to their fitness for a breeding pool
4. select pairs from the breeding pool at random and perform a cross over operation at random to give the new population

Repeat stages 2 through 4 until you have a satisfactory population of solutions.

The details of how to implement the genetic algorithm are easy enough,

The implementation language is C# but it should be easy to convert the code to any language you care to choose.

If each gene is stored as a string of 0 and 1 characters then a population can be stored in a string array. This may not seem like a sensible choice for this problem - for example why not use a binary array - but it generalizes to genes that have more symbols than 0 and 1.

The population is stored in the string array population, the corresponding weights in weight, and the fitness of each individual in fitness:

`Random Rnd= new Random();double[] weight;String[] population;double[] fitness;Boolean run = false;`

We also need a random number generator and a run indicator.

No attempt has been made to take an object-oriented approach.

`private double[] makeProblem(int n)  {             //generate a knapsack problem //at random double[] weight = new double[n]; for (int i = 0; i < n; i++){  weight[i] = Rnd.NextDouble() * 50;     }    return weight;        } `

Method makePop  generates an initial random population of genes using makeGene to randomly generate a single gene of length n.

`private String[] makePop(int p, int n){ //make a random population of //n bit strings //total size of population //is p String[] pop = new String[p]; for (int i = 0; i < p; i++) {  pop[i] = makeGene(n); } return pop;}`

makeGene is simply a matter of putting together a random string of zeros and ones:

`private String makeGene(int n){ //make a random n bit  StringBuilder g = new StringBuilder(""); char c; for (int i = 0; i < n; i++) { c = '0'; if (Rnd.NextDouble() > .5) { c = '1'; } g.Append(c); } int t = g.Length; return g.ToString();}`

Last Updated ( Saturday, 23 July 2022 )