In computer science and discrete mathematics a graph is a set of points – called vertices or nodes and a set of connecting lines called paths, arcs or edges. This definition of a graph includes the familiar temperature charts etc but only as a special case. In this sense a graph is anything from a hierarchy chart to a computer network.

*A simple graph of vertices and arcs*

Graphs are important in computer science as a basic tool of theory and they are of practical importance. Before we can go on to explain why we need some additional definitions:

- A path is a route from one vertex to another via arcs.
- If there is a path between every pair of vertices the graph is a "connected graph".
- A path from a given vertex and back again is called a cycle.

*A connected graph showing a path and a cycle*

Now we can define a tree – a tree is a connected graph with no cycles. There are many equivalent definitions of a tree including the one that gives it its name – a root node connected to leaf nodes by branches.

*Check that this is a connected graph with no cycles – aka a tree!*

A slightly more advanced form of graph is the network. A network is a graph which has weights assigned to each of its paths. You can think of each weight as either being a distance or a cost associated with the path. With networks you can ask questions such as "find the shortest path", "find the shortest round trip" and so on.

One important version of this "shortest" type of question is – what is the shortest connector. That is, the sub-graph with the shortest total distance that connects all of the vertices.

It doesn't take much to see the shortest sub-graph that connects all of the vertices is going to be a tree because if it contains any cycles you can get a shorter graph by deleting at least one arc without altering the connectivity. As a result this problem is often called finding the "minimum spanning tree", MST.

Why is the MST important?

Consider the situation that you have to connect three computer sites into a network. All you care about is that every computer is connected and the cost of connection is proportional to distance. In this case the minimum spanning tree is going to be a good starting point for a practical implementation of the network.

Another, slightly more esoteric, example of the usefulness of an MST is that it provides an upper bound to the travelling salesman problem. The travelling salesman's round route has to be shorter than twice the MST for the network.

<ASIN:1584883960>

<ASIN:0201000237>

How do you find a minimum spanning tree given a network? It turns out that there are two general algorithms – Prim's and Kruskal's. Of the two Prim's is the easier to implement and to understand, so it makes a very good starting place to understand any graph algorithm. It is also an excellent example of a "greedy" algorithm, i.e one that always takes the best option at each step, that actually works!

This algorithm was developed in 1930 by Czech mathematician Vojtěch Jarník and independently in 1957 by computer scientist Robert C. Prim and was rediscovered by Edsger Dijkstra in 1959:

0) Place all of the vertices in an "excluded" set and initialise an "included" set to be empty.

1) Select any vertex as the starting vertex of the tree. Place this vertex in the "included" set.

2) Find a vertex which is closest to any vertex in the "included" set. If there is more than one such vertex select one at random.

3) Repeat 1 and 2 until all of the vertices are in the included set and the excluded set is empty.

The result is a minimum spanning tree – as long as you remember to store which path between which pair of nodes was the shortest distance at each step!

The above may seem simple but there are some interesting problems in implementing any graph algorithm using C#.

The first is how are we to represent a graph or network. The simplest solution is to use a two-dimensional array to record the distance between each node. For example:

Network[a,b];

would be used to store the distance between vertex a and vertex b. Notice that this matrix is symmetric and the diagonal is zero because the distance between "a" and "a" is zero.

The next problem is how to generate a network to try the algorithm out on?

In a real world problem the network would be entered as data to the array but it is a common requirement to generate valid examples of any data structure that a program is working with.

In this case the problem isn't difficult in that all we have to do is generate a set of random numbers between particular limits and make sure that the array is symmetrical and has a zero diagonal. This would produce a fully connected network which is difficult to represent graphically. To produce something that had fewer connections per vertex it would be necessary to set to zero entries picked at random.

While on the subject of representing the network graphically it is worth commenting that the distances in the network matrix don't have to be Euclidean distances. For example they might well be costs or subjective judgements. The algorithm works even if the distances aren't Euclidean but you can't draw the vertices in a 2D space with the given distances between them. This is a problem only if we want a to produce a graphical display – but in this case we do!

The solution is to work the other way around. First generate the vertices as random points in a 2D space and then work out the distances between them to store in the network matrix. If we also keep the locations of the points then drawing the network is even easier.

I did consider implementing a fully object-oriented version of the algorithm but this seems to go against the idea of a "classic" algorithm. The object-oriented approach works well when the problem breaks down into natural groupings of methods and property but it is often less good at representing algorithms. In this case the focus is on the algorithm, i.e. exactly how the code implements the idea, and this is best achieved by breaking the algorithm down into functions.

Starting with a new C# WPF project we need to add a button labelled "Generate Network" and a Canvas object. A Canvas object is often a good thing to use when you need to draw using x,y co-ordinates but there are other ways of doing the same job.

Moving to the code. We need an array to hold the co-ordinates of the points:

private Point[] Positions =

new Point[size];

size is a constant that sets the size of the network in number of vertices.

const int size = 10;

The array to hold the network data can be defined as

private Single[,] Network =

new Single[size, size];

We are also going to need a random number generator throughout the program so creating a global object is a good idea:

private Random R = new Random();

We can now write a subroutine that fills both Position and Network with random values:

private void setnet(

Single[,] Net,Point[] Pos)

{

int maxlength = (int)(Math.Min(

canvas1.Width,

canvas1.Height) * 0.9);

int minlength = maxlength / size;

for (int i = 0; i < size; i++)

{

Pos[i].X =

R.Next(minlength, maxlength);

Pos[i].Y =

R.Next(minlength, maxlength);

for (int j = 0; j <= i; j++)

{

Net[i, j] = distance(Pos[i], Pos[j]);

Net[j, i] = Net[i, j];

if (i == j) Net[i, j] = 0;

}

}

}

The points are generated randomly with co-ordinates between minlength and maxlength which are derived from the size of the Canvas object.

Notice that the distance array is made to be symmetric and the diagonal is also set to zero. The distance between points is computed using an additional function.

The distance function is simply the usual Euclidean formula:

private Single distance(Point a, Point b)

{

return (Single)Math.Sqrt((a.X - b.X)*

(a.X - b.X) + (a.Y - b.Y) *

(a.Y - b.Y));

}

Recall that C# has no power operator and simply multiplying things together is better than having to use the Maths object to square a value.

After generating the network it would be nice to view it. A drawing routine is quite easy – as long as you don't worry too much about efficiency. We can simply use Line objects to draw lines on the Canvas.

First we clear the Canvas of any child nodes it may already contain:

private void shownet(Single[,] Net)

{

canvas1.Children.Clear();

Next we create a new Line object for each pair of points that have a non-zero distance between them:

for (int i = 0; i < size; i++)

{

for (int j = 0; j < i; j++)

{

if (Net[i, j] != 0)

{

myLine = new Line();

myLine.Stroke = Brushes.Black;

myLine.X1 = Positions[i].X;

myLine.X2 = Positions[j].X;

myLine.Y1 = Positions[i].Y;

myLine.Y2 = Positions[j].Y;

myLine.StrokeThickness = 1;

canvas1.Children.Add(myLine);

}

}

}

Next we can plot suitable marker shapes at the position of each node:

Rectangle myMarker;

for (int i = 0; i < size; i++)

{

myMarker = new Rectangle();

myMarker.Stroke =Brushes.Black;

myMarker.Fill = Brushes.Red;

myMarker.Height = 10;

myMarker.Width = 10;

myMarker.SetValue(

Canvas.TopProperty,

Positions[i].Y - myMarker.Height / 2);

myMarker.SetValue(

Canvas.LeftProperty,

Positions[i].X - myMarker.Width / 2);

canvas1.Children.Add(myMarker);

}

}

Notice the way the that the position of the rectangle is set using an attached property on the Canvas object. This is a feature of WPF that beginners often find difficult. Attached properties are a good idea from the implementation point of view but they often don't help with readability and clarity. You might also ask why the Line object sets its own position without the need for an attached property and yet most of the other WFP shapes can't.

To try these out place a button on the form, if you already haven't and enter the following as the button's click routine:

private void button1_Click(

object sender,

RoutedEventArgs e)

{

canvas1.Width = 600;

canvas1.Height = 600;

setnet(Network, Positions);

shownet(Network);

}

You can now try the program out and see if it really does generate reasonable graphs. It should! You can size the canvas control to make best use of the space and the graph will adjust itself to fill it, more or less. It is fun just to see the range of graphs that can be produced.

*A typical random graph with ten vertices*

<ASIN:1565924533>

<ASIN:1584883960>

Now we have to move on to implement Prim's algorithm proper. As well as just implementing the method, an animated display will be used to show the paths as they are being considered for inclusion. Place a second button on the form and change its caption to "Find MST". It's click routine is simply

private void button2_Click(

object sender, RoutedEventArgs e)

{

prims();

}

The work is all done by the prims function.

The first thing we need are two arrays to store the list of vertices that have been included and are still to be processed:

void prims()

{

int[] included = new int[size];

int[] excluded = new int[size];

We also need another network array to hold the distances that form the minimum spanning tree:

Single[,] finished =

new Single[size, size];

Two integer variables are used to hold the start and finish node numbers of each path added to the tree:

int start = 0;

int finish = 0;

The two arrays need to be initialised:

for (int i = 0; i < size; i++)

{

excluded[i] = i;

included[i] = -1;

}

At the start of the algorithm all of the vertices are excluded, i.e. vertices 0 to size-1, and none of the vertices are included, i.e. all entries -1. Notice that -1 is used to flag the fact that there is no entry in the array.

The first point in the MST is selected at random:

included[0] = excluded[R.Next(size)];

excluded[included[0]] = -1;

Notice that the vertex that has been added to "included" has to be removed from "excluded" by setting that element to -1.

At last we can start the loop that finds the closest vertex to the currently included set and adds it to "included".

for (int n = 1; n < size; n++)

{

closest(n, ref start, ref finish,

included, excluded);

where *start* and *finish* are the indices of the vertices of the path that is to be added to the tree. That is *finish *is the closest of the currently excluded vertices and *start* is the vertex in the included set it is closest to - adding the arc from *start *to *finish *to the MST.

To be clear, the function closest finds the closest vertex listed in the excluded array to any of the vertices listed in the included array.

The end point of the path is the new vertex that has to be added to the included set, i.e the vertex indexed by *start *is already in the included array:

included[n] = excluded[finish];

and hence removed from the excluded set:

excluded[finish] = -1;

The distance associated with the path also has to be added to the "finished" array to indicate that it has to be drawn as part of the MST:

finished[included[n], included[start]] =

Network[included[n], included[start]];

and the finished array has to be made symmetric just in case someone wants to make use of the upper triangle rather than the lower:

finished[included[start], included[n]] =

Network[included[start], included[n]];

}

and finally we can show the finished MST:

shownet(finished);

}

All that is needed to complete the program is the "closest" function:

private void closest(int n,

ref int start, ref int finish,

int[] included, int[] excluded)

{

Single smallest = -1;

for (int i = 0; i < n; i++)

{

for (int j = 0; j < size; j++)

{

if (excluded[j] == -1) continue;

if (smallest == -1) smallest =

Network[included[i], excluded[j]];

if (Network[included[i], excluded[j]]

> smallest) continue;

smallest =

Network[included[i], excluded[j]];

start = i;

finish = j;

}

}

}

This is essentially a search loop for the smallest value but it has to take into account whether the vertex is valid or not. A vertex that isn't in the excluded list isn't valid and one with a zero distance isn't connected to the node in question and once again isn't valid. Apart from this complication each node in the excluded list is used to find the smallest distance.

If you put all of this together and run the program you will discover that not only can you generate random graphs but also their MSTs.

You can learn a great deal by watching it. If you press the "Find MST" button again the whole process is repeated from a different starting point. Notice that the MST isn't unique so if you discover that starting from a different vertex gives you a different MST this isn't an error.

*A minimum spanning tree.*

You can add additional facilities to this program quite easily. For example, a display of the total length of the MST would be interesting and a slider control to modify the number of arcs would also be worth the effort. You could also animate the display to show the arcs being selected one at a time.

If you would like the code for this project then register and click on CodeBin.

{loadposition moreprojectsART}

<ASIN:1584885491>

]]>