Page 1 of 3
Finite state machines may sound like a very dry and boring topic but they reveal a lot about the power of different types of computing machine. Every Turing machine includes a finite state machine so there is a sense in which they come first. They also turn out to be very useful in practice.
The simplest type of computing machine that is worth considering is called a ‘finite state machine’.
As it happens, the finite state machine is also a useful approach to many problems in software architecture, only in this case you don’t build one you simulate it.
Essentially a finite state machine consists of a number of states – finite naturally! When a symbol, a character from some alphabet say, is input to the machine it changes state in such a way that the next state depends only on the current state and the input symbol.
Notice that this is more sophisticated than you might think because inputting the same symbol doesn’t always produce the same behaviour or result because of the change of state.
- The new state depends on the old state and the input.
What this means that the entire history of the machine is summarized in its current state. All that matters is the state that it is in and not how it reached this state. Before you write off the finite state machine as so feeble as to be not worth considering as a model of computation it is worth pointing out that as you can have as many states as you care to invent the machine can record arbitrarily long histories. All you need is a state for each of the possible past histories and then the state that you find the machine in is an indication of not only its current state but how it arrived in that state.
Because a finite state machine can represent any history and a reaction, by regarding the change of state as a response to the history, it has been argued that it is a sufficient model of human behaviour i.e. humans are finite state machines.
If you know some probability theory you will recognize a connection between finite state machines and Markov chains. A Markov chain sums up the past history in terms of the current state and the probability of transition to the next state only depends on the current state. The Markov chain is a sort of probabilistic version of the finite state machine.
Representing Finite State Machines
You can represent a finite state machine in a form that makes it easier to understand and think about.
All you have to do is draw a circle for every state and arrows that show which state follows for each input symbol.
For example, the finite state machine in the diagram below has three states. If the machine is in state 1 then an A moves it to state 2 and a B moves it to state 3.
A three-state finite state machine
This really does make the finite state machine look very simple and you can imagine how as symbols are applied to it how it jumps around between states.
What is the point of such a simple machine?
There are two good reasons for being interested in finite state machines. The first is practical. As mentioned earlier, there are some practical applications which are best modelled as a finite state machine.
For example, many communications protocols, such as USB can be defined by a finite state machine’s diagram showing what happens as different pieces of information are input. You can even write or obtain a compiler that will take a finite state machine’s specification and produce code that behaves correctly.
Many programming problems are most easily solved by actually implementing a finite state machine. You set up an array or other data structure which stores the possible states and you implement a pointer to the location that is the current state. Each state contains a lookup table that shows what the next state is given an input symbol. When a symbol is read in your program simply has to look it up in the lookup table and move the pointer to the new state.
The practical uses of finite state machines is reason enough to be interested in them. Every programmer should know about finite state machines and shouldn't be afraid of implementing them as solutions to problems.
However the second good reason is perhaps more important - but it does depend on your outlook. Finite state machines are important because they allow us to explore the theory of computation. They help us discover what resources are needed to compute particular types of problem. In particular finite state machines are deeply connected with the idea of grammars and languages that follow rules.
If you define two of the machine’s states as special – a starting and a finishing state – then you can ask what sequence of symbols will move it from the starting to the finishing state.
Any sequence that does this is said to be ‘accepted’ by the machine. Equally you can think of the finite state machine as generating the sequence by outputting the symbols as it moves from state to state. That is a list of state changes obeyed in order, from the start to the finish state, generates a particular string of symbols. Any string that can be generated in this way will also be accepted by the machine.
The point is that the simplicity or complexity of a sequence of symbols is somehow connected to the simplicity or complexity of the finite state machine that accepts it.
So we now have a way to study sequences of symbols and ask meaningful questions.
As a simple example consider the finite state machine given earlier with state 1 as start and state 3 as finish – what sequences does it accept? Assuming that A and B are the only two symbols available, it is clear from the diagram that any sequence like BABAA is accepted by it.
A finite machine accepts a set of sequences
In general the machine will accept all sequences that can be described by the computational grammar
1) <null> -> B<S1>|A#
2) <S1> -> A<S2>
3) <S2> -> B<S1>|A#
The only new features are the use of <null> to specify the starting state and the use of # to specify the final state. You can have many hours of happy fun trying to prove that this grammar parses the same sequences as the finite state machine accepts.
To see that it is it does just try generating a sequence:
- start with <null> and apply rule 1 to get B<S1>
- use rule 2 to get BA<S2>
- use rule 3 to get BAB<S1>
You can carry on using rule 2 and 3 alternately until you get bored and decide to use the A# alternative of rule 3 giving something like BABABABAA#.