Inside Random Numbers

Written by Mike James

Sunday, 19 February 2023

Article Index
Inside Random Numbers
Software Randomness
The Congruential Generator

Page 3 of 3

The Congruential Generator

So what method do we use that satisfies the conditions?

The answer in nearly all cases today is the “congruential random number generator”, CRNG.

All that happens is that you pick a starting value – the seed – and a number to use as the generator.

To get the next random value you multiply the seed by the generator and take the answer mod “a large value”.

Mod is a function that simply rolls over at a given value. For example 7 Mod 6 is 1 because we roll over to 0 at 6.

In the case of a general CRNG the calculation is,

a[i+1]=(K*a[i]+C) MOD P

where K is the generator and a[i+1] is the next random number. P is usually chosen to be a large power of 2 to make the arithmetic easy and K is picked, using some basic requirements and hand tuning to give a good pseudo random sequence.

The first value a[0]is called the seed and is used to start the sequence off. The random number generator in C, rand is of this type and you can set its seed using the srand function. If you don’t call srand before you use rand then the seed is set to one. Notice that this means you get the same set of random numbers every time you use rand with the same seed.

If you don’t specify a seed then 1 is used and rand produces a sequence that is so well known it is listed and recognized by The On-Line Encyclopedia of Integer Sequences. It may have random properties but it is still well known and anyone recognizing it can easily give you any values in the sequence in any order.

The resulting value is suitably pseudo random as long as K is well chosen and it is difficult to predict the next value given any number of earlier values.

In fact given a small sample of the output of a CRNG it is possible, by solving some linear equations, to work out the generator, and hence predict the entire sequence but for most purposes this doesn’t matter. That is it is very easy to work out the values of K, C and P given only three consecutive values. In a security application it most certainly matters but if you are just using the random numbers for simulation or for games then the fact that a complex mathematical procedure can be used to predict the next number isn't relevant.

Congruential generators are good but they have some well known problems. The biggest is that the low order bits tend to be periodic. The solution is to use the most significant bits first i.e. shift right until you have a number in the correct range. They also eventually periodic and they have correlations between elements.

Linear-feedback Shift Registers

Another good method of creating pseudo random numbers is the Linear-feedback Shift register LFSR. This is something every hardware enthusiast and programmer should know about as it is a simple approach to generating noise-like sequences in hardware. The basic idea is that a shift register is used in combination with some XOR gates to generate a sequence. There are two possible arrangements.

The first is usually called a Fibonacci LFSR:

LSFR1

When the register is right shifted the XOR gates provide the input to the first bit (1 XOR 0) XOR 1 = 0 in this case. The output of the register this the least significant bit shifted out. You can vary the positions, the taps, at 4, 6 and 8 that are combined with the output to create different sequences. With a good selection of taps a reasonable pseudo random sequence can be produced. You can also implement the same scheme using a shift left.

The alternative implementation is the Galois LFSR:

LSFR2

In this case we number the states opposite to the direction of the right shift and the tap points are where the XOR feeds a value in. This LFSR has the same tap points as the previous example, 4, 6 and 8 and so it produces the same sequence but offset. An alternative is to simply use a left shift:

LSFR3

The Galois arrangement is easier to implement in software.

There is a mathematical correspondence between an LFSR and a binary polynomial. The positions of the taps are the powers in the polynomial so an LFSR with taps at 8, 6 and 4 give the polynomial:

x⁸ + x⁶ + x⁴ +1

The one is always included and it corresponds to the input bit i.e. x⁰. The properties of the sequence produced can be related to the mathematical properties of the polynomial. An LFSR gives a maximal length cycle if and only if its polynomial is primitive which implies that the number of taps has to be even and they have to be co-prime i.e. not have a common divisor other than 1. All of the primitive polynomials for different numbers of bits have been tabulated.

You can also specify an LFSR by the binary number with one in each of the tap positions. So the LFSR with taps at 4,6, and 8 corresponds to 10101000 or 0xA8. This can be used to implement a Galois LFSR using a left shift in C:

byte = (byte << 1)  ^ (byte & 0x80u ? 0xA8u : 0);

This works because if the output bit is a one you want to XOR the shifted value by 0xA8 which has a one in each tap position and if it is a zero you can XOR the shifted value by zero which has a zero in each tap position.

For an 8 bit shift register a primitive polynomial corresponds to 0xB4.

Functions for random numbers

This is the method used by nearly all programming languages. For example, most languages provide something like a rnd() function which returns a random number, a decimal fraction, in the range 0 to less than 1.

In practice the rnd function generally uses a CRNG to generate integers which it then divides down to give the decimal fraction. Each time you use rnd() the current random value is used in the CRNG to generate the next.

One interesting and useful extra is that there is usually something like randomize(Seed) which can be used to set the initial starting value or seed for the CRNG.

You can use this to generate a repeatable set of random numbers when a game or simulation is being tested because after setting the seed you always get the same sequence of values – a fact which makes many people give up on the entire concept of “random” numbers!

This repeatability may be useful in some situations but by default the rnd function often always starts from the same seed. What this means is that when you start a card game you will always be dealt the same hand!

Clearly this isn’t good and the solution takes us back to the first method suggested for generating random numbers – the timer. There is usually a command which will use some physical value that is unpredictable to seed the generator - usually the current value of the system timer, i.e. the time since the machine was switched on. This usually results in a new sequence of numbers being produced that look as if the deck has been well and truly shuffled.

If you keep in mind that pseudo random numbers are mostly about not being predictable rather than worrying about any deep philosophy of what randomness is then it all becomes much simpler.

One parting thought - if you understand the idea of a pseudo random number then don't be fooled into thinking that creating a random number generator is easy. Even if you think you have found an external source of entropy that you can use you need to know that most such ideas don't work. You need to test the numbers to confirm that they are statistically random.

Creating a good random number generator is difficult.

And finally back to that cartoon.

xkcd

Can one number be random?

The simplest answer is that random is a term that really only applies to a sequence of numbers - a single number is just a number.

dice

Credit: Gringer

Mod function

The Monte Carlo method

Advanced Hashing

How not to shuffle - the Knuth Fisher-Yates algorithm

The Programmer's Guide to Chaos

ERNIE - A Random Number Generator

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

Comments

or email your comment to: comments@i-programmer.info

Data Structures Part II - Stacks And Trees

Part II of our look at data takes us into more sophisticated structures that are fundamental to computing - stacks, queues, deques and trees. If you don't know about these four then you are goin [ ... ]

+ Full Article

IP Addressing and Routing

Every programmer should understand how the Internet works and this means understanding IP addressing and routing. It's a good time to find out about such things with DOS attacks on the rise and IPv6 t [ ... ]

+ Full Article

The Congruential Generator

Linear-feedback Shift Registers

Functions for random numbers

Related Articles

Comments