DIFFERENT BITS
Example:
Original dataset
70, 15, 111, 32, 56, 70, 15, 70, 15, 7, 1, 3, 5, 20, 67, 54,
42, 29, 113, 7, 1, 3, 5
Compressed dataset
a, 111, 32, 56, a, a, b, 20, 67, 54, 42, 29, 113, b
Key
a = 70, 15
b = 7, 1, 3, 5
The more you can compress the data, the more
patterned or predictable it is, the less information it
contains, and the less random the sequence of numbers is.
By contrast, if the data cannot be compressed at all, “if the
smallest program for calculating it is just as large as it is ...
then the data is lawless, unstructured, patternless, not
amenable to scientific study, incomprehensible. In a word,
random, irreducible!” (Chaitin, p. 64)
So randomness is equivalent to information. The more
information a sequence contains, the more difficult it is to
compress, and the closer it is to randomness. In this light,
white noise becomes pure information and a perfect source
— in addition to games of chance — for generating random
sequences. Anywhere noise is present, for example, radio
waves and the atmosphere, we can tap into it to extract
random sets of numbers.
Radioactivity
Something that we thankfully have rare occasion to
contemplate, radiation is an excellent source of random
numbers. Radioactivity can be defined as “The spontaneous
emission of radiation from the nucleus of an unstable atom.
As a result of this emission, the radioactive atom is converted,
or decays, into an atom of a different element that might
or might not be radioactive.” (Defined by the Radiation
Emergency Assistance Center.) For our purposes, this means
that given an unstable atom, there is no way of predicting
when it will lose its extra nucleon resulting in the radioactive
decay. By proxy, if you have multiple radioactive atoms,
there is also no way of predicting the interval between
decays, and it is exactly this combination of random periods
that can be used to generate random strings of binary.
So, we know that randomness comes from dice
throws, card games, coin tosses, noise, and radioactive
decay, and I think we can safely bet that our computers are
not striking up a game of cards each time we request a
FOR YOUR INFO
For more details on radioactive decay as a source of
randomness, check out Hotbits, a company that supplies
random number sequences sourced from radioactive behavior
via the Internet; www.fourmilab.ch/hotbits/.
68 SERVO 07.2008
random number, so how do they do it?
rand() and srand()
Most likely, if you are reading this magazine you
have also done at least a little programming and have
encountered some version of the rand() function or random
class. Almost every contemporary computer language and
platform has had some variant of this, from microcontroller
C to Java to Python. These functions generate what are
known as pseudo-random sequences of numbers. Why
pseudo-random? Well, for one thing, the random number
generator in your computer is actually just an equation.
Most computer languages (including C and Java) contain
a version of the classic algorithm, the linear congruential
generator, as the source behind rand(). A linear congruential
generator works by computing each successive random
number from the previous, starting with a seed, X0. The
seed is generally what you supply to the algorithm by calling
srand(seed_value) before requesting a number from rand().
Here is the formula:
Xn+1 = (aXn + c) mod m
where Xn is the output set of random numbers
m is the modulus value
a is the multiplier value
c is the increment value and X0 is the initial seed value
(ex. supplied by srand)
In Ansi C, for example, m = 232, a = 1103515245, and
c = 12345. The number returned by the rand function is
actually drawn from only bits 30: 16 of the output result of
the function.
Similarly, Java uses the linear congruential generator
with an a value of 25,214,903,917 and a c value of 11.
Problems
One of the defining characteristics of pseudo-random
number generators is that given the same seed, they will
always produce exactly the same series of numbers as output.
This is very useful when you need a sequence of data that
appears random and is the same each time, for example, in
rendering computer graphics, but it is an annoyance for most
of the applications we have implemented in this column.
The other problem with linear congruential generators
is that they are not terribly random. The longest random
sequence it is possible to generate is the length of the
chosen modulus value (232 in C) before it begins to repeat
from the beginning; a characteristic referred to as the
period of the generator. Additionally, it has a characteristic
called serial correlation which means that there are
predictable patterns in the data. To return to our discussion
of algorithmic information theory above, this means that
the data could be compressed; it is not 100% random.
In comparison, the Python programming language uses