Reproducible Randomness

January 24, 2018

A computer generates random numbers in one of two ways: by sampling input that is truly random (for instance, asking the user to move the mouse cursor or type several characters), or by running a pseudo-random algorithm that produces a deterministic sequence which only appears random.

NumPy generates random numbers with a pseudo-random algorithm, which means the same sequence can be reproduced between runs. This property is very useful in a machine learning workflow, where results should be repeatable.
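To see why a pseudo-random algorithm is reproducible, here is a minimal sketch of a toy linear congruential generator. It is only an illustration and is not the algorithm NumPy uses; the function name lcg and the constants a, c and m are assumptions chosen for this example.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Toy linear congruential generator (illustrative only, not NumPy's algorithm)."""
    state = seed
    values = []
    for _ in range(count):
        # Each new state depends only on the previous state,
        # so the whole sequence is fixed once the seed is chosen.
        state = (a * state + c) % m
        values.append(state)
    return values

first_run = lcg(seed=42, count=10)
second_run = lcg(seed=42, count=10)
other_seed = lcg(seed=43, count=10)

print(first_run == second_run)  # same seed, same sequence
print(first_run == other_seed)  # different seed, different sequence
```

Because the starting state fully determines everything that follows, reusing a seed replays the sequence exactly.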

In NumPy, reproducible output is obtained by setting the random module’s seed value with np.random.seed.


Import Libraries

import numpy as np


Without setting the seed’s value, each list will contain different values every time the code is executed.

print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [1 4 3 0 0 2 2 1 3 3]
List 2: [2 3 3 0 2 4 2 4 0 1]


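After setting the seed’s value, each list contains the same values on every execution. List 1 and List 2 still differ from each other, because they are consecutive draws from the same sequence, but resetting the seed to the same value restarts that sequence, as the repeated run shows.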

np.random.seed(42)
print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [3 4 2 4 4 1 2 2 2 4]
List 2: [3 2 4 1 3 1 3 4 0 3]
np.random.seed(42)
print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [3 4 2 4 4 1 2 2 2 4]
List 2: [3 2 4 1 3 1 3 4 0 3]
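Rather than comparing the printed lists by eye, the match can be checked in code. The sketch below reseeds between two draws and compares them with np.array_equal. The second half is an assumption about your environment: it uses np.random.default_rng, which is only available in NumPy 1.17 or later, to show the same idea with a generator object that does not touch the global seed.

```python
import numpy as np

# Draw, reseed with the same value, and draw again.
np.random.seed(42)
first = np.random.randint(5, size=10)

np.random.seed(42)  # restarts the global sequence
second = np.random.randint(5, size=10)

print(np.array_equal(first, second))  # the two draws match

# Assumption: NumPy >= 1.17. Two independent generators built
# from the same seed produce the same values.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
draw_a = rng_a.integers(5, size=10)
draw_b = rng_b.integers(5, size=10)

print(np.array_equal(draw_a, draw_b))
```

A local generator can be passed around explicitly, so seeding it does not affect any other code that relies on np.random.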