Reproducible Randomness
January 24, 2018
A computer generates random numbers in one of two ways: by drawing on an input that is truly random (for instance, having the user move the mouse cursor or type several characters), or by running a pseudo-random algorithm.
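The difference between the two approaches can be sketched with Python's standard library alone. This is only an illustration under stated assumptions: os.urandom stands in for a "truly random" input (it reads from the operating system's entropy source rather than from mouse movement), and the variable names are hypothetical.

```python
import os
import random

# Assumption: os.urandom stands in for a "truly random" input.
# It draws bytes from the operating system's entropy source,
# so the result cannot be replayed on a later run.
entropy_bytes = os.urandom(4)

# A pseudo-random generator is an algorithm: two generators that
# start from the same seed produce exactly the same sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)
seq_a = [rng_a.randint(0, 4) for _ in range(10)]
seq_b = [rng_b.randint(0, 4) for _ in range(10)]

print(len(entropy_bytes))  # 4 bytes, different on every run
print(seq_a == seq_b)      # True
```

The seeded sequences match because the algorithm is deterministic once its starting state is fixed, which is the property the rest of this post relies on.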
NumPy generates random numbers with a pseudo-random algorithm, which lets us produce random numbers that are reproducible between runs. This property is very useful in a machine learning workflow, for example to get the same train/test split or the same initial weights on every run.
In NumPy, reproducibility is achieved by setting the random module's seed value with np.random.seed.
Import Libraries
import numpy as np
Without setting the seed's value, each list will contain different values on every execution.
print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [1 4 3 0 0 2 2 1 3 3]
List 2: [2 3 3 0 2 4 2 4 0 1]
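After setting the seed's value, each list will contain the same values on every execution. Note that List 1 and List 2 still differ from each other, because they are successive draws from the same seeded sequence.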
np.random.seed(42)
print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [3 4 2 4 4 1 2 2 2 4]
List 2: [3 2 4 1 3 1 3 4 0 3]
Resetting the seed to the same value and drawing again reproduces exactly the same two lists.
np.random.seed(42)
print('List 1: ' + str(np.random.randint(5, size=10)))
print('List 2: ' + str(np.random.randint(5, size=10)))
List 1: [3 4 2 4 4 1 2 2 2 4]
List 2: [3 2 4 1 3 1 3 4 0 3]
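Rather than comparing the printed output by eye, the same check can be made in code. The sketch below is one possible way to do it; the helper name draw_lists is hypothetical, and the second half assumes NumPy 1.17 or later, where np.random.default_rng provides a generator object with its own seed instead of relying on the module-wide state.

```python
import numpy as np

def draw_lists(seed):
    """Hypothetical helper: seed the global state, then draw two lists."""
    np.random.seed(seed)
    first = np.random.randint(5, size=10)
    second = np.random.randint(5, size=10)
    return first, second

# Two separate "runs" that start from the same seed.
run_1 = draw_lists(42)
run_2 = draw_lists(42)

same_first = np.array_equal(run_1[0], run_2[0])
same_second = np.array_equal(run_1[1], run_2[1])
print(same_first, same_second)  # True True

# Assumption: NumPy >= 1.17. Each Generator carries its own state,
# so seeding one does not affect any other code that uses np.random.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_generator_draw = np.array_equal(
    rng_a.integers(5, size=10),
    rng_b.integers(5, size=10),
)
print(same_generator_draw)  # True
```

A generator object is often easier to reason about in a larger project, since reproducibility no longer depends on every piece of code sharing and resetting a single global seed.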