Random Number Generation
Current best practice for reproducible science in machine learning — including, but not limited to, recommender systems — is to use fixed random seeds so results can be reproduced precisely. This is useful both for reproducing the results themselves and for debugging.
To test for seed sensitivity, the entire experiment can be rerun with a different random seed and the conclusions compared.
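As a minimal sketch of this workflow, the example below uses plain NumPy rather than LensKit's own functions. The `run_experiment` function is a hypothetical stand-in for a full experiment; it only draws a random sample and reports its mean.

```python
import numpy as np


def run_experiment(seed):
    """Hypothetical stand-in for an experiment: returns a noisy metric."""
    rng = np.random.default_rng(seed)
    # Simulate a metric computed from randomly sampled data.
    sample = rng.normal(loc=0.5, scale=0.1, size=1000)
    return float(sample.mean())


# Same seed: the result is reproduced exactly.
first = run_experiment(42)
second = run_experiment(42)
assert first == second

# Different seeds: rerun and compare to gauge seed sensitivity.
results = [run_experiment(seed) for seed in (1, 2, 3, 4, 5)]
spread = max(results) - min(results)
print(f"metric range across seeds: {spread:.4f}")
```

If the spread across seeds is small relative to the effect being reported, the conclusion is unlikely to be an artifact of one particular seed.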
LensKit is built to support this experimental design, making consistent use of configurable random number generators throughout its algorithm implementations. When run against NumPy 1.17 or later, it uses the new numpy.random.Generator and numpy.random.SeedSequence facilities to provide consistent random number generation and initialization. LensKit is compatible with older versions of NumPy, but the RNG reproducibility logic will not fully function, and some functions will not work.
Note
For fully reproducible research, including random seeds and the use thereof, make sure that you are running on the same platform with the same versions of all packages (particularly LensKit, NumPy, SciPy, Pandas, and related packages), and are using at least NumPy 1.17. LensKit manages state for older versions of NumPy on a best-effort basis.
Developers using LensKit will be primarily interested in the init_rng() function, which initializes LensKit's random seed. LensKit components that use randomization also take an rng option, usually in their constructor, to set the seed on a per-operation basis. If the script is straightforward and performs LensKit operations in a deterministic order (e.g. does not train multiple models in parallel), initializing the global RNG is sufficient.
Developers writing new LensKit algorithms that use randomization will also need to pay attention to the rng() function, along with derivable_rng() and derive_seed() if predictions or recommendations, not just model training, require random values. Their constructors should take a parameter rng_spec to specify the RNG initialization.
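The constructor convention can be sketched as follows. This is an illustration only: `RandomRecommender` is a hypothetical class, and `numpy.random.default_rng` stands in for LensKit's rng() function so that the example runs with NumPy alone.

```python
import numpy as np


class RandomRecommender:
    """Hypothetical component that picks items at random."""

    def __init__(self, n_items, rng_spec=None):
        self.n_items = n_items
        # Stand-in for LensKit's rng(): accepts None, an int seed, a
        # SeedSequence, or an existing Generator.
        self.rng = np.random.default_rng(rng_spec)

    def recommend(self, n):
        # Sample n distinct item indices without replacement.
        return self.rng.choice(self.n_items, size=n, replace=False)


a = RandomRecommender(100, rng_spec=7).recommend(5)
b = RandomRecommender(100, rng_spec=7).recommend(5)
assert (a == b).all()  # same spec, same recommendations
```

Taking the spec in the constructor, rather than reading a global generator at call time, keeps each component reproducible even when several components are created in a different order.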
Seeds
LensKit random number generation starts from a global root seed, accessible with get_root_seed(). This seed can be initialized with init_rng().

lenskit.util.random.init_rng(seed, *keys, propagate=True)
Initialize the random infrastructure with a seed. This function should generally be called very early in the setup.
 Parameters
seed (int or numpy.random.SeedSequence) – The random seed to initialize with.
keys – Additional keys, to use as a spawn_key on NumPy 1.17. Passed to derive_seed().
propagate (bool) – If True, initialize other RNG infrastructure. This currently initializes: np.random.seed(). If propagate=False, LensKit is still fully seeded; no component included with LensKit uses any of the global RNGs, and they all use RNGs seeded with the specified seed.
 Returns
The random seed.

lenskit.util.random.derive_seed(*keys, base=None, none_on_old_numpy=False)
Derive a seed from the root seed, optionally with additional seed keys.
 Parameters
keys (list of int or str) – Additional components to add to the spawn key for reproducible derivation. If unspecified, the seed's internal counter is incremented (by calling numpy.random.SeedSequence.spawn()).
base (numpy.random.SeedSequence) – The base seed to use. If None, uses the root seed.
none_on_old_numpy (bool) – If True, return None instead of raising NotImplementedError if running on an old version of NumPy.
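The two derivation modes described above can be illustrated with NumPy's SeedSequence directly. This is a sketch of the idea, not LensKit's implementation: `derive` and `key_to_int` are hypothetical helpers, and hashing string keys with `zlib.crc32` is an assumption made for the example.

```python
import zlib

import numpy as np


def key_to_int(key):
    """Hypothetical key conversion: ints pass through, strings are hashed."""
    if isinstance(key, str):
        # crc32 is stable across runs, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf8"))
    return int(key)


def derive(root, *keys):
    """Sketch of keyed derivation from a root SeedSequence."""
    if keys:
        spawn_key = root.spawn_key + tuple(key_to_int(k) for k in keys)
        return np.random.SeedSequence(root.entropy, spawn_key=spawn_key)
    # No keys: spawn() advances the root's internal child counter.
    return root.spawn(1)[0]


root = np.random.SeedSequence(42)
s1 = derive(root, "split", 3)
s2 = derive(root, "split", 3)
# Same keys give the same derived seed state.
assert (s1.generate_state(4) == s2.generate_state(4)).all()

# Unkeyed derivations differ from call to call.
c1, c2 = derive(root), derive(root)
assert c1.spawn_key != c2.spawn_key
```

Keyed derivation depends only on the root seed and the keys, so it gives the same result regardless of call order; unkeyed derivation depends on how many seeds have already been spawned.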

lenskit.util.random.get_root_seed()
Get the root seed.
 Returns
The LensKit root seed.
 Return type
Random Number Generators
These functions create actual RNGs from the LensKit global seed or a user-provided seed. They can produce both new-style numpy.random.Generator RNGs and legacy numpy.random.mtrand.RandomState; the latter is needed because some libraries, such as Pandas and scikit-learn, do not yet know what to do with a new-style RNG.
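For readers unfamiliar with the two interfaces, the following NumPy-only sketch builds both kinds of RNG from one seed. The choice of bit generators (PCG64 and MT19937) is an assumption for illustration; it does not describe which ones LensKit uses internally.

```python
import numpy as np

seed = np.random.SeedSequence(20)

# New-style generator, NumPy 1.17 and later.
gen = np.random.Generator(np.random.PCG64(seed))

# Legacy RandomState for libraries that expect the older interface.
legacy = np.random.RandomState(np.random.MT19937(seed))

print(gen.integers(0, 10, size=3))    # new-style API
print(legacy.randint(0, 10, size=3))  # legacy API
```

The two objects expose different method names and produce different streams even from the same seed, so results are only comparable between runs that use the same kind of RNG.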

lenskit.util.random.rng(spec=None, *, legacy=False)
Get a random number generator. This is similar to sklearn.utils.check_random_state(), but it usually returns a numpy.random.Generator instead.
 Parameters
spec – The spec for this RNG. Can be any of the following types:
int
None
numpy.random.mtrand.RandomState
legacy (bool) – If True, return numpy.random.mtrand.RandomState instead of a new-style numpy.random.Generator.
 Returns
A random number generator.
 Return type

lenskit.util.random.derivable_rng(spec, *, legacy=False)
Get a derivable RNG, for use cases where the code needs to be able to reproducibly derive sub-RNGs for different keys, such as user IDs.
 Parameters
spec – Any value supported by the spec parameter of rng(), in addition to the following values:
the string 'user'
a tuple of the form (seed, 'user')
Either of these forms will cause the returned function to re-derive new RNGs.
 Returns
A function taking one (or more) key values, like derive_seed(), and returning a random number generator (the type of which is determined by the legacy parameter).
 Return type
function
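The idea of a derivable RNG can be sketched with NumPy alone. The `make_derivable` helper and its `per_key` flag are hypothetical and only approximate the behavior described above; integer keys are assumed.

```python
import numpy as np


def make_derivable(seed, per_key):
    """Hypothetical sketch: return a function mapping keys to RNGs."""
    root = np.random.SeedSequence(seed)
    shared = np.random.default_rng(root)

    def derive(*keys):
        if not per_key:
            # Fixed mode: every key shares one generator.
            return shared
        # Derivable mode: a fresh generator determined by the keys alone.
        child = np.random.SeedSequence(root.entropy, spawn_key=tuple(keys))
        return np.random.default_rng(child)

    return derive


for_user = make_derivable(42, per_key=True)
# User 17 gets the same draws no matter which users were served before.
x = for_user(17).random(3)
_ = for_user(99).random(3)
y = for_user(17).random(3)
assert (x == y).all()
```

Deriving per key makes recommendations for one user independent of the order in which other users are processed, which matters when requests are served in parallel or in a different order between runs.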