Utility Functions

These utility functions are useful for data processing.

Miscellaneous

Miscellaneous utility functions.

lenskit.util.log_to_stderr(level=20)

Set up the logging infrastructure to show log output on sys.stderr, where it will appear in the IPython message log.
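The default level=20 is the numeric value of logging.INFO. As a rough illustration of what this kind of setup involves, the sketch below uses only the standard logging module. The helper name setup_stderr_logging and the message format are hypothetical, and this is not LensKit's implementation.

```python
import logging
import sys


def setup_stderr_logging(level=logging.INFO):
    """Hypothetical stand-in: send log records at `level` or above to sys.stderr."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("[%(levelname)7s] %(name)s %(message)s"))

    root = logging.getLogger()
    root.addHandler(handler)
    # Set the root logger to the same level so records at `level` reach the handler.
    root.setLevel(level)
    return handler


handler = setup_stderr_logging()
logging.getLogger("example").info("this message goes to stderr")
```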

lenskit.util.log_to_notebook(level=20)

Set up the logging infrastructure to show log output in the Jupyter notebook.

class lenskit.util.Stopwatch(start=True)

Bases: object

Timer class for recording elapsed wall time in operations.
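The general pattern of a wall-time stopwatch can be sketched with time.perf_counter. The class below is a hypothetical stand-in named SimpleStopwatch; only the start=True constructor argument is taken from the signature above, and the method names start, stop, and elapsed are assumptions made for illustration.

```python
import time


class SimpleStopwatch:
    """Hypothetical wall-time timer; mirrors only the start=True constructor argument."""

    def __init__(self, start=True):
        self.start_time = None
        self.stop_time = None
        if start:
            self.start()

    def start(self):
        self.start_time = time.perf_counter()
        self.stop_time = None

    def stop(self):
        self.stop_time = time.perf_counter()

    def elapsed(self):
        """Seconds since start, up to stop() if it was called, else up to now."""
        if self.start_time is None:
            return 0.0
        end = self.stop_time if self.stop_time is not None else time.perf_counter()
        return end - self.start_time


timer = SimpleStopwatch()
sum(range(100_000))  # some work to time
timer.stop()
print(f"took {timer.elapsed():.4f}s")
```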

lenskit.util.derivable_rng(spec)

Get a derivable RNG, for use cases where the code needs to be able to reproducibly derive sub-RNGs for different keys, such as user IDs.

Parameters:

spec

Any value supported by the seed parameter of seedbank.numpy_rng(), in addition to the following values:

  • the string 'user'

  • a tuple of the form (seed, 'user')

Either of these forms will cause the returned function to re-derive new RNGs.

Returns:

A function taking one (or more) key values, like derive_seed(), and returning a random number generator.

Return type:

function
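The idea of deriving reproducible sub-RNGs from keys can be illustrated without LensKit. The sketch below deliberately swaps in the standard library's hashlib and random modules instead of seedbank and NumPy, so it shows the concept only. The name make_derivable_rng is hypothetical.

```python
import hashlib
import random


def make_derivable_rng(root_seed):
    """Hypothetical helper: return a function mapping key values to reproducible RNGs."""

    def derive(*keys):
        # Hash the root seed together with the keys to get a stable per-key seed.
        material = repr((root_seed, keys)).encode("utf-8")
        digest = hashlib.sha256(material).digest()
        return random.Random(int.from_bytes(digest[:8], "big"))

    return derive


rng_for = make_derivable_rng(42)
a = rng_for("user-17").random()
b = rng_for("user-17").random()
c = rng_for("user-18").random()
print(a == b, a == c)
```

The same key always yields an RNG that produces the same stream, while different keys yield independent-looking streams, which is the property needed for per-user reproducibility.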

lenskit.util.proc_count(core_div=2, max_default=None, level=0)

Get the number of desired jobs for multiprocessing operations. This does not affect Numba or MKL multithreading.

This count can come from a number of sources:

  • The LK_NUM_PROCS environment variable

  • The number of CPUs, divided by core_div (default 2)

Parameters:
  • core_div (int or None) – The divisor to scale down the number of cores; None to turn off core-based fallback.

  • max_default – The maximum number of processes to use if the environment variable is not configured.

  • level – The process nesting level. 0 is the outermost level of parallelism; subsequent levels control nesting. Levels deeper than 1 are rare, and it isn’t expected that callers actually have an accurate idea of the threading nesting, just that they are configuring a child. If the process count is unconfigured, then level 1 will use core_div, and deeper levels will use 1.

Returns:

The number of jobs desired.

Return type:

int
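The resolution order described above can be sketched for the outermost level (level=0) only. The function desired_jobs is hypothetical, nesting levels are omitted, and two behaviours are assumptions rather than documented facts: that LK_NUM_PROCS holds a single integer, and that one job is used when core_div is None and the variable is unset. The env and cpus parameters exist only to make the sketch easy to test.

```python
import os


def desired_jobs(core_div=2, max_default=None, env=None, cpus=None):
    """Hypothetical sketch of the outermost-level (level=0) job count."""
    env = os.environ if env is None else env
    cpus = (os.cpu_count() or 1) if cpus is None else cpus

    configured = env.get("LK_NUM_PROCS")
    if configured is not None:
        # Assumption: the variable holds a single integer.
        return int(configured)

    if core_div is None:
        # Assumption: with the core-based fallback turned off, run one job.
        return 1

    # Scale the CPU count down by core_div, but never go below one job.
    jobs = max(cpus // core_div, 1)
    if max_default is not None:
        jobs = min(jobs, max_default)
    return jobs


print(desired_jobs(env={}, cpus=8))                     # 4
print(desired_jobs(env={}, cpus=8, max_default=2))      # 2
print(desired_jobs(env={"LK_NUM_PROCS": "3"}, cpus=8))  # 3
```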

lenskit.util.clone(algo)

Clone an algorithm, but not its fitted data. This is like sklearn.base.clone(), but may not work on arbitrary scikit-learn estimators. LensKit algorithms are compatible with scikit-learn's clone, however, so use that if you need more general capabilities.

This function is loosely derived from the scikit-learn implementation.

>>> from lenskit.util import clone
>>> from lenskit.algorithms.bias import Bias
>>> orig = Bias()
>>> copy = clone(orig)
>>> copy is orig
False
>>> copy.damping == orig.damping
True
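What "clone an algorithm, but not its fitted data" means can be shown with a toy class that does not depend on LensKit. ToyBias and clone_params_only are hypothetical names. The get_params method and the trailing-underscore naming for fitted attributes follow scikit-learn conventions. This is a sketch of the general idea, not of how lenskit.util.clone is implemented.

```python
class ToyBias:
    """Hypothetical estimator with one hyperparameter and some fitted state."""

    def __init__(self, damping=0.0):
        self.damping = damping

    def get_params(self):
        return {"damping": self.damping}

    def fit(self, ratings):
        # Fitted state uses a trailing underscore, following scikit-learn convention.
        self.mean_ = sum(ratings) / len(ratings)
        return self


def clone_params_only(algo):
    """Build a fresh instance from constructor parameters, dropping fitted state."""
    return type(algo)(**algo.get_params())


fitted = ToyBias(damping=5.0).fit([3.0, 4.0, 5.0])
fresh = clone_params_only(fitted)
print(fresh.damping, hasattr(fresh, "mean_"))  # 5.0 False
```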