Model Sharing

The lenskit.sharing module provides utilities for managing models and sharing them between processes, particularly for the multiprocessing in lenskit.batch.

Sharing Mode

The only piece of this module that algorithm developers usually need to handle directly is the ‘sharing mode’, which matters when implementing custom pickling logic. To save space, it is reasonable to exclude intermediate data structures, such as caches or inverse indexes, from the pickled representation of an algorithm, and reconstruct them when the model is loaded.

However, LensKit’s multi-process sharing also uses pickling to capture the object state while using shared memory for numpy.ndarray objects. In these cases, the intermediate structures should be included in the pickle, so they can be shared between model instances instead of being rebuilt in every worker.

To support this, we have the concept of sharing mode. Code that excludes objects when pickling should call in_share_context() to determine if that exclusion should actually happen.

lenskit.sharing.in_share_context()

Query whether sharing mode is active. If True, we are currently in a sharing_mode() context, which means model pickling will be used for cross-process sharing.

lenskit.sharing.sharing_mode(*args, **kwds)

Context manager to tell models that pickling will be used for cross-process sharing, not model persistence.
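
The pattern described above can be sketched as follows. This is a minimal illustration, not LensKit's implementation: in_share_context() and sharing_mode() are defined here as local stand-ins so the example runs without LensKit installed (real code would import them from lenskit.sharing), and ToyModel and its index_ attribute are hypothetical.

```python
import pickle
from contextlib import contextmanager

# Local stand-ins for lenskit.sharing.in_share_context / sharing_mode,
# so this sketch runs on its own. Real code would import the originals.
_sharing = False


def in_share_context():
    return _sharing


@contextmanager
def sharing_mode():
    global _sharing
    old = _sharing
    _sharing = True
    try:
        yield
    finally:
        _sharing = old


class ToyModel:
    """Hypothetical model with an inverse index that can be rebuilt."""

    def __init__(self, items):
        self.items = list(items)
        self._build_index()

    def _build_index(self):
        self.index_ = {item: pos for pos, item in enumerate(self.items)}

    def __getstate__(self):
        state = self.__dict__.copy()
        if not in_share_context():
            # Persistence: drop the derived index to save space.
            del state['index_']
        # Sharing: keep the index so workers do not rebuild it.
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        if 'index_' not in state:
            self._build_index()


model = ToyModel(['a', 'b', 'c'])
saved = pickle.dumps(model)       # persistence: index excluded
with sharing_mode():
    shared = pickle.dumps(model)  # sharing: index included
```

Either pickle unpickles to a usable model; the difference is only whether the index travels with the pickle or is rebuilt on load.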

Persistence API

These functions are used by internal LensKit infrastructure code to persist models into shared memory for parallel processing.

lenskit.sharing.persist(model, *, method=None)

Persist a model for cross-process sharing.

This will return a persisted model that can be used to reconstruct the model in a worker process (using reconstruct()).

If no method is provided, this function automatically selects a model persistence strategy from the following, in order:

  1. If LK_TEMP_DIR is set, use binpickle in shareable mode to save the object into the LensKit temporary directory.

  2. If multiprocessing.shared_memory is available, use pickle to save the model, placing the buffers into shared memory blocks.

  3. Otherwise, use binpickle in shareable mode to save the object into the system temporary directory.

Parameters
  • model (obj) – The model to persist.

  • method (str or None) – The method to use. Can be one of binpickle or shm.

Returns

The persisted object.

Return type

PersistedModel
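
The selection order documented for persist() can be sketched as a small decision function. This is only an illustration of the documented order, not LensKit's code: choose_persist_method and its environ parameter are hypothetical names introduced here, and the function returns only the method name rather than persisting anything.

```python
import os


def choose_persist_method(method=None, environ=None):
    """Hypothetical helper mirroring the documented selection order."""
    if method is not None:
        # An explicit method wins; the docs list binpickle and shm.
        if method not in ('binpickle', 'shm'):
            raise ValueError(f'invalid method {method!r}')
        return method

    environ = os.environ if environ is None else environ

    # 1. LK_TEMP_DIR is set: binpickle into the LensKit temp directory.
    if 'LK_TEMP_DIR' in environ:
        return 'binpickle'

    # 2. Shared memory is available (Python 3.8+): pickle with buffers
    #    placed in shared memory blocks.
    try:
        import multiprocessing.shared_memory  # noqa: F401
    except ImportError:
        # 3. Otherwise: binpickle into the system temp directory.
        return 'binpickle'
    return 'shm'
```

Passing environ explicitly keeps the sketch easy to exercise without touching the real process environment.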

class lenskit.sharing.PersistedModel

Bases: abc.ABC

A persisted model for inter-process model sharing.

These objects can be pickled for transmission to a worker process.

Note

Subclasses need to override the pickling protocol so that they are pickled correctly.

abstract get()

Get the persisted model, reconstructing it if necessary.

abstract close()

Release the persisted model resources. Should only be called in the parent process (will do nothing in a child process).

transfer()

Mark an object for ownership transfer. This object, when pickled, will unpickle into an owning model that frees resources when closed. Used to transfer ownership of shared memory resources from child processes to parent processes. Such an object should only be unpickled once.

The default implementation sets the is_owner attribute to 'transfer'.

Returns

self (for convenience)
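
To make the ownership rules concrete, here is a toy sketch of the interface described above. It does not use LensKit: PersistedModelSketch is a local stand-in for PersistedModel, BytesPersistedModel is a hypothetical subclass that keeps the pickled model in a bytes object instead of shared memory or a file, and the way is_owner is handled while pickling is an assumption made for illustration.

```python
import pickle
from abc import ABC, abstractmethod


class PersistedModelSketch(ABC):
    """Stand-in with the interface documented for PersistedModel."""
    is_owner = True

    @abstractmethod
    def get(self): ...

    @abstractmethod
    def close(self): ...

    def transfer(self):
        # Mirrors the documented default: mark for ownership transfer.
        self.is_owner = 'transfer'
        return self


class BytesPersistedModel(PersistedModelSketch):
    """Toy implementation holding the pickled model in memory."""

    def __init__(self, model):
        self.data = pickle.dumps(model)
        self._model = None
        self.is_owner = True

    def get(self):
        # Reconstruct lazily, then reuse the reconstructed model.
        if self._model is None:
            self._model = pickle.loads(self.data)
        return self._model

    def close(self):
        # Only an owner releases resources; other copies do nothing.
        if self.is_owner is True:
            self.data = None
            self._model = None

    def __getstate__(self):
        # Assumption: an unpickled copy owns the resources only if
        # this object was marked with transfer() before pickling.
        return {'data': self.data,
                'is_owner': self.is_owner == 'transfer'}

    def __setstate__(self, state):
        self.data = state['data']
        self.is_owner = state['is_owner']
        self._model = None


parent = BytesPersistedModel({'k': 1})
child = pickle.loads(pickle.dumps(parent))            # non-owning copy
moved = pickle.loads(pickle.dumps(parent.transfer())) # owning copy
```

In this sketch, child.close() leaves the data in place, while moved.close() releases it, which matches the documented intent that a transferred object unpickles into an owning model.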