Data Utilities

These are general-purpose data processing utilities.

Building Ratings Matrices

lenskit.data.sparse_ratings(ratings, scipy=False, *, users=None, items=None)

Convert a rating table to a sparse matrix of ratings.

Parameters
  • ratings (pandas.DataFrame) – a data table of (user, item, rating) triples.

  • scipy (bool or str) – if True or 'csr', return a SciPy CSR matrix (scipy.sparse.csr_matrix) instead of a CSR; if 'coo', return a SciPy COO matrix (scipy.sparse.coo_matrix).

  • users (pandas.Index) – an index of user IDs.

  • items (pandas.Index) – an index of item IDs.

Returns

a named tuple containing the sparse matrix, user index, and item index.

Return type

RatingMatrix
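
The conversion described above can be illustrated with a minimal sketch built only on pandas and SciPy. This is not LensKit's implementation; it only shows the idea of mapping user and item IDs to row and column numbers and filling a sparse matrix with the ratings. The toy data and the column names user, item, and rating are assumptions made for the example.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Hypothetical toy data laid out as (user, item, rating) triples.
ratings = pd.DataFrame({
    "user": [10, 10, 20, 30],
    "item": ["a", "b", "a", "c"],
    "rating": [4.0, 3.5, 5.0, 2.0],
})

# Indices of the distinct user and item IDs, in order of first appearance.
users = pd.Index(ratings["user"].unique(), name="user")
items = pd.Index(ratings["item"].unique(), name="item")

# Translate each ID into its row or column number.
rows = users.get_indexer(ratings["user"])
cols = items.get_indexer(ratings["item"])

# Users on rows, items on columns, ratings as the stored values.
matrix = csr_matrix(
    (ratings["rating"].to_numpy(), (rows, cols)),
    shape=(len(users), len(items)),
)

# Look up one rating by translating IDs through the indices.
value = matrix[users.get_loc(30), items.get_loc("c")]
```

The two indices are what make the matrix usable afterwards: a row or column number is meaningful only together with the index that maps it back to a user or item ID.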

class lenskit.data.RatingMatrix(matrix, users, items)

Bases: tuple

A rating matrix with associated indices.

matrix

The rating matrix, with users on rows and items on columns.

Type

CSR or scipy.sparse.csr_matrix

users

Mapping from user IDs to row numbers.

Type

pandas.Index

items

Mapping from item IDs to column numbers.

Type

pandas.Index
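
Because RatingMatrix is a tuple with the fields (matrix, users, items), a result can be read by field name, by position, or by unpacking. The sketch below uses a stand-in defined with collections.namedtuple rather than the LensKit class itself, and the field values are placeholder strings, so it only demonstrates the access patterns.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented class; this is NOT
# imported from LensKit, and the field values below are placeholders.
RatingMatrix = namedtuple("RatingMatrix", ["matrix", "users", "items"])

rm = RatingMatrix(
    matrix="<sparse matrix>",
    users="<user index>",
    items="<item index>",
)

# Fields can be read by name ...
by_name = rm.users
# ... or by position, since the object is a plain tuple.
by_position = rm[1]

# Tuple unpacking follows the documented field order.
matrix, users, items = rm
```

Access by name is usually easier to read, while unpacking is convenient when all three parts are needed at once.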
