Bias

The lenskit.algorithms.bias module contains the personalized-mean rating predictor.

class lenskit.algorithms.bias.Bias(items=True, users=True, damping=0.0)

Bases: lenskit.Predictor

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is item bias, and \(b_u\) is the user bias. With the provided damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*}
\mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \\
b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} \\
b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}}
\end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
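The formulas above can be sketched in plain Python. This is a minimal illustration of the arithmetic only, not LensKit's implementation: the function name fit_bias, its arguments, and the toy ratings are all assumptions made for this example.

```python
from collections import defaultdict


def fit_bias(ratings, user_damping=0.0, item_damping=0.0):
    """Hypothetical helper: compute (mu, b_i, b_u) from (user, item, rating) triples."""
    # Global mean rating, mu.
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item biases: damped mean of (r_ui - mu) over each item's ratings.
    item_sum = defaultdict(float)
    item_n = defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_n[i] += 1
    b_i = {i: item_sum[i] / (item_n[i] + item_damping) for i in item_sum}

    # User biases: damped mean of (r_ui - mu - b_i) over each user's ratings.
    user_sum = defaultdict(float)
    user_n = defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - b_i[i]
        user_n[u] += 1
    b_u = {u: user_sum[u] / (user_n[u] + user_damping) for u in user_sum}
    return mu, b_i, b_u


# Toy data (made up for illustration).
ratings = [
    ("alice", "A", 5.0),
    ("alice", "B", 3.0),
    ("bob", "A", 4.0),
    ("bob", "C", 1.0),
]
mu, b_i, b_u = fit_bias(ratings)
mu_d, b_i_d, b_u_d = fit_bias(ratings, user_damping=5.0, item_damping=5.0)
```

Item C has a single rating, so with a damping of 5 its bias is divided by 6 instead of 1 and shrinks towards zero, which is the effect described above.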

Parameters
  • items – whether to compute item biases

  • users – whether to compute user biases

  • damping (number or tuple) – Bayesian damping to apply to computed biases. Either a single number, to damp user and item biases by the same amount, or a (user, item) tuple providing separate damping values.
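The two accepted forms of damping can be illustrated with a small sketch. The helper split_damping is hypothetical and exists only for this example; it is not part of LensKit.

```python
def split_damping(damping):
    """Hypothetical helper: normalize damping to a (user_damping, item_damping) pair."""
    if isinstance(damping, tuple):
        # A (user, item) tuple gives separate damping values.
        user_damping, item_damping = damping
    else:
        # A single number damps user and item biases by the same amount.
        user_damping = item_damping = damping
    return user_damping, item_damping


same = split_damping(5)
separate = split_damping((10, 25))
```

Note the order of the tuple: the user damping comes first and the item damping second.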

mean_

The global mean rating.

Type

double

item_offsets_

The item offsets (\(b_i\) values)

Type

pandas.Series

user_offsets_

The user offsets (\(b_u\) values)

Type

pandas.Series

fit(ratings, **kwargs)

Train the bias model on some rating data.

Parameters

ratings (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.

Returns

The fitted bias object.

Return type

Bias

transform(ratings, *, indexes=False)

Transform ratings by removing the bias term. This method does not recompute user (or item) biases based on these ratings, but rather uses the biases that were estimated with fit().

Parameters
  • ratings (pandas.DataFrame) – The ratings to transform. Must contain at least user, item, and rating columns.

  • indexes (bool) – if True, the resulting frame will include uidx and iidx columns containing the 0-based user and item indexes for each rating.

Returns

A data frame with rating transformed by subtracting user-item bias prediction.

Return type

pandas.DataFrame
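As a rough illustration of what the transformation computes, the sketch below subtracts \(\mu + b_i + b_u\) from each rating. It uses plain Python triples and dictionaries rather than data frames; the function signature and the bias values are assumptions for this example, not LensKit's API.

```python
def transform(ratings, mu, b_i, b_u):
    """Hypothetical sketch: subtract the bias prediction from (user, item, rating) triples."""
    return [
        # The residual is the rating minus the user-item bias prediction.
        (u, i, r - (mu + b_i[i] + b_u[u]))
        for u, i, r in ratings
    ]


# Biases assumed to have been estimated earlier (values made up for illustration).
mu = 3.25
b_i = {"A": 1.25, "B": -0.25}
b_u = {"alice": 0.25, "bob": -0.25}

residuals = transform([("alice", "A", 5.0), ("bob", "A", 4.0)], mu, b_i, b_u)
```

The biases are passed in rather than recomputed, mirroring the statement above that transform() reuses the biases estimated by fit().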

inverse_transform(ratings)

Transform ratings by adding the bias term back in. This is the inverse of transform().

transform_user(ratings)

Transform a user’s ratings by subtracting the bias model.

Parameters

ratings (pandas.Series) – The user’s ratings: a series of rating values indexed by item.

Returns

The transformed ratings and the user bias.

Return type

pandas.Series

inverse_transform_user(user, ratings, user_bias=None)

Un-transform a user’s ratings by adding in the bias model.

Parameters
  • user – The user ID.

  • ratings (pandas.Series) – The user’s ratings, indexed by item.

  • user_bias (float or None) – If None, it looks up the user bias learned by fit.

Returns

The user’s de-normalized ratings.

Return type

pandas.Series
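The de-normalization can be sketched in plain Python as follows. This is an illustration under stated assumptions, not LensKit's code: the bias values are passed in explicitly as extra (hypothetical) arguments, ratings are a dictionary keyed by item, and treating an unknown user or item as having zero bias is a choice made for this sketch.

```python
def inverse_transform_user(user, residuals, mu, b_i, b_u, user_bias=None):
    """Hypothetical sketch: add mu, the item bias, and the user bias back to residuals."""
    if user_bias is None:
        # Fall back to the stored user bias (zero if the user is unknown; an assumption).
        user_bias = b_u.get(user, 0.0)
    return {i: r + mu + b_i.get(i, 0.0) + user_bias for i, r in residuals.items()}


# Biases and residuals made up for illustration.
mu = 3.25
b_i = {"A": 1.25, "B": -0.25}
b_u = {"alice": 0.25}

restored = inverse_transform_user("alice", {"A": 0.25, "B": -0.25}, mu, b_i, b_u)
overridden = inverse_transform_user("alice", {"A": 0.25}, mu, b_i, b_u, user_bias=0.0)
```

Passing user_bias explicitly is useful when the bias was computed outside the fitted model, for example for a user who was not in the training data.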

fit_transform(ratings, **kwargs)

Fit with ratings and return the training data transformed.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
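The scoring rule \(s(u,i) = \mu + b_i + b_u\) with zero bias for unknown users and items can be sketched as below. This is a plain-Python illustration, not LensKit's implementation: the extra arguments (mu, b_i, b_u, user_damping) are hypothetical, and recomputing the user bias with the damped-mean formula for \(b_u\) is an assumption based on the definitions given earlier.

```python
def predict_for_user(user, items, mu, b_i, b_u, ratings=None, user_damping=0.0):
    """Hypothetical sketch: score items as mu + b_i + b_u for one user."""
    if ratings:
        # Recompute the user bias from the supplied ratings (dict of item -> rating).
        resid = sum(r - mu - b_i.get(i, 0.0) for i, r in ratings.items())
        user_bias = resid / (len(ratings) + user_damping)
    else:
        # No ratings supplied: use the stored bias, or zero for an unknown user.
        user_bias = b_u.get(user, 0.0)
    # Unknown items contribute zero item bias.
    return {i: mu + b_i.get(i, 0.0) + user_bias for i in items}


# Biases made up for illustration.
mu = 3.25
b_i = {"A": 1.25, "B": -0.25}
b_u = {"alice": 0.25}

known = predict_for_user("alice", ["A", "Z"], mu, b_i, b_u)
unknown = predict_for_user("carol", ["A", "Z"], mu, b_i, b_u)
fresh = predict_for_user("carol", ["A"], mu, b_i, b_u, ratings={"B": 4.0})
```

For an unknown user and an unknown item, the score falls back to the global mean, as in unknown["Z"].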

property user_index

Get the user index from this (fit) bias.

property item_index

Get the item index from this (fit) bias.