Bias

The lenskit.algorithms.bias module provides the personalized mean rating predictor.

class lenskit.algorithms.bias.Bias(items=True, users=True, damping=0.0)

Bases: Predictor

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is item bias, and \(b_u\) is the user bias. With the provided damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*} \mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \\ b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} \\ b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}} \end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
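The formulas above can be sketched in plain Python. This is an illustrative re-implementation, not the library's code: the helper name fit_bias and the toy ratings are assumptions made for the example, and only the (user, item) ordering of the damping tuple is taken from the parameter description.

```python
from collections import defaultdict


def fit_bias(ratings, damping=0.0):
    """Estimate mu, b_i and b_u from (user, item, rating) triples.

    Hypothetical helper for illustration; it mirrors the formulas in the
    text, not the internals of lenskit.algorithms.bias.Bias.
    """
    # A single number damps users and items equally; a tuple is (user, item).
    if isinstance(damping, tuple):
        user_damp, item_damp = damping
    else:
        user_damp = item_damp = damping

    # Global mean rating.
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias: damped mean of (rating - mu) over each item's ratings.
    item_sum, item_n = defaultdict(float), defaultdict(int)
    for _, item, r in ratings:
        item_sum[item] += r - mu
        item_n[item] += 1
    b_i = {i: item_sum[i] / (item_n[i] + item_damp) for i in item_sum}

    # User bias: damped mean of (rating - mu - b_i) over each user's ratings.
    user_sum, user_n = defaultdict(float), defaultdict(int)
    for user, item, r in ratings:
        user_sum[user] += r - mu - b_i[item]
        user_n[user] += 1
    b_u = {u: user_sum[u] / (user_n[u] + user_damp) for u in user_sum}

    return mu, b_i, b_u


ratings = [
    ("alice", "A", 5.0), ("alice", "B", 3.0),
    ("bob", "A", 4.0), ("bob", "B", 2.0),
    ("carol", "A", 3.0),
]

mu, b_i, b_u = fit_bias(ratings, damping=0.0)
mu_d, b_i_d, b_u_d = fit_bias(ratings, damping=(5, 5))
```

In this toy data carol has a single rating, so her undamped bias takes the full residual of that rating, while the damped estimate is pulled towards zero. This is the effect the damping values are meant to have on low-information users and items.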

Parameters

items – whether to compute item biases
users – whether to compute user biases
damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user,item) tuple providing separate damping values.

mean_

The global mean rating.

Type: double

item_offsets_

The item offsets (\(b_i\) values)

Type: pandas.Series

user_offsets_

The user offsets (\(b_u\) values)

Type: pandas.Series

fit(ratings, **kwargs)

Train the bias model on some rating data.

Parameters: ratings (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.
Returns: the fit bias object.
Return type: Bias

transform(ratings, *, indexes=False)

Transform ratings by removing the bias term. This method does not recompute user (or item) biases based on these ratings, but rather uses the biases that were estimated with fit().

Parameters

ratings (pandas.DataFrame) – The ratings to transform. Must contain at least user, item, and rating columns.
indexes (bool) – if True, the resulting frame will include uidx and iidx columns containing the 0-based user and item indexes for each rating.

Returns

A data frame with the rating column transformed by subtracting the user-item bias prediction.

Return type

pandas.DataFrame

inverse_transform(ratings): Transform ratings by removing the bias term.

transform_user(ratings)

Transform a user’s ratings by subtracting the bias model.

Parameters: ratings (pandas.Series) – The user’s ratings, indexed by item, with the rating values as the series values.
Returns: The transformed ratings and the user bias.
Return type: pandas.Series

inverse_transform_user(user, ratings, user_bias=None)

Un-transform a user’s ratings by adding in the bias model.

Parameters

user – The user ID.
ratings (pandas.Series) – The user’s ratings, indexed by item.
user_bias (float or None) – The user bias to add back. If None, the user bias learned by fit() is looked up.

Returns

The user’s de-normalized ratings.

Return type

pandas.Series

fit_transform(ratings, **kwargs): Fit with ratings and return the training data transformed.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
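The prediction rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: MU, ITEM_BIAS and USER_BIAS are made-up fitted values, the function returns a dict rather than a pandas.Series, and the user_damping argument (applied when the bias is recomputed from supplied ratings) is a hypothetical parameter of this sketch.

```python
# Hypothetical fitted values standing in for a trained bias model.
MU = 3.4
ITEM_BIAS = {"A": 0.6, "B": -0.9}
USER_BIAS = {"alice": 0.75, "bob": -0.25}


def predict_for_user(user, items, ratings=None, user_damping=0.0):
    """Score items as mu + b_i + b_u, with zero bias for unknown IDs.

    ratings, if given, maps item id -> rating and is used to recompute
    the user's bias instead of looking up the stored one.
    """
    if ratings:
        # Recompute b_u from the supplied ratings, as in the b_u formula.
        resid = [r - MU - ITEM_BIAS.get(i, 0.0) for i, r in ratings.items()]
        b_u = sum(resid) / (len(resid) + user_damping)
    else:
        # Fall back to the stored user bias, or zero for an unknown user.
        b_u = USER_BIAS.get(user, 0.0)

    return {i: MU + ITEM_BIAS.get(i, 0.0) + b_u for i in items}


known = predict_for_user("alice", ["A", "Z"])        # "Z" is an unknown item
unknown = predict_for_user("zed", ["Z"])             # unknown user and item
fresh = predict_for_user("zed", ["B"], ratings={"A": 3.0})
```

An unknown user scored on an unknown item receives the global mean, while supplying ratings lets a user who was absent from training still get a personalized offset.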

property user_index: Get the user index from this (fit) bias.

property item_index: Get the item index from this (fit) bias.