Bayesian Personalized Ranking

This is a TensorFlow-based implementation of BPR.

class lenskit_tf.BPR(features=50, *, epochs=5, batch_size=10000, reg=0.02, neg_count=1, neg_weight=True, rng_spec=None)

Bases: Predictor

Bayesian Personalized Ranking with matrix factorization, optimized with TensorFlow.

This is a basic TensorFlow implementation of the BPR algorithm [BPR].
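For orientation, the pairwise objective that BPR optimizes can be sketched in plain Python. This is a minimal illustration of the standard BPR criterion, \(-\ln \sigma(\hat{x}_{ui} - \hat{x}_{uj})\), with matrix-factorization scores; it is not the TensorFlow code used by this class. The helper names (`dot`, `bpr_pair_loss`) and the embedding values are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def bpr_pair_loss(user_vec, pos_vec, neg_vec):
    """Pairwise BPR loss: -ln(sigmoid(score_pos - score_neg)).

    Written as softplus(-diff) in a numerically stable form.
    """
    diff = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical 2-feature embeddings, for illustration only.
user = [1.0, 0.5]
liked = [0.8, 0.2]    # an item the user interacted with
unseen = [0.1, 0.3]   # a sampled negative item

# The loss is lower when the positive item outscores the negative one.
loss_correct = bpr_pair_loss(user, liked, unseen)
loss_swapped = bpr_pair_loss(user, unseen, liked)
```

The loss depends only on the difference between the two scores, so the learned scores are meaningful for ranking items rather than as rating predictions.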

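The user and item embedding matrices are \(L_2\)-regularized, governed by a single regularization term \(\lambda\) (the reg parameter). The regularization strengths for the user and item embeddings are obtained by dividing \(\lambda\) by the number of users \(|U|\) and the number of items \(|I|\), respectively: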

\[\begin{split}\lambda_u = \lambda / |U| \\ \lambda_i = \lambda / |I|\end{split}\]

This rescaling allows the regularization term to be independent of the number of users and items.
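A small sketch may make the rescaling concrete. It assumes the penalty is the scaled sum of squared embedding entries, which is one common convention for \(L_2\) regularization and not necessarily the exact form used internally; `scaled_regs` and `l2_penalty` are hypothetical helpers.

```python
def scaled_regs(reg, n_users, n_items):
    """Return (lambda_u, lambda_i) = (reg / |U|, reg / |I|)."""
    return reg / n_users, reg / n_items


def l2_penalty(matrix, lam):
    """lam times the sum of squared entries of a list-of-rows matrix.

    Assumption: a plain sum-of-squares penalty.
    """
    return lam * sum(x * x for row in matrix for x in row)


reg = 0.02  # default value from the class signature

# Two hypothetical user populations whose embeddings have the same
# per-user squared norm but differ in size by a factor of 100.
small_users = [[1.0, 1.0] for _ in range(10)]
large_users = [[1.0, 1.0] for _ in range(1000)]

lam_small, _ = scaled_regs(reg, n_users=len(small_users), n_items=200)
lam_large, _ = scaled_regs(reg, n_users=len(large_users), n_items=200)

# With the rescaling, the total penalty does not grow with |U|.
penalty_small = l2_penalty(small_users, lam_small)
penalty_large = l2_penalty(large_users, lam_large)
```

Under these assumptions both penalties come out equal (up to floating-point rounding), so the same reg value behaves comparably on data sets of different sizes.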

Because the model is relatively simple, optimization works best with large batch sizes.

Parameters
  • features (int) – The number of latent features to learn.

  • epochs (int) – The number of epochs to train.

  • batch_size (int) – The Keras batch size. This is the number of positive examples to sample in each batch. If neg_count is greater than 1, the batch size is multiplied accordingly.

  • reg (double) – The regularization term for the embedding vectors.

  • neg_count (int) – The number of negative examples to sample for each positive one.

  • neg_weight (bool) – Whether to weight negative sampling by item popularity (True) or not (False).

  • rng_spec – The random number generator initialization.
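The interaction of neg_count and neg_weight can be illustrated with a plain-Python sketch of negative sampling. This is an assumption-laden illustration, not the sampler used by this class: `sample_triples` is a hypothetical helper, popularity is taken to be the number of positive interactions per item, and negatives are restricted to items the user has not interacted with.

```python
import random
from collections import Counter


def sample_triples(positives, neg_count=1, neg_weight=True, seed=0):
    """Build (user, positive_item, negative_item) training triples.

    For each positive pair, draw neg_count items the user has not
    interacted with, weighted by popularity when neg_weight is True
    and uniformly otherwise.
    """
    rng = random.Random(seed)
    popularity = Counter(item for _, item in positives)
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    triples = []
    for user, pos in positives:
        candidates = [i for i in sorted(popularity) if i not in seen[user]]
        if not candidates:
            continue  # the user has interacted with every item
        weights = [popularity[i] for i in candidates] if neg_weight else None
        for neg in rng.choices(candidates, weights=weights, k=neg_count):
            triples.append((user, pos, neg))
    return triples


# Hypothetical interaction data: (user, item) pairs.
positives = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "c"), ("u3", "a")]
triples = sample_triples(positives, neg_count=2)
```

In this sketch each positive example yields neg_count triples, which is one way to read the statement that the batch size is multiplied when neg_count is greater than 1.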

model

The Keras model.

fit(ratings, **kwargs)

Train a model using the specified ratings (or similar) data.

Parameters
  • ratings (pandas.DataFrame) – The ratings data.

  • kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item ID); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

Scores for the items, indexed by item ID.

Return type

pandas.Series
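As a rough sketch of what scoring looks like in a matrix-factorization model, the following plain-Python example scores requested items by the inner product of user and item embeddings. It uses a dict where the real method returns a pandas.Series, and the treatment of unknown items as NaN is an assumption of this sketch; `score_items` and the embedding values are hypothetical.

```python
import math


def score_items(user_vec, item_vecs, items):
    """Score each requested item by its inner product with the user vector.

    Items without an embedding get NaN (an assumption of this sketch).
    """
    scores = {}
    for item in items:
        vec = item_vecs.get(item)
        if vec is None:
            scores[item] = math.nan
        else:
            scores[item] = sum(u * v for u, v in zip(user_vec, vec))
    return scores


# Hypothetical learned embeddings with 2 features.
item_vecs = {"a": [0.8, 0.2], "b": [0.1, 0.3]}
scores = score_items([1.0, 0.5], item_vecs, ["a", "b", "z"])

# Rank the scorable items from highest to lowest score.
ranked = sorted(
    (i for i in scores if not math.isnan(scores[i])),
    key=scores.get,
    reverse=True,
)
```

Because BPR is trained on pairwise preferences, scores like these are best used to order items for a user rather than interpreted as predicted ratings.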