Biased MF
These models implement the standard biased matrix factorization model, like lenskit.algorithms.als.BiasedMF, but learn the model parameters using TensorFlow’s gradient descent instead of the alternating least squares algorithm. There are two implementations:
- lenskit_tf.BiasedMF learns a matrix factorization to predict the residuals of lenskit.algorithms.bias.Bias.
- lenskit_tf.IntegratedBiasMF uses TensorFlow to learn the entire model, including both biases and embeddings.
Bias-Based
- class lenskit_tf.BiasedMF(features=50, *, bias=True, damping=5, epochs=5, batch_size=10000, reg=0.02, rng_spec=None)
Bases: MFPredictor
Biased matrix factorization model for explicit feedback, optimized with TensorFlow.
This is a basic TensorFlow implementation of the biased matrix factorization model for rating prediction:
\[s(i|u) = b + b_u + b_i + \vec{p}_u \cdot \vec{q}_i\]
User and item embedding matrices are regularized with \(L_2\) regularization, governed by a regularization term \(\lambda\). Regularizations for the user and item embeddings are then computed as follows:
\[\begin{split}\lambda_u = \lambda / |U| \\ \lambda_i = \lambda / |I| \\\end{split}\]
This rescaling allows the regularization term to be independent of the number of users and items.
Because the model is very simple, this algorithm works best with large batch sizes.
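As a minimal sketch of the two formulas above, the following NumPy snippet computes one score and the rescaled regularization weights. The function names `score` and `rescaled_reg` and all numeric values are illustrative assumptions; they are not part of lenskit_tf.

```python
import numpy as np


def score(b, b_u, b_i, p_u, q_i):
    """s(i|u) = b + b_u + b_i + p_u . q_i for a single user-item pair."""
    return b + b_u + b_i + float(np.dot(p_u, q_i))


def rescaled_reg(reg, n_users, n_items):
    """Divide the regularization term by the number of users and of items."""
    return reg / n_users, reg / n_items


# Hypothetical values, chosen only to exercise the formulas.
s = score(3.5, 0.2, -0.1, np.array([1.0, 0.5]), np.array([0.4, 0.2]))
lam_u, lam_i = rescaled_reg(0.02, n_users=1000, n_items=200)
print(s, lam_u, lam_i)
```

With the rescaling, the same `reg` value yields a smaller per-embedding weight on the side (users or items) that has more entries.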
This implementation uses lenskit.algorithms.bias.Bias for computing the biases, and uses TensorFlow to fit a matrix factorization on the residuals. It then extracts the resulting matrices, and relies on MFPredictor to implement the prediction logic, like lenskit.algorithms.als.BiasedMF. Its code is suitable as an example of how to build a Keras/TensorFlow algorithm implementation for LensKit where TF is only used in the train stage.
A variety of resources informed the design, most notably this one.
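The residual step described above can be sketched with pandas. This is an illustration only: the damped-mean formulas (item offsets first, then user offsets) and the helper name `damped_bias_residuals` are assumptions about how a bias model of this kind typically works, and the `user`, `item`, and `rating` column names are assumed as well. The matrix factorization would then be fit to the returned residuals.

```python
import pandas as pd


def damped_bias_residuals(ratings, damping=5.0):
    """Return (global mean, item bias, user bias, residuals) for a ratings frame."""
    mean = ratings["rating"].mean()
    centered = ratings["rating"] - mean

    # Damped item offsets: sum of deviations divided by (count + damping).
    grp = centered.groupby(ratings["item"])
    item_bias = grp.sum() / (grp.count() + damping)

    # Damped user offsets, computed after the item offsets are removed.
    centered = centered - ratings["item"].map(item_bias)
    grp = centered.groupby(ratings["user"])
    user_bias = grp.sum() / (grp.count() + damping)

    # What remains is the target for the matrix factorization.
    residuals = centered - ratings["user"].map(user_bias)
    return mean, item_bias, user_bias, residuals


# Tiny hypothetical data set.
ratings = pd.DataFrame({
    "user": [1, 1, 2, 2],
    "item": [10, 20, 10, 20],
    "rating": [4.0, 2.0, 5.0, 3.0],
})
mean, item_bias, user_bias, residuals = damped_bias_residuals(ratings, damping=2.0)
print(residuals)
```

At prediction time the pieces are recombined: the bias terms are added back to the dot product of the learned user and item vectors.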
- Parameters
features (int) – The number of latent features to learn.
bias – The bias model to use.
damping – The bias damping, if bias is True.
epochs (int) – The number of epochs to train.
batch_size (int) – The Keras batch size.
reg (double) – The regularization term \(\lambda\) used to derive embedding vector regularizations.
rng_spec – The random number generator initialization.
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type
Fully Integrated
- class lenskit_tf.IntegratedBiasMF(features=50, *, epochs=5, batch_size=10000, reg=0.02, bias_reg=0.2, rng_spec=None)
Bases: Predictor
Biased matrix factorization model for explicit feedback, optimizing both bias and embeddings with TensorFlow.
This is a basic TensorFlow implementation of the biased matrix factorization model for rating prediction:
\[s(i|u) = b + b_u + b_i + \vec{p}_u \cdot \vec{q}_i\]
User and item embedding matrices are regularized with \(L_2\) regularization, governed by a regularization term \(\lambda\). Regularizations for the user and item embeddings are then computed as follows:
\[\begin{split}\lambda_u = \lambda / |U| \\ \lambda_i = \lambda / |I| \\\end{split}\]
This rescaling allows the regularization term to be independent of the number of users and items. The same rescaling applies to the bias regularization.
Because the model is very simple, this algorithm works best with large batch sizes.
This implementation uses TensorFlow to fit the entire model, including user/item biases and residuals, and uses TensorFlow to do the final predictions as well. Its code is suitable as an example of how to build a Keras/TensorFlow algorithm implementation for LensKit where TF is used for the entire process.
A variety of resources informed the design, most notably this one and Chin-chi Hsu's example code.
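To illustrate what learning biases and embeddings together by gradient descent involves, here is a rough NumPy sketch. It is not the library's code: plain NumPy stands in for TensorFlow/Keras, full-batch updates stand in for mini-batches, the loss is assumed to be mean squared error plus the rescaled \(L_2\) penalties, and the global bias is held fixed at the rating mean. The name `fit_integrated` and the `lr` parameter are hypothetical.

```python
import numpy as np


def fit_integrated(users, items, ratings, n_users, n_items, features=4,
                   epochs=50, lr=0.05, reg=0.02, bias_reg=0.2, seed=0):
    """Full-batch gradient descent on MSE plus rescaled L2 penalties."""
    rng = np.random.default_rng(seed)
    b = ratings.mean()  # global bias, held fixed in this sketch
    b_u, b_i = np.zeros(n_users), np.zeros(n_items)
    P = rng.normal(scale=0.1, size=(n_users, features))
    Q = rng.normal(scale=0.1, size=(n_items, features))

    # Rescale the regularization terms by the number of users and items.
    lam_u, lam_i = reg / n_users, reg / n_items
    beta_u, beta_i = bias_reg / n_users, bias_reg / n_items

    n = len(ratings)
    history = []
    for _ in range(epochs):
        pred = b + b_u[users] + b_i[items] + np.sum(P[users] * Q[items], axis=1)
        err = pred - ratings
        history.append(float(np.mean(err ** 2)))

        g = 2.0 * err / n  # derivative of the MSE w.r.t. each prediction
        g_bu, g_bi = np.zeros(n_users), np.zeros(n_items)
        g_P, g_Q = np.zeros_like(P), np.zeros_like(Q)
        # Accumulate per-rating gradients onto the rows they belong to.
        np.add.at(g_bu, users, g)
        np.add.at(g_bi, items, g)
        np.add.at(g_P, users, g[:, None] * Q[items])
        np.add.at(g_Q, items, g[:, None] * P[users])

        # Gradient step, including the gradient of each L2 penalty.
        b_u -= lr * (g_bu + 2.0 * beta_u * b_u)
        b_i -= lr * (g_bi + 2.0 * beta_i * b_i)
        P -= lr * (g_P + 2.0 * lam_u * P)
        Q -= lr * (g_Q + 2.0 * lam_i * Q)
    return b, b_u, b_i, P, Q, history


# Tiny hypothetical data set: three users, two items, zero-based indices.
users = np.array([0, 0, 1, 1, 2, 2])
items = np.array([0, 1, 0, 1, 0, 1])
ratings = np.array([5.0, 1.0, 5.0, 1.0, 4.0, 2.0])
*_, history = fit_integrated(users, items, ratings, n_users=3, n_items=2)
print(history[0], history[-1])
```

Because every parameter is updated from the same loss, there is no separate bias-fitting pass, which is the main difference from the residual-based variant.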
- Parameters
features (int) – The number of latent features to learn.
epochs (int) – The number of epochs to train.
batch_size (int) – The Keras batch size.
reg (double) – The regularization term for the embedding vectors.
bias_reg (double) – The regularization term for the bias vectors.
rng_spec – The random number generator initialization.
- model
The Keras model.
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type