Evaluating Recommender Output

LensKit’s evaluation support is based on post-processing the output of recommenders and predictors; the batch utilities generate these outputs.
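For example, a batch run pairs a fitted recommender with a set of test users. The following is a minimal sketch using the lenskit.batch utilities; the file paths, the train/test split, and the ItemItem configuration are illustrative assumptions, not prescribed by this section.

```python
import pandas as pd
from lenskit import batch
from lenskit.algorithms import Recommender
from lenskit.algorithms.item_knn import ItemItem

# hypothetical train/test split: data frames with user, item, and rating columns
train = pd.read_parquet('splits/train.parquet')
test = pd.read_parquet('splits/test.parquet')

# wrap the item-item scorer so it can produce ranked top-N lists
algo = Recommender.adapt(ItemItem(20))
algo.fit(train)

# top-10 recommendations for each test user; the result is a data
# frame with user, item, score, and rank columns
recs = batch.recommend(algo, test['user'].unique(), 10)

# rating predictions for the held-out (user, item) pairs
preds = batch.predict(algo, test)
```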

We generally recommend using Jupyter notebooks for evaluation.

When writing recommender system evaluation results for publication, it’s important to be precise about exactly how your metrics are computed [TDV21]; to aid with this, each metric function’s documentation includes a mathematical definition of the metric.
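As a sketch of how those metric functions are applied, the list-based metrics can be computed over a batch run with RecListAnalysis; here `recs` and `test` are assumed to come from a run like the one above.

```python
from lenskit.topn import RecListAnalysis
from lenskit.metrics.topn import ndcg, recip_rank

# score each recommendation list against the held-out test ratings
rla = RecListAnalysis()
rla.add_metric(ndcg)
rla.add_metric(recip_rank)
results = rla.compute(recs, test)

# `results` has one row per recommendation list; summarize across users
print(results.mean())
```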

Saving and Loading Outputs

In our own experiments, we typically store the output of LensKit recommendation runs in CSV or Parquet files, along with whatever parameters are relevant from the configuration.
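A minimal sketch of that convention with pandas (the file name and the configuration columns are illustrative assumptions):

```python
import pandas as pd

# tag the run's output with the configuration that produced it
recs['algo'] = 'item-item'
recs['nnbrs'] = 20
recs.to_parquet('runs/item-item-20.parquet', index=False)

# later, in the analysis notebook, reload the outputs
recs = pd.read_parquet('runs/item-item-20.parquet')
```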