Compare multiple models

As you continue iterating on your model and your data, you will likely end up with many Model Runs. Every Model Run represents an experiment (i.e., an instance of training a specific model on some specific data).

At this point, machine learning teams typically want to compare the predictions and the performance of their different models. The goal of comparing models is to measure and understand the marginal value of every machine learning iteration (additional labels, re-worked labels, modeling improvements, hyperparameter fine-tuning, and so on).

Models is designed to help you compare Model Runs: you can visualize predictions and compare metrics between Model Runs.

How model comparison works

You will need a model with 2 or more Model Runs to use this feature. Visit these pages to get started:

Select Model Runs

Once you have selected the first Model Run, open the Compare against drop-down in the menu bar and select the second Model Run to run the comparison. Labelbox automatically assigns each Model Run a different color so you can tell them apart in metrics and visualizations.


Compare a Model Run, inside a Model, to any other Model Run, inside the same Model


Compare two Model Runs visually

From the gallery, you can click an individual data row to expand it. Using the sidebar, you can toggle which ground truth and Model Runs to view. To learn more about how to view predictions, look here.


Compare two Model Runs with metrics

We support comparison of scalar and confusion matrix metrics.
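To make the two metric types concrete, here is a minimal sketch in plain Python. It does not use the Labelbox SDK or reproduce how Labelbox computes its metrics; the labels, the two prediction lists, and the helper names `accuracy` and `confusion_matrix` are hypothetical and exist only for illustration. It computes one scalar metric (accuracy) and a confusion matrix for each of two Model Runs against the same ground truth.

```python
from collections import Counter

def accuracy(truth, preds):
    """Scalar metric: fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(truth, preds)) / len(truth)

def confusion_matrix(truth, preds, classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    counts = Counter(zip(truth, preds))
    return [[counts[(t, p)] for p in classes] for t in classes]

# Hypothetical data: one shared ground truth, two Model Runs.
classes = ["cat", "dog"]
ground_truth = ["cat", "cat", "dog", "dog", "dog"]
run_a = ["cat", "dog", "dog", "dog", "cat"]
run_b = ["cat", "cat", "dog", "dog", "cat"]

acc_a = accuracy(ground_truth, run_a)
acc_b = accuracy(ground_truth, run_b)
cm_a = confusion_matrix(ground_truth, run_a, classes)
cm_b = confusion_matrix(ground_truth, run_b, classes)

# The difference in a scalar metric is one way to express the
# marginal value of the iteration that produced run B.
print(f"accuracy A={acc_a:.2f}  B={acc_b:.2f}  delta={acc_b - acc_a:+.2f}")
print("confusion matrix A:", cm_a)
print("confusion matrix B:", cm_b)
```

A scalar metric gives a single number per Model Run that is easy to compare side by side, while the confusion matrix shows which classes each run confuses with one another.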

