Evaluate model performance

Once you upload predictions and metrics to your Model Run, you can evaluate your model performance using Models. Measuring your model's performance with quantitative metrics is critical for understanding its behavior.

Follow these steps to easily measure and report on model quality:

  1. Go to the Models tab.

  2. Select the Model you want to evaluate.

  3. Select the Model Run you want to evaluate. By default, you will see the Gallery view.

  4. Go to the Metrics view by clicking the icon on the right (see screenshot below). This view is where you can see model metrics and measure model performance.


Scalar Metrics

Scalar metrics (positive real value metrics) show up as histograms in the user interface. Each bar of the histogram corresponds to a class.

A powerful feature of Labelbox is that histograms are interactive. If you click on any bar of any histogram, it will open the gallery view in the Models tab and automatically filter and sort the Model Run data. More precisely:

  • Labelbox will show only the Data Rows corresponding to the histogram bar you clicked on
  • Labelbox will sort those Data Rows by the metric of the histogram you clicked on

This way, you quickly gain insights about your model's behavior, by easily toggling between a quantitative and qualitative view of your Model Run.
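Conceptually, clicking a histogram bar applies a class filter and a metric sort to the Model Run's Data Rows. The sketch below illustrates that behavior in plain Python; the record fields (`data_row_id`, `class_name`, `metric_name`, `value`) are hypothetical stand-ins, not the Labelbox API — the actual filtering and sorting happens in the UI.

```python
# Illustrative sketch of what clicking a histogram bar does: filter the
# Model Run's Data Rows to the clicked class, then sort them by the
# clicked metric. Field names are hypothetical, not the Labelbox API.

scalar_metrics = [
    {"data_row_id": "dr-1", "class_name": "car",  "metric_name": "iou", "value": 0.91},
    {"data_row_id": "dr-2", "class_name": "car",  "metric_name": "iou", "value": 0.47},
    {"data_row_id": "dr-3", "class_name": "tree", "metric_name": "iou", "value": 0.83},
]

def select_histogram_bar(metrics, class_name, metric_name):
    """Return Data Rows for one histogram bar: filtered by class, sorted by metric."""
    bar = [m for m in metrics
           if m["class_name"] == class_name and m["metric_name"] == metric_name]
    return sorted(bar, key=lambda m: m["value"])

rows = select_histogram_bar(scalar_metrics, "car", "iou")
print([m["data_row_id"] for m in rows])  # lowest-scoring "car" Data Rows first
```

Sorting worst-first is what makes this workflow useful: the Data Rows where the model struggles most surface at the top of the gallery.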


Confusion Matrix Metrics

A ConfusionMatrixMetric contains four counts: true positives, false positives, true negatives, and false negatives. In the user interface, these counts are used to derive precision, recall, and F1 scores.
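The derivation from the four counts is the standard one. A minimal sketch (plain Python, not the Labelbox API) of how precision, recall, and F1 follow from the counts:

```python
# Deriving precision, recall, and F1 from confusion-matrix counts
# (tp, fp, tn, fn). Guards avoid division by zero when a count sum is 0.

def derived_scores(tp: int, fp: int, tn: int, fn: int):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

p, r, f1 = derived_scores(tp=8, fp=2, tn=5, fn=2)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.8 0.8 0.8
```

Note that true negatives do not enter any of the three scores; they are still part of the metric because other analyses (e.g. accuracy) need them.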

Confusion Matrix Metrics are not clickable. However, you can filter and sort these metrics, both in the Metrics view and in the Gallery view.


Model metrics on each data split

Machine learning teams typically want to measure model metrics on a particular data split. You can visualize model metrics for a specific data split by clicking on Train, Validate, or Test. The metrics will update, in the user interface, to reflect the performance of the model on each split of the Model Run.
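Under the hood, selecting a split amounts to restricting the aggregation to Data Rows in that split. A hypothetical sketch of that recomputation (the records and field names are illustrative, not the Labelbox API):

```python
# Illustrative sketch: recomputing a mean metric per data split, mirroring
# how the Metrics view updates when you click Train, Validate, or Test.
# The records below are made-up stand-ins for per-Data-Row metric values.

records = [
    {"split": "train",    "value": 0.92},
    {"split": "train",    "value": 0.88},
    {"split": "validate", "value": 0.81},
    {"split": "test",     "value": 0.78},
]

def mean_metric_for_split(recs, split):
    """Average a metric over only the Data Rows belonging to one split."""
    vals = [r["value"] for r in recs if r["split"] == split]
    return sum(vals) / len(vals) if vals else None

print(round(mean_metric_for_split(records, "train"), 4))  # 0.9
```

Comparing the same metric across Train, Validate, and Test is a quick way to spot overfitting: a large gap between the train and validate numbers is the classic symptom.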


Model metrics update dynamically based on the data split you select

Model metrics on a subset of data

It is common practice to evaluate model performance on a subset of the training data. Labelbox enables you to apply a set of filters to your Model Run and inspect your model's performance on the filtered Data Rows.
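Applying several filters is equivalent to keeping only the Data Rows that satisfy every filter, then recomputing metrics over that subset. A plain-Python sketch of that idea (the filtering itself happens in the Labelbox UI; these records and predicates are illustrative):

```python
# Illustrative sketch of filtering a Model Run's Data Rows before
# aggregating a metric, as the UI does when you apply a set of filters.
# Each filter is modeled as a predicate; a Data Row must match all of them.

rows = [
    {"id": "dr-1", "annotation": "car",  "iou": 0.91},
    {"id": "dr-2", "annotation": "car",  "iou": 0.47},
    {"id": "dr-3", "annotation": "tree", "iou": 0.83},
]

filters = [
    lambda r: r["annotation"] == "car",  # keep only "car" annotations
    lambda r: r["iou"] < 0.5,            # keep only low-IoU predictions
]

subset = [r for r in rows if all(f(r) for f in filters)]
print([r["id"] for r in subset])  # Data Rows matching every filter
```

This is the same filter-then-aggregate pattern as the split selector above, just with arbitrary conditions instead of a fixed split field.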


Model metrics update dynamically based on the filters you select

Slices: saving a set of filters

You can save any set of filters by turning the filtered Data Rows into a Slice. To do so, click the Save slice button. You can then easily access these Data Rows again by clicking Select slice... and selecting the Slice you created.
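Conceptually, a Slice is a named, saved set of filters that can be re-applied later. A purely illustrative sketch of that concept (in Labelbox this is done through the Save slice and Select slice... buttons, not code):

```python
# Conceptual sketch of a Slice: a saved, named set of filters that can be
# re-applied to Data Rows later. Illustrative only, not the Labelbox API.

saved_slices = {}

def save_slice(name, filters):
    """Persist a named set of filter predicates."""
    saved_slices[name] = filters

def select_slice(name, rows):
    """Re-apply a saved Slice's filters to a collection of Data Rows."""
    filters = saved_slices[name]
    return [r for r in rows if all(f(r) for f in filters)]

save_slice("low-iou cars",
           [lambda r: r["annotation"] == "car", lambda r: r["iou"] < 0.5])

rows = [
    {"id": "dr-2", "annotation": "car",  "iou": 0.47},
    {"id": "dr-3", "annotation": "tree", "iou": 0.83},
]
matches = select_slice("low-iou cars", rows)
print([r["id"] for r in matches])  # the saved filters re-applied
```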
