Find and fix labeling mistakes

Generally speaking, labeling teams and machine learning teams care about surfacing poor-quality labels for two reasons:

  • Your model is only as good as the data you train it on. Therefore, it is critical to use high-quality labels for training your model.
  • Finding labeling mistakes lets you give feedback to labelers, which helps them improve.

With Labelbox, you can easily find and fix labeling errors. The goal is to surface Data Rows where model predictions and ground truth labels disagree, since such disagreements are often caused by labeling mistakes. It is best practice to rework these poor-quality labels: doing so keeps labeling teams performing well and makes the machine learning model more robust.

Use a trained model to find label errors

Model predictions and model metrics are useful tools for finding incorrectly labeled data. Machine learning models have different performance characteristics than human labelers. For example, unlike a human, a model does not get tired.

To use this workflow, you first have to upload model predictions and model metrics for your labeled data. In other words, you should upload predictions and metrics to the Model Run that contains the labeled data used to train your model.



A great way to surface label errors is to find predictions where the model disagrees strongly with ground truth labels, yet the model is very confident.
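The idea above can be sketched in a few lines of plain Python. This is only an illustration of the selection logic, not Labelbox SDK code: the `RowMetrics` class, the `suspected_label_errors` function, the field names, and the sample values are all hypothetical, and it assumes you already have one IOU value and one confidence value per Data Row.

```python
from dataclasses import dataclass


@dataclass
class RowMetrics:
    """Hypothetical per-Data-Row record; not a Labelbox SDK type."""
    data_row_id: str
    iou: float         # agreement between prediction and ground truth
    confidence: float  # model confidence for the prediction


def suspected_label_errors(rows, max_iou=0.5, top_k=100):
    """Keep low-IOU rows, then order them by decreasing confidence."""
    disagreements = [r for r in rows if r.iou <= max_iou]
    disagreements.sort(key=lambda r: r.confidence, reverse=True)
    return disagreements[:top_k]


# Made-up sample values, for illustration only.
rows = [
    RowMetrics("row-a", iou=0.92, confidence=0.97),
    RowMetrics("row-b", iou=0.31, confidence=0.95),
    RowMetrics("row-c", iou=0.12, confidence=0.40),
    RowMetrics("row-d", iou=0.45, confidence=0.88),
]

for r in suspected_label_errors(rows):
    print(r.data_row_id, r.iou, r.confidence)
```

Rows at the top of this ordering (low IOU, high confidence) are the ones most worth inspecting by hand first.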

Find and fix labeling mistakes

  1. Go to the Models tab. Open the Model and Model Run you want to find label errors on.

  2. Filter Data Rows to keep only disagreements between model predictions and ground truth labels. To do so, you can add a filter on metrics in order to keep only Data Rows with low metrics (e.g. IOU between 0 and 0.5).

Surface model disagreements

  3. Surface Data Rows where the model is most confident. To do so, you can sort Data Rows by decreasing order of confidence. This assumes you have uploaded model confidence, as a custom Scalar metric, to the Model Run. Predictions that have low metrics (e.g. IOU) and high model confidence tend to correspond to labeling mistakes.
  4. Then, you can manually inspect these surfaced Data Rows in detail. It is common for machine learning teams to manually inspect hundreds of Data Rows, to capture as many label errors as possible. To do so, click on the thumbnails. This opens the Data Row in the Detailed view. Model predictions appear in red while ground truth annotations appear in green.

In our example, the first (surfaced) image we inspect contains two labeling errors:

  • a plant in the bottom left is captured by the model, but its label is missing
  • two plants on the left were labeled with a single bounding box

Even though our model is not perfect (e.g. the model fails to predict a plant at the top), it is still helpful for finding labeling errors.

  5. Now that you have surfaced label mistakes, you can select them and open them in the Catalog by clicking on View in Catalog. From the Catalog, you can send these Data Rows for labeling rework.
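To see why the low-IOU filter in the steps above catches a mistake such as two objects labeled with a single bounding box, here is a minimal sketch of the IOU (intersection over union) computation for axis-aligned boxes. The `box_iou` function, the `(left, top, right, bottom)` box format, and the coordinates are assumptions made for illustration; they are not taken from Labelbox.

```python
def box_iou(a, b):
    """IOU of two axis-aligned boxes given as (left, top, right, bottom)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical coordinates: the model predicts one plant, while the
# ground truth label is a single box drawn around two plants.
predicted_plant = (0, 0, 10, 10)
merged_label = (0, 0, 25, 10)

print(box_iou(predicted_plant, merged_label))  # 100 / 250 = 0.4
```

With these made-up boxes the IOU is 0.4, so a filter such as "IOU between 0 and 0.5" would keep this Data Row even though the prediction itself is reasonable.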

Use embeddings to find label errors

The Projector view is a powerful way to find labeling mistakes.

In the Projector view, you can:

  • Click on any point to preview the corresponding Data Row
  • Select a region of the screen, to preview all Data Rows inside it

By coloring the Projector view by class, you might notice suspicious points. For instance, a Data Row containing the basketball_court annotation, in the middle of the ground_track_field cluster, is likely to be a labeling mistake.

This label seems out of distribution. Click on it to inspect it and check if it's a labeling mistake.
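The visual check described above can also be approximated programmatically. The sketch below is an assumption-laden illustration, not a Labelbox feature: for each point it looks at the k nearest neighbors in embedding space and reports the fraction whose class differs. The `knn_label_disagreement` function and the 2D sample points are hypothetical, and real embeddings would have many more dimensions.

```python
import math


def knn_label_disagreement(embeddings, labels, k=3):
    """For each point, the fraction of its k nearest neighbors with a different label."""
    scores = []
    for i, point in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        others.sort(key=lambda j: math.dist(point, embeddings[j]))
        nearest = others[:k]
        if not nearest:
            scores.append(0.0)
            continue
        differing = sum(labels[j] != labels[i] for j in nearest)
        scores.append(differing / len(nearest))
    return scores


# Made-up 2D embeddings: two well-separated clusters.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # ground_track_field cluster
    (0.05, 0.05),                                    # sits inside that cluster
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # basketball_court cluster
]
labels = (
    ["ground_track_field"] * 4
    + ["basketball_court"]      # suspicious: surrounded by the other class
    + ["basketball_court"] * 4
)

scores = knn_label_disagreement(embeddings, labels, k=3)
most_suspicious = max(range(len(scores)), key=scores.__getitem__)
print(most_suspicious, scores[most_suspicious])
```

A score close to 1 means the point is surrounded by another class, which makes it a good candidate for manual inspection. It does not prove the label is wrong.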
