Quality assurance

Tools and workflows to achieve desired data quality

Labelbox offers modern tools and proven workflows that you can configure to achieve your desired data quality while keeping human supervision costs low. In Labelbox, data and its associated labels can be edited, reviewed, and re-used anytime, enabling the machine learning team to rapidly iterate.

Available tools

Workflows
Every project has one highly customizable review workflow. A workflow is made up of a series of tasks, each serving a specialized purpose in the overall review flow. For example, if you want every data row that contains a stop sign annotation to be reviewed, you can add a task with filtering criteria so that only data rows with stop sign annotations are admitted to it. You can also create a task for reviewing data rows labeled by a particular labeler or group of labelers. Each step in the workflow is configurable, giving you maximum flexibility in building your review pipeline.
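Conceptually, a workflow is an ordered list of tasks, each with admission criteria that decide which data rows enter it. The sketch below illustrates that idea only; the names and data shapes are hypothetical and are not the Labelbox API:

```python
# Illustrative sketch of workflow tasks with admission filters.
# All names and row fields here are hypothetical, not Labelbox SDK objects.

def has_annotation(class_name):
    """Admit data rows that contain an annotation of the given class."""
    return lambda row: class_name in row["annotation_classes"]

def labeled_by(*labelers):
    """Admit data rows labeled by any of the given labelers."""
    return lambda row: row["labeled_by"] in labelers

# A workflow: ordered tasks, each paired with its filtering criteria.
workflow = [
    ("Review stop signs", has_annotation("stop_sign")),
    ("Review new labelers", labeled_by("alice", "bob")),
]

rows = [
    {"id": 1, "annotation_classes": {"stop_sign", "car"}, "labeled_by": "carol"},
    {"id": 2, "annotation_classes": {"car"}, "labeled_by": "alice"},
]

for name, admits in workflow:
    admitted = [r["id"] for r in rows if admits(r)]
    print(name, admitted)  # row 1 enters the first task, row 2 the second
```

The point of the sketch is that each task's filter is independent, so tasks can be added, removed, or retargeted without disturbing the rest of the pipeline.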

Issues & comments
Labeling data is an inherently collaborative process that requires continuous feedback between labelers, reviewers, and the machine learning team to ensure high-quality outcomes. In the Labelbox Editor, you can facilitate this collaboration throughout the labeling and reviewing process by creating an issue on the asset and opening it up for discussion in the comments section.

Benchmarks
The Benchmarks tool allows you to designate a labeled asset as a “gold standard” and automatically compare all other annotations on that asset to the benchmark.
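To make the comparison concrete, here is a minimal sketch of scoring a labeler's bounding boxes against a gold-standard label using intersection-over-union. This is an illustration of the general technique, not how Labelbox computes its benchmark scores:

```python
# Illustrative benchmark scoring via bounding-box IoU.
# Not the Labelbox scoring implementation; boxes are (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def benchmark_score(benchmark_boxes, labeler_boxes):
    """Average best-match IoU of the benchmark boxes against a labeler's boxes."""
    if not benchmark_boxes:
        return 0.0
    return sum(max((iou(g, b) for b in labeler_boxes), default=0.0)
               for g in benchmark_boxes) / len(benchmark_boxes)

gold = [(0, 0, 10, 10)]
print(benchmark_score(gold, [(0, 0, 10, 10)]))  # exact match → 1.0
print(benchmark_score(gold, [(5, 5, 15, 15)]))  # partial overlap, ≈ 0.14
```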

Consensus
The Consensus tool automatically compares each labeler's annotations on a given asset against every other labeler's annotations on that asset. Consensus works in real time, so you can take immediate corrective action to improve your training data and model performance.
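A simple way to picture consensus is mean pairwise agreement across labelers. The sketch below shows that idea for classification labels; it is a conceptual illustration, not the agreement formula Labelbox uses:

```python
# Illustrative consensus scoring: mean pairwise agreement across labelers.
# Not the Labelbox agreement calculation.
from itertools import combinations

def consensus_score(labels):
    """Fraction of labeler pairs that assigned the same label."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 1.0  # a single labeler trivially agrees with itself
    return sum(a == b for a, b in pairs) / len(pairs)

# Three labelers classified the same asset; one of three pairs agrees.
print(consensus_score(["stop_sign", "stop_sign", "yield_sign"]))
```

A low score flags assets where labelers disagree, which is exactly where review effort is best spent.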

Review step (legacy)
After the first labels are created, a review team can approve, reject, or correct them.

How to choose the right configuration

Labelbox recommends starting simple: configure a basic review process and use Issues & comments. Familiarize yourself with the data, discover its edge cases, and educate the labeling team by iterating on the labeling instructions.

Benchmarks and Consensus are particularly useful in highly specialized and subjective labeling tasks.