Consensus

Learn how to set up consensus scoring to analyze the quality of your labels.

Consensus measures agreement among the labelers in your workforce. Consensus agreement scores are calculated in real time for features and labels that have been annotated by multiple labelers. Whenever an annotation is created, updated, or deleted, the consensus score is recalculated for data rows with two or more labels.
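To build intuition for what an agreement score captures, the sketch below computes a pairwise intersection-over-union (IoU) score between two labelers' bounding boxes on the same object. This is an illustration only; the exact formulas Labelbox applies to each annotation type are not documented in this section.

```python
# Illustrative sketch: agreement for geometric annotations is typically IoU-based.
# The exact formula Labelbox uses is not reproduced here.

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two labelers annotate the same object on one data row.
labeler_1 = (10, 10, 110, 110)
labeler_2 = (20, 15, 115, 112)
print(f"Pairwise agreement: {iou(labeler_1, labeler_2):.2f}")  # ~0.80
```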

Supported types

Currently, consensus scoring is supported for the following asset and annotation types:

| Asset type | Bounding box | Polygon | Polyline | Point | Segmentation mask | Entity | Relationship | Radio | Checklist | Free-form text |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Image | ✓ | ✓ | ✓ | ✓ | ✓ | N/A | N/A | ✓ | ✓ | - |
| Video | ✓ | - | ✓ | ✓ | - | N/A | N/A | ✓ | ✓ | - |
| Text | N/A | N/A | N/A | N/A | N/A | ✓ | - | ✓ | ✓ | - |
| Chat | N/A | N/A | N/A | N/A | N/A | ✓ | - | ✓ | ✓ | - |
| Audio | N/A | N/A | N/A | N/A | N/A | N/A | N/A | ✓ | ✓ | - |
| Geospatial | ✓ | ✓ | ✓ | ✓ | ✓ | N/A | N/A | ✓ | ✓ | - |
| Documents | ✓ | N/A | N/A | N/A | N/A | ✓ | N/A | ✓ | ✓ | ✓ |
| HTML | N/A | N/A | N/A | N/A | N/A | N/A | N/A | ✓ | ✓ | ✓ |
| Human-generated responses | N/A | N/A | N/A | N/A | N/A | N/A | N/A | ✓ | ✓ | ✓ |

Set up consensus scoring

When adding data rows to an Annotate project, use the Queue batch option to enable consensus scoring and configure additional settings. You can't change these settings after submission.

| Consensus setting | Description |
| --- | --- |
| Data row priority | The position in the labeling queue where these data rows will be placed, based on priority. |
| % coverage | The percentage of data rows in the batch that enter the labeling queue as consensus data rows for multi-labeling. Defaults to 0. |
| # labels | The number of labels to collect for each consensus data row. Defaults to 2. Cannot exceed the number of labelers on the project. |

📘

Consensus calculation can take up to five minutes
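The same settings can also be supplied when creating a batch through the Labelbox Python SDK. The snippet below is a sketch only: `create_batch` and its `consensus_settings` argument (with `number_of_labels` and `coverage_percentage` keys) reflect recent SDK versions, and the API key, project ID, and data row IDs are placeholders, so check the SDK reference for your release before relying on the exact parameter names.

```python
import labelbox as lb

# Sketch only: parameter names reflect recent Labelbox Python SDK versions.
client = lb.Client(api_key="<YOUR_API_KEY>")
project = client.get_project("<PROJECT_ID>")

batch = project.create_batch(
    name="consensus-batch",
    data_rows=["<DATA_ROW_ID_1>", "<DATA_ROW_ID_2>"],
    priority=1,  # Data row priority: position in the labeling queue
    consensus_settings={
        "number_of_labels": 3,        # labels collected per consensus data row
        "coverage_percentage": 0.25,  # share of the batch queued for consensus
    },
)
print(batch.name)
```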

Select consensus winners

After a data row is labeled and enters the review stage, the first set of annotations submitted for it is treated as the consensus by default. Reviewers can reassign consensus to another set of annotations once the data row has more than one label.

If your data row has been labeled more than once, you can view all of the label entries for that data row in the data row browser. The following example shows a data row with two sets of labels. The green trophy icon indicates that the first set of annotations is considered the consensus.

To change the consensus winner, click the trophy icon next to the preferred set of annotations.

🚧

Recalculation of consensus agreement scores

The consensus score reflects agreement among labelers, so changing the winning label might lead to a recalculation of the score based on the new consensus.

📘

Set consensus winners as benchmark references

You can designate consensus winners as benchmarks. See Set up benchmarks.

Search and filter data using consensus scores

The Consensus agreement filter helps you find data rows that meet your criteria based on their consensus scores.

When using the filter, you can configure the following options:

  • Scope: Specify the type of agreement to measure:
    • Feature-level measures the agreement on a specific feature schema in the ontology for each data row. If you select this option, further specify one or more feature schemas in the ontology using the dropdown menu.
    • Label-level evaluates the overall agreement across all annotations within a single data row.
  • Calculation: Choose whether to calculate the agreement as an absolute or average score (see the sketch after this list).
  • Range (0-1): Set the score range from 0 to 1, where 0 indicates no agreement among annotators and 1 indicates complete agreement.
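To make the Scope and Calculation choices concrete, the sketch below shows one way feature-level scores could roll up into a label-level value under a simple equal-weight average. The feature names and the aggregation are illustrative assumptions, not Labelbox's documented formula.

```python
# Illustration only: hypothetical feature-level agreement scores for one data row.
feature_agreement = {
    "lesion_bounding_box": 0.82,
    "severity_radio": 1.00,
    "findings_checklist": 0.40,
}

# A label-level view aggregates across all annotations on the data row;
# here we assume a simple equal-weight average.
label_level = sum(feature_agreement.values()) / len(feature_agreement)
print(f"Label-level (average) agreement: {label_level:.2f}")  # 0.74

# A feature-level filter instead looks at one schema at a time, e.g. surfacing
# data rows where a specific feature's agreement falls below 0.5.
low_agreement = {k: v for k, v in feature_agreement.items() if v < 0.5}
print(low_agreement)  # {'findings_checklist': 0.4}
```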