Key definitions

A glossary of key terms used throughout the Labelbox platform.

These terms appear throughout the Labelbox platform, including the app, the SDK, and the docs.

A, B

Annotation

A human-made or computer-generated label on an asset. Annotations can be imported (as ground truth or pre-labels) or they can be created manually in the Labelbox editor.

Annotations are categorized as objects (such as bounding boxes or polygons) or classifications (such as radio or checklist).

Attachments

Supplementary information you can attach to an asset in order to provide contextual information for your labeling team.

When viewing data rows in detail view, attachments appear on a separate side panel.

For SDK help, see Attachments (SDK).

Asset

Assets (or _data assets_) are individual files to be labeled, such as an image, a video, or a text file. Assets can be hosted in a cloud bucket, uploaded from a local file location, or copied from a remote data source.

Batch (batch mode)

A method for selecting data rows from Catalog and sending them to a labeling project. Sending batches of data rows to a labeling project is an alternative to attaching an entire dataset to a project.

Benchmark

The Benchmark tool lets you designate a labeled asset as a “gold standard” and automatically compare all other labels on that asset to the benchmark label.

Boost

Boost is a service that helps enterprise customers scale up machine learning (ML) operations. Boost includes a variety of professional services and software assistance, including a labeling workforce.

C, D

Catalog

An organization-wide platform for curating and exploring your unstructured data. Catalog enables you to easily browse, curate, and develop insights across all labeled and unlabeled data rows in your organization.

Consensus

The Consensus tool lets you compare labelers against each other by comparing annotations on a given asset.

Consensus works in real time, so you can take immediate corrective action to improve team and model performance.
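To make the idea of comparing annotations concrete, here is a minimal sketch of one common agreement measure for bounding boxes, intersection over union (IoU). This is an illustration only: the glossary does not say how Consensus scores agreement, and the `Box` structure and `iou` function are hypothetical names, not part of the Labelbox SDK.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical structure for illustration)."""
    left: float
    top: float
    width: float
    height: float


def iou(a: Box, b: Box) -> float:
    """Return intersection over union of two boxes, a value in [0, 1]."""
    # Corners of the overlapping rectangle, if any.
    x1 = max(a.left, b.left)
    y1 = max(a.top, b.top)
    x2 = min(a.left + a.width, b.left + b.width)
    y2 = min(a.top + a.height, b.top + b.height)
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0


# Two labelers drew boxes around the same object on the same asset.
labeler_a = Box(left=10, top=10, width=100, height=100)
labeler_b = Box(left=60, top=10, width=100, height=100)
agreement = iou(labeler_a, labeler_b)
print(f"agreement: {agreement:.2f}")
```

A score near 1 means the two labelers drew nearly identical boxes; a score near 0 means they barely overlap.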

Data row

A data row represents an individual data asset, along with its annotations and associated attributes (such as global ID). These attributes can include:

  • URL to your cloud-hosted file
  • Metadata
  • Media attributes (e.g., data type, size, etc.)
  • Attachments (files that provide context for your labelers)
  • Predictions
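As a rough mental model, the items listed above can be pictured as fields on a single record. The sketch below uses a plain Python dataclass; every field name and URL is a hypothetical choice made for illustration and does not reflect the Labelbox SDK or its import format.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataRow:
    """Illustrative model of a data row; all field names are hypothetical."""
    global_id: str
    url: str  # URL to the cloud-hosted file
    metadata: dict[str, Any] = field(default_factory=dict)
    media_attributes: dict[str, Any] = field(default_factory=dict)
    attachments: list[str] = field(default_factory=list)
    predictions: list[dict[str, Any]] = field(default_factory=list)
    annotations: list[dict[str, Any]] = field(default_factory=list)


row = DataRow(
    global_id="row-001",
    url="https://example.com/images/cat.jpg",
    metadata={"captured_by": "camera-7"},
    media_attributes={"mimeType": "image/jpeg", "width": 640, "height": 480},
    attachments=["https://example.com/docs/labeling-guide.pdf"],
)
print(row.global_id, row.media_attributes["mimeType"])
```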

Data split

You can split the selected data rows into train, validation, and test splits to prepare for model training and evaluation.
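The following is a minimal sketch of what a train/validation/test split amounts to, written in plain Python. The function name, the 80/10/10 proportions, and the row IDs are assumptions for illustration; the glossary does not specify how Labelbox assigns splits.

```python
import random


def split_data_rows(row_ids, train=0.8, validation=0.1, seed=0):
    """Shuffle IDs and partition them; whatever remains goes to the test split."""
    ids = list(row_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * validation)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


splits = split_data_rows([f"row-{i}" for i in range(100)])
print({name: len(ids) for name, ids in splits.items()})
```

Because each ID lands in exactly one split, no data row used for training leaks into validation or test.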

Data type

The type of a data row, such as image (JPG/PNG), video (MP4), or text (.txt files). For a list of supported data types, see Data types & import specs.

Dataset

Datasets are containers for data rows; they collect a set of related data assets.

E - L

Editor

The labeling interface you can use to create, review, and edit annotations.

When creating a new project, you're prompted to configure the editor, which defines the data type and the interface used while labeling.

Feature

A feature is the master definition of what you want the model to predict. It is also the blueprint for your ground truth.

Ontologies consist of features, which include objects (for example, bounding boxes) and classifications (for example, radio buttons). Features can have multiple, nested classifications.

Ground truth

A ground truth is information that is known to be real or true, as supported by direct observation and measurement. Labels made by humans are considered to be empirical ground truths, as opposed to labels added through model inference.

Label

A collection of all annotations on a data row. For example, if a data row had multiple bounding boxes, polylines, and radio classifications, the entire collection would be considered the "label."

M - O

Media attributes

When you upload data assets, Labelbox automatically computes media attributes appropriate for the data type and stores their values as part of the data row. Examples include mimeType, width, height, and codec. For details, see Media attributes.

Metadata

Metadata is non-annotation information about the asset to be labeled. There are two types of metadata: reserved keys (which cannot be changed) and custom (user-defined). Metadata helps search and filter data rows.
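As an illustration of how metadata supports filtering, the sketch below selects data rows whose metadata matches a set of key/value pairs. The row structure, the metadata keys, and the `filter_by_metadata` helper are all hypothetical; this is not how Catalog or the SDK implements search.

```python
# Hypothetical data rows with custom (user-defined) metadata.
rows = [
    {"id": "row-1", "metadata": {"camera": "front", "city": "Austin"}},
    {"id": "row-2", "metadata": {"camera": "rear", "city": "Austin"}},
    {"id": "row-3", "metadata": {"camera": "front", "city": "Denver"}},
]


def filter_by_metadata(rows, **criteria):
    """Keep rows whose metadata matches every key/value pair in criteria."""
    return [
        row for row in rows
        if all(row["metadata"].get(key) == value for key, value in criteria.items())
    ]


front_austin = filter_by_metadata(rows, camera="front", city="Austin")
print([row["id"] for row in front_austin])
```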

Model

A Model is a directory where you can create, manage, and compare a set of Model Runs related to the same machine learning task. Each Model is specified by an ontology of data: it defines the machine learning task of the Model Runs inside the directory.

Model run

A model run is a model training experiment within a model directory. Each model run keeps a versioned snapshot of its data (data rows, annotations, and data splits). You can upload predictions to a model run and compare its results and performance against other model runs in the model directory.

Nested classification

A classification-type annotation that is nested within an object-type annotation (as opposed to a global classification).
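The difference between a nested and a global classification is easiest to see in a data structure. The sketch below uses a simplified, hypothetical label layout (it is not the Labelbox export format) in which a nested classification lives inside an object and a global classification sits at the top level.

```python
# Hypothetical, simplified label structure for illustration only.
label = {
    "objects": [
        {
            "tool": "bounding_box",
            "name": "car",
            # Nested classification: describes this object only.
            "classifications": [{"name": "color", "answer": "red"}],
        }
    ],
    # Global classification: describes the asset as a whole.
    "classifications": [{"name": "weather", "answer": "sunny"}],
}


def nested_answers(label):
    """Collect (object name, question, answer) for every nested classification."""
    return [
        (obj["name"], cls["name"], cls["answer"])
        for obj in label["objects"]
        for cls in obj.get("classifications", [])
    ]


print(nested_answers(label))
```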

Ontology

A collection of features and their relationships (also known as a taxonomy).

Ontologies can be reused across different projects. Ontologies are essential for data labeling, model training, and evaluation.

When you label or review a data asset, the ontology appears in the Tools panel.

P - Z

Prediction

Output from your machine learning model that you can add to a data row to serve as a template for faster labeling.

Project

The labeling environment in Labelbox, like a factory assembly line for producing labels. A project can start with raw data, pre-existing ground truth, or pre-labeled data.

Queue

Labelbox has three queues to help move data rows through the labeling and review workflow: the batches queue, the labeling queue, and the review tasks queue.

Schema

The schema is the master blueprint for your training data and includes ontologies, features, and metadata.

Template

If a data row needs to be relabeled, you can delete the annotations and then select existing annotations to use as a template for the next data row displayed in the editor.

This allows you to curate a set of annotations, rather than start from scratch for each data row.

Workflow

A workflow is a queue for labeling and reviewing assets within a project. Workflows provide granular control over data row reviews. They are highly customizable and help define a step-by-step pipeline that makes the process more efficient and more accurate.

Workspace

Enables admins at large organizations to manage multiple instances of Labelbox with the same subscription account.