Key definitions

Below are the key definitions that you will see in the product, the API, and the docs.

Term

Definition

Annotation

An instance of a Feature. Annotations can be imported as ground truth or model predictions, or created in the Labelbox Editor. Annotations are categorized as Objects (e.g., bounding box, polygon) or Classifications (e.g., radio, checklist).

Attachments

Supplementary information you can attach to an asset to give labelers context during labeling. See "Add attachments".

Asset

A single cloud-hosted file to be labeled (e.g., an image, a video, or a text file).

Batches/Batch mode

A method for sending individual Data Rows to a Project for labeling. This is an alternative to attaching a Dataset to a project.

Boost

Boost is a Labelbox service that helps enterprises scale up their AI/ML operations. It includes a variety of professional services and software assistance.

Catalog

A training data warehouse containing all of the Data Rows within an organization. Customers can use filters to browse, curate, and develop insights across all labeled and unlabeled data in Labelbox.

Data Row

The container that houses all of the following information for a single Asset:

  • URL to your cloud-hosted file
  • Metadata
  • Media attributes (e.g., data type, size)
  • Attachments (files that provide context for your labelers)
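
As a rough illustration of how these four pieces sit together, here is a minimal sketch of a Data Row as a plain Python container. The class name `DataRowSketch`, its field names, and the example URLs are hypothetical and chosen for this page only; this is not the Labelbox SDK's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataRowSketch:
    """Hypothetical container for illustration; not a Labelbox SDK class."""
    row_data_url: str  # URL to the cloud-hosted file (the Asset)
    metadata: Dict[str, str] = field(default_factory=dict)
    media_attributes: Dict[str, object] = field(default_factory=dict)
    attachments: List[str] = field(default_factory=list)  # context for labelers


# Example values are made up for illustration.
row = DataRowSketch(
    row_data_url="https://example.com/images/cat.jpg",
    metadata={"source": "camera-3"},
    media_attributes={"mimeType": "image/jpeg", "width": 1024, "height": 768},
    attachments=["https://example.com/docs/labeling-notes.pdf"],
)
```

Only the URL is required in this sketch; the other fields default to empty, which mirrors the idea that a Data Row always points at exactly one Asset.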

Data type

The type of a Data Row, such as image (JPG/PNG), video (MP4), or text (.txt files).

Dataset

A set of Data Rows that you add to Labelbox for labeling.

Editor

The labeling interface you can use to create, review, and edit annotations. When you create a project, you will be prompted to configure your Editor (e.g., select an ontology and add labeling instructions).

Feature

A feature is the master definition of what you want the model to predict. It is also the blueprint for your ground truth. An ontology is made up of a set of features.
There are two kinds of features: objects (e.g., Bounding box) and classifications (e.g., Radio). A feature can have multiple deeply nested sub-classifications.
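
The nesting described above can be pictured as a small tree. The following is a minimal sketch under assumed names (`FeatureSketch`, `kind`, `tool`, and `nesting_depth` are hypothetical and not part of any Labelbox API); it only shows how an object feature might carry nested sub-classifications and how an ontology is then a set of such features.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureSketch:
    """Hypothetical feature definition; for illustration only."""
    name: str
    kind: str  # "object" or "classification"
    tool: str  # e.g. "bounding_box" or "radio"
    sub_classifications: List["FeatureSketch"] = field(default_factory=list)


def nesting_depth(feature: FeatureSketch) -> int:
    """Count the levels in a feature, including the feature itself."""
    child_depths = (nesting_depth(c) for c in feature.sub_classifications)
    return 1 + max(child_depths, default=0)


# An object feature with a classification nested two levels deep.
shade = FeatureSketch("shade", "classification", "radio")
color = FeatureSketch("color", "classification", "radio", [shade])
car = FeatureSketch("car", "object", "bounding_box", [color])

# In this sketch, an ontology is simply a collection of features.
ontology = [car, FeatureSketch("weather", "classification", "radio")]
```

Here `car` is an object with a `color` sub-classification, which in turn has a `shade` sub-classification, so `nesting_depth(car)` counts three levels.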

Ground truth

Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence such as labels made by humans) as opposed to information provided by inference.

Label

A collection of all annotations on a Data Row. For example, all Bounding boxes, Polylines, and Radio classifications on an image would be considered the "Label".

Mask

An image (in PNG format) that represents a segmentation annotation without the underlying asset.

Media attributes

Upon upload, Labelbox automatically computes media attributes for each Data Row. These include useful information such as mimeType, width, height, and codec.

Metadata

Metadata is non-annotation information about the asset to be labeled. There are two types of metadata: reserved (keys the user cannot change) and custom (user-defined). Metadata is useful for searching and filtering across your Data Rows in Labelbox.

Model

A Model is a directory where you can create, manage, and compare a set of Model Runs related to the same machine learning task. Each Model is specified by an ontology, which defines the machine learning task of the Model Runs inside the directory.

Model Diagnostics

A Labelbox product area that enables you to run experiments (Model Runs) on your machine learning models to analyze model performance across each Model Run.

Model Run

A Model Run is a model training experiment within a Model directory. Each Model Run keeps a versioned snapshot of its data (data rows, annotations, and data splits). You can upload predictions to a Model Run and compare its performance against other Model Runs in the Model directory.

Nested classification

A classification-type annotation that is nested within an object-type annotation (as opposed to a global classification).

Ontology

A collection of Features and their relationships (also known as a taxonomy). Ontologies can be reused across different projects and are essential for data labeling, model training, and evaluation. When you are in the Editor, the ontology is what appears in the "Tools" panel.

Prediction

Output from your ML model that you can add to a Data Row to serve as a template for faster labeling.

Project

The labeling environment in Labelbox, like a factory assembly line for producing labels. The initial state of the project can start with raw data, pre-existing ground truth, or pre-labeled data.

Queue

Labelbox has four queues for moving Data Rows through the labeling & QA pipeline: the Batches queue, the Labeling queue, the Review queue, and the Tasks queue.

Schema

Nearly everything in Labelbox is strongly typed. The schema is the master blueprint for your training data. It contains Ontologies, Features, and Metadata.

Data split

You can split the selected data rows into train, validation, and test splits to prepare for model training and evaluation.
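
To make the idea concrete, here is a minimal, generic sketch of splitting a list of Data Row identifiers into the three splits. The function name `split_data_rows`, the 80/10/10 default ratios, and the seeded shuffle are all assumptions for illustration; the source does not specify how Labelbox assigns splits.

```python
import random
from typing import Dict, List, Sequence


def split_data_rows(
    data_row_ids: Sequence[str],
    train: float = 0.8,       # assumed ratio, not a Labelbox default
    validation: float = 0.1,  # assumed ratio, not a Labelbox default
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle the IDs reproducibly and cut them into three disjoint splits."""
    ids = list(data_row_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * validation)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # whatever remains goes to test
    }


# Hypothetical identifiers for ten Data Rows.
splits = split_data_rows([f"row-{i}" for i in range(10)])
```

Using a fixed seed keeps the assignment reproducible, which matters when you compare experiments that are meant to share the same splits.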

Template

If a Data Row needs to be relabeled, you can delete its annotations and choose to keep the existing annotations as a template the next time the Data Row appears in the Editor. This lets you correct a set of annotations rather than start from scratch.

Workspaces

Workspaces enable admins at large organizations to manage multiple instances of Labelbox with the same login.

