Key definitions

A glossary of key terms used throughout the Labelbox platform.

Below are the key definitions that you will see in the product, API, and the docs.

Term | Definition
Annotation | A human-made or computer-generated label on an asset. Annotations can be imported (as ground truth or pre-labels) or created manually in the Labelbox editor. Annotations are categorized as Objects (e.g., bounding box, polygon) or Classifications (e.g., radio, checklist).
Attachments | Supplementary information you can attach to an asset to give your labeling team context. Attachments appear in a separate side panel. See "Add attachments".
Asset | A single cloud-hosted file to be labeled (e.g., an image, a video, or a text file).
Batches/batch mode | A method for selecting data rows from Catalog and sending them to a labeling project. Sending batches of data rows to a labeling project is an alternative to attaching an entire dataset to a project.
Benchmarks | The Benchmarks tool enables you to designate a labeled asset as a “gold standard” and automatically compare all other labels on that asset to the benchmark label.
Boost | Boost is a service that helps enterprise customers scale up their AI/ML operations. Boost includes a variety of professional services and software assistance, including a labeling workforce.
Catalog | An organization-wide platform for curating and exploring your unstructured data. Catalog enables you to easily browse, curate, and develop insights across all labeled and unlabeled data rows in your organization.
Consensus | The Consensus tool automatically compares labelers against each other by comparing their annotations on a given asset. Consensus works in real time, so you can take immediate corrective action to improve team and model performance.
Data row | The container that houses all of the following information for a single Asset:
- URL to your cloud-hosted file
- Metadata
- Media attributes (e.g., data type, size, etc.)
- Attachments (files that provide context for your labelers)
Data split | A division of the selected data rows into train, validation, and test splits to prepare for model training and evaluation.
Data type | The type of a Data Row, such as image (JPG/PNG), video (MP4), or text (.txt files).
Dataset | A set of Data Rows that you add to Labelbox for labeling.
Editor | The labeling interface you can use to create, review, and edit annotations. When you create a project, you will be prompted to configure your Editor (e.g., select an ontology and add labeling instructions).
Feature | A feature is the master definition of what you want the model to predict. It is also the blueprint for your ground truth. An ontology is made up of a set of features.
There are two kinds of features: objects (e.g., Bounding box) and classifications (e.g., Radio). A feature can have multiple deeply nested sub-classifications.
Ground truth | Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e., empirical evidence such as labels made by humans), as opposed to information provided by inference.
Label | A collection of all annotations on a Data Row. For example, all Bounding boxes, Polylines, and Radio classifications on an image would be considered the "Label".
Media attributes | Upon upload, Labelbox automatically computes media attributes for each Data Row. They include useful information such as mimeType, width, height, and codec.
Metadata | Metadata is non-annotation information about the asset to be labeled. There are two types of metadata: reserved (keys that users cannot change) and custom (user-defined). Metadata is useful for searching and filtering across your Data Rows in Labelbox.
Model | A Model is a directory where you can create, manage, and compare a set of Model Runs related to the same machine learning task. Each Model is specified by an ontology, which defines the machine learning task of the Model Runs inside the directory.
Model run | A Model Run is a model training experiment within a Model directory. Each Model Run has its data snapshot (data rows, annotations, and data splits) versioned. You can upload predictions to a Model Run, and compare its performance against other Model Runs in the Model directory.
Nested classification | A classification-type annotation that is nested within an object-type annotation (as opposed to a global classification).
Ontology | A collection of Features and their relationships (also known as a taxonomy). Ontologies can be reused across different projects and are essential for data labeling, model training, and evaluation. When you are in the Editor, the ontology is what appears in the "Tools" panel.
Prediction | Output from your ML model that you can add to a Data Row to serve as a template for faster labeling.
Project | The labeling environment in Labelbox, like a factory assembly line for producing labels. A project can start from raw data, pre-existing ground truth, or pre-labeled data.
Queue | Labelbox has three queues for moving data rows through the labeling and QA pipeline: the Batches queue, the Labeling queue, and the review tasks queue.
Schema | Nearly everything in Labelbox is strongly typed. The schema is the master blueprint for your training data. It contains Ontologies, Features, and Metadata.
Template | If you have a Data Row that needs to be relabeled, you can delete the annotations and choose to keep the existing annotations as a template the next time the Data Row appears in the Editor. This allows you to correct a set of annotations rather than starting from scratch.
Workflow | A workflow refers to the queue system for labeling and reviewing assets within a project. Workflows give you more granular control over how your data rows get reviewed by providing a highly customizable, step-by-step review pipeline to drive efficiency and automation into your review process.
Workspace | A feature that enables admins at large organizations to manage multiple instances of Labelbox with the same login.
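
To make the relationships between several of these terms concrete, the following is a minimal sketch in plain Python. It is not the Labelbox SDK: every class and field name here (DataRow, Feature, Ontology, Annotation, Label, asset_url, and so on) is a hypothetical stand-in chosen to mirror the glossary, and the example URL is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Illustrative types only. These mirror the glossary terms and are NOT
# the classes exposed by the Labelbox SDK.

@dataclass
class DataRow:
    """Container for the information about a single asset."""
    asset_url: str                                   # URL to the cloud-hosted file
    metadata: Dict[str, Any] = field(default_factory=dict)
    media_attributes: Dict[str, Any] = field(default_factory=dict)
    attachments: List[str] = field(default_factory=list)


@dataclass
class Feature:
    """One entry in an ontology: an object or a classification."""
    name: str
    kind: str                                        # "object" or "classification"
    sub_classifications: List["Feature"] = field(default_factory=list)


@dataclass
class Ontology:
    """A collection of features, reusable across projects."""
    features: List[Feature]


@dataclass
class Annotation:
    """A single label value tied to one feature of the ontology."""
    feature: Feature
    value: Dict[str, Any]


@dataclass
class Label:
    """All annotations on one data row."""
    data_row: DataRow
    annotations: List[Annotation]


# An object feature with a nested classification, plus a global classification.
car = Feature("car", "object", [Feature("color", "classification")])
weather = Feature("weather", "classification")
ontology = Ontology([car, weather])

row = DataRow("https://example.com/image-001.jpg", metadata={"split": "train"})
label = Label(row, [
    Annotation(car, {"left": 10, "top": 20, "width": 50, "height": 30}),
    Annotation(weather, {"answer": "sunny"}),
])

# Names of the object-type annotations in this label.
object_names = [a.feature.name for a in label.annotations if a.feature.kind == "object"]
```

In this sketch the nested "color" classification lives inside the "car" object feature, while "weather" is a global classification that applies to the whole asset.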