Overview
The Catalog is a data curation tool for organizing, searching, visualizing, and exploring labeled and unlabeled data (including any metadata). Teams developing and operating production AI systems need a data catalog to select data for downstream data-centric workflows. These workflows include data labeling, model training, model evaluation, error analysis, and active learning.
Anatomy of the Catalog


Understanding data flow
Data can flow into and out of the Catalog in numerous ways. This diagram indicates how data can flow through the Catalog.


Data inflow
Catalog can ingest the following types of data, and makes it easy to search, visualize, and explore them in one place.
Data type | Overview |
---|---|
Data Rows | Imported when a dataset is created or when Data Rows are appended to an existing dataset. |
Custom metadata | Custom metadata fields are imported when a Data Row is created or updated. |
Media attributes | A special class of metadata that Labelbox pre-computes automatically when a Data Row is created or updated. Media attributes include file type, dimensions, and pre-computed embeddings. They are essential for getting the best experience with Labelbox. |
Ground truth annotations | Available in the Catalog once they are created in Annotate (see "Overview: Annotate data") or imported (see "Import annotations"). |
Model predictions | Not yet supported in Catalog. |
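To make the relationship between these data types concrete, the following minimal sketch models one Data Row together with its metadata, media attributes, and ground truth. The `CatalogRecord` class and all of its field names are hypothetical and exist only for illustration; they are not part of the Labelbox SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CatalogRecord:
    """Hypothetical stand-in for one Data Row as it appears in the Catalog.

    This is NOT a Labelbox SDK class; the field names are assumptions.
    """
    asset_url: str
    # Custom metadata supplied by the user at creation or update time.
    metadata: Dict[str, Any] = field(default_factory=dict)
    # Attributes computed by the platform (file type, dimensions, ...).
    media_attributes: Dict[str, Any] = field(default_factory=dict)
    # Ground truth annotations created in Annotate or imported.
    ground_truth: List[Dict[str, Any]] = field(default_factory=list)

    def is_labeled(self) -> bool:
        """A record counts as labeled once it has at least one annotation."""
        return len(self.ground_truth) > 0


# Example record with illustrative values only.
record = CatalogRecord(
    asset_url="https://example.com/images/0001.jpg",
    metadata={"split": "train"},
    media_attributes={"mime_type": "image/jpeg", "width": 1920, "height": 1080},
)
record.ground_truth.append({"name": "car", "kind": "bounding_box"})
```

Keeping metadata, media attributes, and ground truth as separate fields mirrors the table above: each arrives at a different point in the Data Row's lifecycle.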
Data outflow
Data can flow out of the Catalog in two ways.
Type | Overview |
---|---|
Batch | Create a batch of Data Rows and send it to Annotate for labeling. |
Export | Use the Python SDK to retrieve Data Row content (asset URL, media attributes, metadata). |
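The two outflow paths can be sketched in plain Python as follows. This is an illustrative model only: `select_rows`, `make_batch`, `export_rows`, and the dictionary layout of each row are assumptions made for this example, and none of them are Labelbox SDK calls. With the real SDK, the rows would come from your workspace rather than from an in-memory list.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

DataRow = Dict[str, Any]  # hypothetical row layout, not an SDK type


def select_rows(rows: Iterable[DataRow],
                predicate: Callable[[DataRow], bool]) -> List[DataRow]:
    """Keep only the rows that match a user-supplied filter."""
    return [row for row in rows if predicate(row)]


def make_batch(name: str, rows: Iterable[DataRow]) -> Dict[str, Any]:
    """Outflow 1: group selected Data Row IDs into a named batch for labeling."""
    return {"name": name, "data_row_ids": [row["id"] for row in rows]}


def export_rows(rows: Iterable[DataRow]) -> str:
    """Outflow 2: serialize asset URL, media attributes, and metadata as JSON."""
    payload = [
        {
            "asset_url": row["asset_url"],
            "media_attributes": row.get("media_attributes", {}),
            "metadata": row.get("metadata", {}),
        }
        for row in rows
    ]
    return json.dumps(payload, indent=2)


# Illustrative in-memory rows.
ROWS: List[DataRow] = [
    {"id": "dr-1", "asset_url": "https://example.com/a.jpg",
     "metadata": {"split": "train"}, "media_attributes": {"mime_type": "image/jpeg"}},
    {"id": "dr-2", "asset_url": "https://example.com/b.jpg",
     "metadata": {"split": "test"}},
    {"id": "dr-3", "asset_url": "https://example.com/c.jpg",
     "metadata": {"split": "train"}},
]

train_rows = select_rows(ROWS, lambda r: r.get("metadata", {}).get("split") == "train")
batch = make_batch("train-batch", train_rows)
exported = export_rows(train_rows)
```

Here the same filtered selection feeds both outflows: `batch` holds the IDs that would be sent for labeling, and `exported` holds the row content as JSON.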
Annotation previews
While all ground truth can be viewed in the Editor, the Catalog also lets you preview ground truth annotations, so you can derive insights and act on them faster. The table below lists the data types for which annotation previews are supported or planned.
Data type | Annotation preview |
---|---|
Image | Supported |
Text, Video, Geospatial tiled imagery, Documents, DICOM | Coming soon |
Getting started with Catalog
- Choose a data configuration: IAM Integration or Signed URLs