You can think of the Catalog as a training data warehouse where you can use filters to browse, curate, and develop insights across all of your labeled and unlabeled data in Labelbox. Prior to the Catalog, your view of your Data Rows was restricted by the Datasets they belonged to (i.e., you could only view one Dataset at a time). The Catalog removes that barrier, allowing you to browse your Data Rows across all Datasets in one place.

The Catalog is particularly helpful if you need a global view of your Data Rows (with or without annotations) in order to get a more holistic understanding of your training data. Use the filters to determine which Data Rows you want to select for your next labeling project.


All the data you upload or create in Labelbox is available in the Catalog. You can filter and explore that data using the following features:

  1. Annotations
  2. Dataset
  3. Metadata
  4. Media attributes
  5. Similarity


Selecting the right data to label is one of the most important parts of any machine learning workflow. Using the Catalog, you can select data to send to Projects for labeling. First, use filters and similarity search to find the best data. Once you have found the data of interest, select those Data Rows and send them to a Project for labeling. See the docs on Batch Queues to learn more about this feature.
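
The filter-then-select workflow described above can be sketched in plain Python. This is only a conceptual model under stated assumptions: `DataRow`, `filter_rows`, and the sample values are hypothetical names invented for illustration and are not part of the Labelbox SDK or the Catalog UI. It shows how combining a media attribute filter, a metadata filter, and an annotation filter across several Datasets narrows the Catalog down to the Data Rows you might send to a Project as a batch.

```python
from dataclasses import dataclass, field


@dataclass
class DataRow:
    """Hypothetical in-memory stand-in for a Data Row (not the Labelbox SDK type)."""
    id: str
    dataset: str
    media_type: str
    metadata: dict = field(default_factory=dict)
    annotations: tuple = ()


def filter_rows(rows, dataset=None, media_type=None, metadata=None, labeled=None):
    """Return the rows that match every filter that is not None."""
    selected = []
    for row in rows:
        if dataset is not None and row.dataset != dataset:
            continue
        if media_type is not None and row.media_type != media_type:
            continue
        # Every requested metadata key must be present with the requested value.
        if metadata is not None and any(
            row.metadata.get(key) != value for key, value in metadata.items()
        ):
            continue
        if labeled is not None and bool(row.annotations) != labeled:
            continue
        selected.append(row)
    return selected


# Made-up sample data spanning two Datasets.
catalog = [
    DataRow("dr-1", "street-scenes", "image", {"split": "train"}, ("car",)),
    DataRow("dr-2", "street-scenes", "image", {"split": "train"}),
    DataRow("dr-3", "dashcam", "video", {"split": "train"}),
    DataRow("dr-4", "dashcam", "image", {"split": "test"}),
    DataRow("dr-5", "dashcam", "image", {"split": "train"}),
]

# Unlabeled training images from any Dataset: candidates for the next batch.
to_label = filter_rows(
    catalog, media_type="image", metadata={"split": "train"}, labeled=False
)
batch_ids = [row.id for row in to_label]
print(batch_ids)
```

Because no Dataset filter is applied here, the selection spans both Datasets, which is the cross-Dataset view the Catalog is meant to provide.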
