Filtering and sorting

Labelbox lets you slice and dice your data. Powerful filtering and sorting help you manage massive amounts of data, surface high-impact subsets, and remove unwanted data.


Supported attributes for search and filter

Here are the supported search and filter capabilities in Catalog.

| Attribute | Description | Example |
| --- | --- | --- |
| Annotation | Filter on annotations created on or uploaded to Labelbox | Show images where X was annotated |
| Predictions (coming soon) | Filter on predictions uploaded to model runs | Show images where a model detected X |
| Dataset | Filter on the dataset that data rows belong to | Show all images uploaded to dataset X |
| Metadata | Metadata fields uploaded by the user | The datetime an image was captured |
| Project status | Status with respect to the project | Data rows not submitted to a particular project |
| Similarity | Filter by a similarity function score | Use similarity to find data for labeling |
| Natural Language Search | Filter based on natural language | Use NL search to find all "photo of birds in grass fields" |
| Media attributes | Attributes of the data computed on upload; each media type has different fields | Media type (Image, Video, Text, ...), video duration |

Construct a filter

Think of a filter as a pyramid built from layers. Layers are combined with an AND operation; within a layer, conditions are combined with OR operations. Each filter displays a count of the data rows or annotations that match it. Only attribute values with non-zero counts are available for selection, and each count is shown as a hint.
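The AND-of-ORs structure can be sketched in plain Python. The data rows and field names below are hypothetical stand-ins, not Labelbox's API:

```python
# A filter is a list of layers. Layers are ANDed together;
# the conditions inside a layer are ORed together.
def matches(data_row, layers):
    return all(
        any(condition(data_row) for condition in layer)
        for layer in layers
    )

# Hypothetical data rows with a few attributes.
rows = [
    {"dataset": "SAR dataset (chipped)", "width": 512},
    {"dataset": "drone footage", "width": 128},
]

# Layer 1: dataset is one of two names (OR within the layer).
# Layer 2: image is at least 200 px wide (ANDed with layer 1).
layers = [
    [lambda r: r["dataset"] == "SAR dataset (chipped)",
     lambda r: r["dataset"] == "SAR dataset (raw)"],
    [lambda r: r["width"] >= 200],
]

selected = [r for r in rows if matches(r, layers)]
# Only the first row matches both layers.
```

Narrowing a filter means adding another layer; broadening one means adding another condition inside an existing layer.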

Here is a realistic example to help you understand filter construction.

An ML engineer is developing an AI model to identify vessels on synthetic aperture radar satellite imagery. The engineer learns that the model performs poorly on images containing coastlines. So the engineer queries for images that are at least 200px wide AND belong to the dataset named "SAR dataset (chipped)".

Then, the engineer queries for images that are similar to the images used to create a similarity function named "coastal images". The results are images with a coastline.

The engineer then tunes the function's threshold parameter to instead surface images that are dissimilar to the images used to create the "coastal images" function. The results are images without any coastline.
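One way to picture a similarity function's score and its tunable threshold is cosine similarity over embeddings. This is a toy sketch with made-up vectors; Labelbox computes its scores server-side from its own embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: `anchor` stands in for the "coastal images"
# used to create the function; candidates are rows being scored.
anchor = [1.0, 0.0]
candidates = {"coastline_chip": [0.9, 0.1], "open_sea_chip": [0.0, 1.0]}

scores = {name: cosine_similarity(anchor, vec)
          for name, vec in candidates.items()}

# Tuning the threshold flips the query between "similar" and
# "dissimilar" results.
similar = [n for n, s in scores.items() if s >= 0.8]
dissimilar = [n for n, s in scores.items() if s < 0.2]
```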

Then, the engineer queries for images that are not yet labeled (that is, images that do not contain an annotation named "ship").

Finally, the engineer can sample 100 random images from the results and submit the batch to a labeling project.
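The whole walkthrough can be sketched end to end over mock data, with each query step adding one AND layer and `random.sample` standing in for the final random batch. The row fields here are hypothetical, not Labelbox's API:

```python
import random

# Hypothetical data rows; annotation names stand in for labels
# already attached in Labelbox.
rows = [
    {"id": i,
     "width": 256,
     "dataset": "SAR dataset (chipped)",
     "annotations": ["ship"] if i % 3 == 0 else []}
    for i in range(300)
]

# Each step of the walkthrough adds one ANDed condition.
results = [
    r for r in rows
    if r["width"] >= 200                          # at least 200 px wide
    and r["dataset"] == "SAR dataset (chipped)"   # target dataset
    and "ship" not in r["annotations"]            # not yet labeled
]

# Sample 100 random matches to submit as a labeling batch.
batch = random.sample(results, 100)
```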