Functions (beta)

Programmatically label data to curate datasets

🚧

Beta feature

Functions are an experimental feature and may change before general availability (GA).

Overview

Functions are used to programmatically label data. While functions are typically less accurate than models or human annotators, they can quickly enrich data to help with curation and exploration.

Similarity functions

Similarity functions allow you to label data using embeddings. You can use functions to find rare classes or patterns not described by metadata.

Create

A function is defined by a selection of data rows with embeddings. To create a new function, click Find Similar Data Rows in Catalog. The currently selected data rows and the chosen embedding metadata field define the function. Once the function is created, a background task scores all data rows that share the embedding.

Creating a new function from Catalog
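
This page does not say how the similarity score is computed. As a rough illustration only, the sketch below assumes cosine similarity between each data row's embedding and the mean embedding of the selected data rows. The names `cosine_similarity`, `mean_vector`, and `score_data_rows` are hypothetical and not part of any product API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def score_data_rows(selected_embeddings, all_embeddings):
    """Score every data row against the selection.

    ASSUMPTION: the function is summarized by the mean of the selected
    embeddings and scored with cosine similarity. The real scoring
    method is not documented on this page.
    """
    anchor = mean_vector(selected_embeddings)
    return {
        row_id: cosine_similarity(anchor, embedding)
        for row_id, embedding in all_embeddings.items()
    }


# Hypothetical usage: two selected rows define the function,
# then every row sharing the embedding field is scored.
selected = [[1.0, 0.0], [1.0, 0.0]]
catalog = {"row-a": [2.0, 0.0], "row-b": [0.0, 1.0]}
scores = score_data_rows(selected, catalog)
```

Under this assumption, rows whose embeddings point in the same direction as the selection score near 1, and unrelated rows score near 0.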

Filter

You can filter data rows by the similarity score computed by your functions. Scores closer to 1 indicate greater similarity. You can mix function queries with all other filter types in Catalog. To sample data rows for labeling from Catalog, use Batch queues.

Filter Catalog with functions
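
The filtering described above can be pictured as a score threshold combined with other conditions, followed by sampling rows into a batch. The sketch below is a local, in-memory illustration under assumed data shapes. The record fields (`id`, `score`, `dataset`) and the helpers `filter_by_score` and `sample_batch` are hypothetical, and the code does not call Catalog or Batch queues.

```python
import random


def filter_by_score(rows, min_score, extra_predicate=None):
    """Keep rows whose function score is at least min_score.

    extra_predicate stands in for any other filter type that is
    mixed with the function query (hypothetical helper).
    """
    kept = [row for row in rows if row["score"] >= min_score]
    if extra_predicate is not None:
        kept = [row for row in kept if extra_predicate(row)]
    return kept


def sample_batch(rows, size, seed=0):
    """Randomly sample up to `size` rows, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(rows, min(size, len(rows)))


# Hypothetical records: one similarity score per data row.
rows = [
    {"id": "row-a", "score": 0.95, "dataset": "x"},
    {"id": "row-b", "score": 0.40, "dataset": "x"},
    {"id": "row-c", "score": 0.90, "dataset": "y"},
]

similar = filter_by_score(rows, min_score=0.8)
similar_in_x = filter_by_score(
    rows, min_score=0.8, extra_predicate=lambda row: row["dataset"] == "x"
)
batch = sample_batch(similar, size=5)
```

The threshold of 0.8 is arbitrary; a suitable cutoff depends on the embedding and on how the selection was made.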

Limits

Tier          Limit
Free          3
Education     10
Pro           20
Enterprise    50

