How to upload custom embeddings
Custom embeddings improve similarity search, which in turn improves data exploration. You can upload up to 10 custom embedding types per workspace, on any data type. Use this to experiment with different embeddings and improve data selection.
Before you start
This example requires the following libraries:
Replace API key
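A minimal sketch of client setup, assuming the `labelbox` Python package is installed and exposes `lb.Client(api_key=...)`. Reading the key from an environment variable, and the variable name `LABELBOX_API_KEY`, are illustrative choices that do not come from this guide; pasting the key into a variable works equally well.

```python
import os


def make_client(env_var: str = "LABELBOX_API_KEY"):
    """Build a Labelbox client from an API key held in an environment variable.

    `env_var` is a hypothetical name chosen for this sketch.
    """
    api_key = os.environ.get(env_var, "")
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your Labelbox API key first.")
    # Imported lazily so the key check above runs even without the SDK installed.
    import labelbox as lb  # assumption: the SDK is installed as `labelbox`

    return lb.Client(api_key=api_key)
```

Keeping the key out of the script avoids committing it to version control by accident.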
Select data rows
First, we need to fetch data rows from a Labelbox dataset. To improve similarity search, you need to upload custom embeddings to at least 1,000 data rows.
Create custom embedding payload
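The data row fetch can be sketched as follows, since the payload is built from the resulting IDs. This assumes each data row object exposes its ID as `.uid`, and that the iterable comes from something like `client.get_dataset(DATASET_ID).data_rows()`; both are assumptions about the SDK, and `collect_data_row_ids` is a hypothetical helper.

```python
from itertools import islice

MIN_ROWS_FOR_SIMILARITY = 1000  # minimum row count stated in this guide


def collect_data_row_ids(data_rows, limit=None):
    """Return data row IDs from an iterable of data row objects.

    `limit` caps how many rows are read; None reads all of them.
    """
    ids = [row.uid for row in islice(data_rows, limit)]
    if len(ids) < MIN_ROWS_FOR_SIMILARITY:
        # Not an error, but similarity search may not improve with so few rows.
        print(f"Only {len(ids)} data rows collected; "
              f"{MIN_ROWS_FOR_SIMILARITY} or more are recommended.")
    return ids


# Hypothetical usage, assuming `client` is an authenticated Labelbox client:
# ids = collect_data_row_ids(client.get_dataset(DATASET_ID).data_rows())
```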
To prepare the data:
1. Generate random vectors for embeddings (maximum 2048 dimensions).
2. List the custom embeddings in your Labelbox workspace.
3. Choose an existing embedding type or create a new one. The method that creates a new embedding type requires a unique custom embedding name as an argument.
4. Create the payload.
   - The payload should contain the key (data row ID or global key) and the new embedding vector data. Note that the dataset.upsert_data_rows() operation only updates the values you pass in the payload; all other existing row data is left unchanged.
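The preparation steps can be sketched roughly as below. The helper names are hypothetical. The sketch assumes the client exposes `get_embeddings()` and `create_embedding(name, dims)`, and that each embedding object has `id` and `name` attributes. The `"key"`, `"embeddings"`, `"embedding_id"`, and `"vector"` field names are likewise assumptions about the upsert format, and the SDK may expect keys wrapped in a type such as `lb.UniqueId` or `lb.GlobalKey`.

```python
import random

MAX_DIMS = 2048  # maximum embedding dimensionality stated in this guide


def random_vectors(count, dims=MAX_DIMS, seed=None):
    """Step 1: generate `count` random vectors with `dims` floats in [0, 1)."""
    if not 1 <= dims <= MAX_DIMS:
        raise ValueError(f"dims must be between 1 and {MAX_DIMS}")
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dims)] for _ in range(count)]


def get_or_create_embedding(client, name, dims=MAX_DIMS):
    """Steps 2 and 3: reuse the embedding type called `name`, or create it."""
    for embedding in client.get_embeddings():  # assumed SDK call
        if embedding.name == name:
            return embedding
    return client.create_embedding(name, dims)  # assumed SDK call


def build_payload(keys, vectors, embedding_id):
    """Step 4: pair each data row key with its vector."""
    if len(keys) != len(vectors):
        raise ValueError("keys and vectors must have the same length")
    return [
        {
            "key": key,  # data row ID or global key (possibly SDK-wrapped)
            "embeddings": [{"embedding_id": embedding_id, "vector": vector}],
        }
        for key, vector in zip(keys, vectors)
    ]
```

Only the key and the embedding vector go into each payload entry, which matches the note that the upsert leaves all other row data unchanged.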
Upload payload
1. Upsert data rows with custom embeddings.
2. Get the count of imported vectors for a custom embedding type. An updated count can take a few minutes, depending on the number of data rows associated with the embedding type.
3. Delete the custom embedding type.
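A hedged sketch of the upload steps follows. It assumes `dataset.upsert_data_rows(payload)` returns a task with `wait_till_done()`, and that the embedding object offers `get_imported_vector_count()` and `delete()`. The helper names, timeout, and polling interval are illustrative. Polling is included because the count can lag by a few minutes.

```python
import time


def upsert_and_wait(dataset, payload):
    """Step 1: upsert the payload and block until the task finishes."""
    task = dataset.upsert_data_rows(payload)  # assumed to return a task
    task.wait_till_done()
    return task


def wait_for_vector_count(embedding, expected, timeout_s=600.0, poll_s=15.0):
    """Step 2: poll the imported vector count until it reaches `expected`.

    Returns the last observed count, which may be below `expected`
    if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        count = embedding.get_imported_vector_count()  # assumed SDK call
        if count >= expected or time.monotonic() >= deadline:
            return count
        time.sleep(poll_s)


# Hypothetical usage:
# upsert_and_wait(dataset, payload)
# print(wait_for_vector_count(embedding, expected=len(payload)))
# embedding.delete()  # step 3: removes the custom embedding type
```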
Upload custom embeddings during data row creation
1. Create a dataset.
2. Fetch an embedding type and create dummy vector data.
3. Upload data rows with embeddings.
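These steps might look like the sketch below. The row fields `row_data`, `global_key`, and `embeddings` are assumptions about the data row creation format, and `build_rows_with_embedding` is a hypothetical helper. The SDK calls in the usage comments (`create_dataset`, `get_embedding_by_name`, `create_data_rows`) are also assumed, and the dataset and embedding names there are placeholders.

```python
import random
import uuid


def build_rows_with_embedding(urls, embedding_id, dims, seed=None):
    """Build one data row per URL, each carrying a dummy embedding vector."""
    rng = random.Random(seed)
    rows = []
    for url in urls:
        # Dummy data only; real vectors would come from an embedding model.
        vector = [rng.uniform(1.0, 2.0) for _ in range(dims)]
        rows.append({
            "row_data": url,
            "global_key": str(uuid.uuid4()),  # unique key per data row
            "embeddings": [{"embedding_id": embedding_id, "vector": vector}],
        })
    return rows


# Hypothetical usage, assuming `client` is an authenticated Labelbox client:
# dataset = client.create_dataset(name="data_rows_with_embeddings")
# embedding = client.get_embedding_by_name("my_custom_embedding")
# rows = build_rows_with_embedding(urls, embedding.id, embedding.dims)
# task = dataset.create_data_rows(rows)
# task.wait_till_done()
```

The vector length must match the dimension of the chosen embedding type, which is why `dims` is taken from the embedding rather than hard-coded.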