Get Catalog slice via SDK

You can retrieve a slice's data rows and all associated information programmatically via our Python SDK. From there, you can use Catalog to visually inspect the data rows you retrieved via the SDK.

Retrieving a slice this way is convenient when you want to curate a new batch or a model run dataset directly from a saved slice.

First, copy the slice ID from the Catalog UI.

import labelbox as lb

client = lb.Client(api_key="<YOUR_API_KEY>")

catalog_slice_id = "<CATALOG_SLICE_ID_FROM_UI>"
catalog_slice = client.get_catalog_slice(catalog_slice_id)  # returns a CatalogSlice
print(catalog_slice)  # shows the slice's name, ID, and the filter that defines it
# Example output:
# <CatalogSlice {'created_at': datetime.datetime(2022, 10, 24, 18, 47, 43, 666000, tzinfo=datetime.timezone.utc), 'description': None, 'filter': [{'ids': ['cl6wheen01ucx0y169n8v2m3g'], 'type': 'project', 'operator': 'is'}], 'name': 'test slice', 'uid': 'cl9n4smy906h60yy6cy8f37wb', 'updated_at': datetime.datetime(2022, 10, 24, 18, 47, 43, 666000, tzinfo=datetime.timezone.utc)}>

# Get the IDs of the data rows in the slice.
# Converting to a list lets the IDs be reused in the later examples.
slice_data_rows_ids = list(catalog_slice.get_data_row_ids())
for data_row_id in slice_data_rows_ids:
    print(client.get_data_row(data_row_id))

Curate a batch from a Catalog slice via SDK

Using the Python SDK, you can create a new batch from all the data rows in a slice or from a random sample of them. The Python example below shows the sampled case.

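import random

# Assumes `client` and `slice_data_rows_ids` from the previous example
project = client.get_project("<PROJECT_ID>")

# Optional: sample data rows from your slice.
# random.sample needs a sequence, so convert the IDs to a list first.
sampled_data_row_ids = random.sample(list(slice_data_rows_ids), 5)

batch = project.create_batch(
    "test batch",  # name of the batch
    sampled_data_row_ids,  # list of data row IDs
    1,  # priority, from 1 to 5
)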
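
Note that `random.sample` raises a `ValueError` when the slice holds fewer data rows than the requested sample size. The following is a minimal sketch of a more forgiving sampler; the helper name `sample_ids` and the stand-in IDs are hypothetical and not part of the SDK. It caps the sample size at the number of available IDs and accepts an optional seed so the selection can be reproduced.

```python
import random
from typing import Iterable, List, Optional


def sample_ids(ids: Iterable[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return up to k IDs chosen at random without replacement."""
    pool = list(ids)  # random.sample needs a sequence, so materialize first
    rng = random.Random(seed)  # local RNG: seeding does not touch global state
    return rng.sample(pool, min(k, len(pool)))


# Stand-in IDs for illustration only; in practice you would pass the
# data row IDs retrieved from your slice.
example_ids = [f"data-row-{i}" for i in range(3)]
sampled = sample_ids(example_ids, 5, seed=0)  # only 3 IDs exist, so 3 are returned
```

The resulting list can then be passed to `project.create_batch` in place of `sampled_data_row_ids`.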

You can also add the data rows in your slice to a model run for inference, as in the Python example below.

# model_run is an existing ModelRun, e.g. client.get_model_run("<MODEL_RUN_ID>")
model_run.upsert_data_rows(list(slice_data_rows_ids))
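
For a large slice, you may prefer to send the IDs in smaller groups rather than in one call. This page does not say whether the SDK requires that, so treat the following as an optional sketch: the `chunked` helper and the chunk size are assumptions made for illustration, and the IDs are stand-ins.

```python
from typing import Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    chunk: List[str] = []
    for item in ids:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


# Stand-in IDs for illustration only.
example_ids = [f"data-row-{i}" for i in range(7)]
chunks = list(chunked(example_ids, 3))

# Hypothetical usage with the objects from the examples above:
# for id_chunk in chunked(slice_data_rows_ids, 1000):
#     model_run.upsert_data_rows(id_chunk)
```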