Create a batch for a project
First, make sure the project is in Batch mode. You can then sample data rows from an existing dataset and add them to the project.
from labelbox import Client
import random
import uuid
client = Client(api_key="<YOUR_API_KEY>")
project = client.get_project(PROJECT_ID)
# Prepare some data rows and store them in a dataset
dataset = client.create_dataset(name=DATASET_NAME)
uploads = []
# Generate data rows
for i in range(1, 9):
    uploads.append({
        "row_data": f"https://storage.googleapis.com/labelbox-datasets/People_Clothing_Segmentation/jpeg_images/IMAGES/img_000{i}.jpeg",
        "global_key": str(uuid.uuid1()),  # unique identifier, passed as a string
    })
# Add the data rows to the dataset
dataset.create_data_rows(uploads)
# Create a sample of data row objects (optional)
sample = random.sample(list(dataset.export_data_rows()), 5)
# Create a batch
batch = project.create_batch(
    "first-batch",  # Each batch in a project must have a unique name
    sample,  # Data row objects to include in the batch
    5,  # Priority between 1 (highest) and 5 (lowest)
    consensus_settings={
        "number_of_labels": 2,
        "coverage_percentage": 0.1,
    },
)
Note
- Counts can take a few seconds to update after a batch is added.
- If the SDK times out while running a long query, the batch is still added to the project. Open the project to check its actual status.
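The sampling step above calls `random.sample` with a fixed size of 5, which raises `ValueError` when the dataset holds fewer rows than that. A minimal sketch of a guard follows. `sample_up_to` is a hypothetical helper, not part of the Labelbox SDK, and plain strings stand in for data row objects so the snippet runs on its own.

```python
import random

def sample_up_to(rows, k, seed=None):
    """Return up to k randomly chosen items from rows.

    Hypothetical helper (not an SDK method): caps the sample size at the
    number of available rows so random.sample never raises ValueError.
    """
    rows = list(rows)  # materialize generators or paginated collections
    rng = random.Random(seed)  # pass a seed only if you need reproducibility
    return rng.sample(rows, min(k, len(rows)))

# Strings stand in for data row objects in this illustration.
rows = [f"row-{i}" for i in range(1, 4)]
sample = sample_up_to(rows, 5)
print(len(sample))  # prints 3, because only three rows are available
```

With the SDK, the same helper could presumably be given the exported data rows in place of `rows`.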
Get a batch
# List the batches in a project
for batch in project.batches():
    print(batch.name)
# Get the project that a batch belongs to
batch.project()
The SDK does not currently provide a method to get a batch by name or ID, or to list the data rows in a batch. You can view your batch in the Data Row table of the project.
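Because there is no direct lookup by name, one workaround is to iterate over `project.batches()` and compare names. A minimal sketch follows. `find_batch_by_name` is a hypothetical helper, not an SDK method, and the stand-in objects exist only so the snippet runs without an API key.

```python
from types import SimpleNamespace

def find_batch_by_name(project, name):
    """Return the first batch whose name matches, or None if there is none.

    Hypothetical helper (not an SDK method); relies only on
    project.batches() and batch.name, as used in the listing example.
    """
    for batch in project.batches():
        if batch.name == name:
            return batch
    return None

# Stand-in project for illustration; with the SDK, pass a real project.
fake_project = SimpleNamespace(
    batches=lambda: iter([
        SimpleNamespace(name="first-batch"),
        SimpleNamespace(name="second-batch"),
    ])
)
match = find_batch_by_name(fake_project, "second-batch")
print(match.name if match else "not found")
```

The helper scans the batches one by one, so it may be slow for projects with many batches.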
Archive a batch
# Archiving a batch removes all of its queued data rows from the project
batch.remove_queued_data_rows()
Delete a batch
You must delete the labels in the batch before deleting the batch itself.
# Setting set_labels_as_template=True keeps the deleted labels as templates for future relabeling.
batch.delete_labels(set_labels_as_template=False)
batch.delete()
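The two calls above must run in that order, so it can help to wrap them in one function. A minimal sketch follows. `delete_batch_with_labels` is a hypothetical helper, not an SDK method, and `RecordingBatch` is a stand-in that only records the calls it receives so the snippet runs without touching a real project.

```python
def delete_batch_with_labels(batch, keep_as_template=False):
    """Delete a batch's labels first, then the batch itself.

    Hypothetical helper (not an SDK method); it only calls
    delete_labels() and delete() as shown in the example above.
    """
    batch.delete_labels(set_labels_as_template=keep_as_template)
    batch.delete()

class RecordingBatch:
    """Stand-in for a real batch object; records calls instead of deleting."""

    def __init__(self):
        self.calls = []

    def delete_labels(self, set_labels_as_template=False):
        self.calls.append(("delete_labels", set_labels_as_template))

    def delete(self):
        self.calls.append(("delete", None))

fake_batch = RecordingBatch()
delete_batch_with_labels(fake_batch)
print(fake_batch.calls)
```

With the SDK, a real batch object would be passed in place of `fake_batch`.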