Labelbox recommends using the Batch-based queue to organize your annotation workflows.

When you create a new project, Labelbox will prompt you to select the Batch-based or the Dataset-based queueing system to send your Data Rows for labeling. Queueing Data Rows via the Batch queue gives you more flexibility and control of your workflow than queueing an entire Dataset.

❗️

Caution

We currently do not support consensus or benchmarks with batches. Using consensus or benchmarks with batches will cause undesirable and irreversible behaviors in your project, such as data loss or label duplication.

Switching between Batch and Dataset mode after labeling has started resets your project and removes ALL previously queued Data Rows from it. This can result in lost work.

Please reach out to [email protected] for any further questions.

Create a batch

To send a batch of data to a project for labeling, use the following steps:

  1. Go to the Catalog and filter for the relevant Data Rows you want to label.

  2. Once the desired Data Rows appear in the results, either choose Select all or manually select the Data Rows to include in your batch. Then click the blue button at the top of the screen and select Add batch to project.

  3. From the batch creation modal, choose a project, give the batch a name, and select a priority for the Data Rows. When setting Data Row priority, 1 is the highest and 5 is the lowest. You may see a warning that some of the Data Rows have already been submitted. Data Rows that have already been submitted will be excluded when you click submit.

  4. Click Submit batch.

  5. Navigate to the project to begin labeling. Remember to make sure the project is in Batch mode to access these new Data Rows for labeling.

Example workflow of creating a batch

📘

Data Row distribution

Newly queued Data Rows will be distributed after any Data Rows in the label queue have already been reserved.

🚧

Batches enforce data type

Only assets that match the project data type will be added to your batches.

Create a batch using sampling

Manually selecting data for labeling is a time-consuming process. You can use sampling to make the data selection process faster and easier.

Within Catalog, you can sample from the results of a filter. Sampling can be performed in random or non-random order. To create a batch with the sampling technique, follow these steps:

  1. Go to the Catalog tab and filter for the relevant Data Rows you want to label. Once you have the relevant Data Rows, click the Sample button on the top right.

  2. From the batch creation modal, in addition to filling out the project, batch name, and priority details, you can choose how many Data Rows to sample and the sampling method. Currently, we support two sampling methods: random and ordered. For more information, see Sampling methods.

  3. Click Submit batch.

  4. Navigate to the project to begin labeling. Remember to make sure the project is in Batch mode to access these new Data Rows for labeling.
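The difference between the two sampling methods can be illustrated locally. The following is a minimal sketch, not how Catalog implements sampling: `sample_data_row_ids` is a hypothetical helper, and it assumes that "ordered" means taking the first N results in their existing order while "random" means drawing N distinct results.

```python
import random


def sample_data_row_ids(data_row_ids, count, method="random", seed=None):
    """Pick up to `count` IDs from a filtered result (hypothetical helper)."""
    ids = list(data_row_ids)
    count = min(count, len(ids))  # never ask for more than is available
    if method == "ordered":
        # Assumed meaning of "ordered": keep the first `count` IDs as listed.
        return ids[:count]
    if method == "random":
        # Draw `count` distinct IDs without replacement.
        return random.Random(seed).sample(ids, count)
    raise ValueError(f"unknown sampling method: {method!r}")


ids = [f"dr-{i}" for i in range(10)]  # placeholder IDs, not real Data Rows
ordered = sample_data_row_ids(ids, 3, method="ordered")
randomized = sample_data_row_ids(ids, 3, method="random", seed=0)
```

Passing a `seed` makes the random draw repeatable, which can help when you want to recreate the same selection later.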

SDK methods

See the SDK reference for additional details.

Set queue mode

Batches were introduced in May 2022 and are now the preferred way to select and queue data for labeling. If you started using Labelbox before then, we recommend transitioning to batches.

# Update your project to queue data using batches
project.update(queue_mode=project.QueueMode.Batch)

Create a batch

A batch can be created with a list of Data Row IDs. To successfully create a batch:

  1. Data Row IDs should be unique
  2. Selected Data Rows should not be already queued in the desired labeling project
  3. Batch name should be unique
  4. A maximum of 25,000 Data Rows can be added to a batch at a time
  5. A priority level between 1 (highest) and 5 (lowest) must be provided.

import random

# Find Data Rows from a dataset that are not already queued in the project
queued_data_rows = [dr['id'] for dr in project.export_queued_data_rows()]
data_rows = [dr.uid for dr in dataset.export_data_rows()]
not_queued_data_rows = list(set(data_rows) - set(queued_data_rows))

# Randomly sample up to 100 Data Rows
# (random.sample raises ValueError if asked for more items than are available)
sample = random.sample(not_queued_data_rows, min(100, len(not_queued_data_rows)))

# Create the batch
batch = project.create_batch(
  "a new batch", # Each batch in a project must have a unique name
  sample, # A list of Data Row IDs
  5 # Priority between 1 (highest) and 5 (lowest)
)
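The five conditions listed above can be checked locally before calling `project.create_batch`. The sketch below is an illustration only: `check_batch_request` and its parameters are hypothetical names, not part of the Labelbox SDK, and the existing batch names and queued IDs are assumed to have been collected beforehand.

```python
MAX_BATCH_SIZE = 25_000  # limit stated in the list above


def check_batch_request(name, data_row_ids, priority,
                        existing_names=(), queued_ids=()):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if len(set(data_row_ids)) != len(data_row_ids):
        problems.append("Data Row IDs are not unique")
    if set(data_row_ids) & set(queued_ids):
        problems.append("some Data Rows are already queued in the project")
    if name in existing_names:
        problems.append("batch name is already used")
    if len(data_row_ids) > MAX_BATCH_SIZE:
        problems.append(f"more than {MAX_BATCH_SIZE} Data Rows")
    if not (isinstance(priority, int) and 1 <= priority <= 5):
        problems.append("priority must be an integer from 1 to 5")
    return problems


ok = check_batch_request("batch-1", ["a", "b"], 3)
bad = check_batch_request("batch-1", ["a", "a"], 9,
                          existing_names=["batch-1"], queued_ids=["a"])
```

Returning every problem at once, rather than raising on the first one, lets you fix a request in a single pass.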

List batches in a project

List all batches associated with a project.

# List the batches in a project
for batch in project.batches():
    print(batch.name)

# Get the project that a batch belongs to
batch.project()

Archive batch (remove queued data from a project)

You can remove queued Data Rows from a project by archiving a batch associated with the respective project.

# archiving batch removes all queued data rows from the project
batch.remove_queued_data_rows()

Delete batch

You need to first delete the labels in the batch before deleting the batch itself.

# set_labels_as_template=True keeps the deleted labels as templates for future re-labeling.
batch.delete_labels(set_labels_as_template=False)
batch.delete()

Complete Python SDK tutorial

FAQs

Can I submit the same Data Row multiple times?

A Data Row cannot be part of more than one Batch in a Project at a time.

Can a Batch be shared between projects?

A Batch cannot be shared between Projects. However, you can create a new Batch using the same Data Rows.

Can I append Data Rows to a Batch?

Once a Batch has been submitted you cannot add more Data Rows. We are considering adding this functionality in the future.

How many Data Rows can be in a Batch?

25,000 Data Rows. See Limits for more information.
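If a selection exceeds this limit, one option is to split it into several batches. The sketch below only shows the splitting step; `chunk_data_row_ids` is a hypothetical helper and the IDs are placeholders.

```python
def chunk_data_row_ids(data_row_ids, max_size=25_000):
    """Split a list of Data Row IDs into consecutive chunks of at most max_size."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [
        data_row_ids[start:start + max_size]
        for start in range(0, len(data_row_ids), max_size)
    ]


ids = [f"dr-{i}" for i in range(60_000)]  # placeholder IDs
chunks = chunk_data_row_ids(ids)
sizes = [len(chunk) for chunk in chunks]
```

Each chunk could then be submitted as its own batch. Because batch names must be unique within a project, each one would need a distinct name, for example a shared prefix plus the chunk index.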

