Send data to labeling (batch)

Labelbox recommends using the batch-based queue to organize your annotation workflows.
Queueing data rows via the batch queue gives you more flexibility and control of your workflow than queueing an entire dataset.

Create a batch

Before you can create a batch in a project, the project must be configured in batch mode. To send a batch of data to a project for labeling, follow these steps:

  1. Go to the Catalog and filter for the relevant data rows you want to label.

  2. Once the results show the desired data rows, either choose Select all or manually select the data rows to include in your batch. Then click the blue button at the top of the screen and select Add batch to project.

  3. From the batch creation modal, choose a project, give the batch a name, and select a priority for the data rows. When setting data row priority, 1 is the highest and 5 is the lowest. You may see a warning that some of the data rows have already been submitted. Data rows that have already been submitted will be excluded when you click submit.
  4. Click Submit batch. Then, navigate to the project to begin labeling.

📘

Data row distribution

Newly queued data rows will be distributed after any data rows in the label queue have already been reserved.

🚧

Batches enforce data type

Only assets that match the project data type will be added to your batches.

Create a batch by sampling

Manually selecting data for labeling is a time-consuming process. You can use sampling to make the data selection process faster and easier.

Within Catalog, you can sample from the results of a filter. Sampling can be performed in random or non-random order. To create a batch with the sampling technique, follow these steps:

  1. Go to the Catalog tab and filter for the relevant data rows you want to label. Once you have the relevant data rows, click the Sample button on the top right.

  2. From the batch creation modal, in addition to filling out the project, batch name, and priority details, you can choose how many data rows to sample and the sampling method. Currently, we support two sampling methods: random and ordered. For more information on sampling methods, please visit Sampling methods.

  3. Click Submit batch.

  4. Navigate to the project to begin labeling. Remember to make sure the project is in batch mode to access these new data rows for labeling.
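
To make the difference between the two sampling methods concrete, here is a minimal, illustrative sketch in plain Python. It is not how Labelbox implements sampling: the helper `sample_data_row_ids` is hypothetical, the IDs are placeholders, and it assumes that "ordered" means taking the first N results in their current sort order (see Sampling methods for the actual behavior).

```python
import random
from typing import List, Optional, Sequence


def sample_data_row_ids(
    data_row_ids: Sequence[str],
    sample_size: int,
    method: str = "random",
    seed: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: pick up to `sample_size` IDs from filtered results."""
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    size = min(sample_size, len(data_row_ids))
    if method == "ordered":
        # Assumption: ordered sampling keeps the first `size` IDs as sorted.
        return list(data_row_ids[:size])
    if method == "random":
        # Random sampling draws `size` distinct IDs; a seed makes it repeatable.
        rng = random.Random(seed)
        return rng.sample(list(data_row_ids), size)
    raise ValueError(f"unknown sampling method: {method!r}")


filtered_ids = [f"data-row-{i}" for i in range(10)]  # placeholder IDs
ordered_sample = sample_data_row_ids(filtered_ids, 3, method="ordered")
random_sample = sample_data_row_ids(filtered_ids, 3, method="random", seed=0)
```

Ordered sampling is reproducible for a given sort order, while random sampling spreads the selection across the whole filtered result.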

SDK methods

See the SDK reference for additional details.

Set queue mode

Batches were introduced in May 2022 and are now the preferred way to select and queue data for labeling. If you started using Labelbox before that date, we recommend transitioning to batches.

# Update your project to queue data using batches
project.update(queue_mode=project.QueueMode.Batch)

Create a batch

You can create a batch with a list of data row IDs. To successfully create a batch:

  1. Data rows should be unique
  2. Selected data rows should not be already queued in the desired labeling project
  3. Batch name should be unique
  4. A maximum of 100,000 data rows can be added to a batch at a time
  5. A priority level between 1 (highest) and 5 (lowest) must be provided.
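
The requirements above can be checked locally before calling the SDK. The following is a minimal sketch in plain Python, not part of the Labelbox SDK: `validate_batch_request` and its parameters are hypothetical names, and the sets of existing batch names and already-queued data row IDs are assumed to have been collected beforehand.

```python
from typing import List, Sequence, Set

MAX_BATCH_SIZE = 100_000  # per-batch limit stated in the list above


def validate_batch_request(
    name: str,
    data_row_ids: Sequence[str],
    priority: int,
    existing_batch_names: Set[str],
    queued_data_row_ids: Set[str],
) -> List[str]:
    """Hypothetical pre-check: return a list of problems (empty means OK)."""
    problems = []
    # 1. Data rows should be unique.
    if len(set(data_row_ids)) != len(data_row_ids):
        problems.append("data rows are not unique")
    # 2. Data rows should not already be queued in the project.
    already_queued = set(data_row_ids) & queued_data_row_ids
    if already_queued:
        problems.append(f"{len(already_queued)} data row(s) already queued")
    # 3. Batch name should be unique.
    if name in existing_batch_names:
        problems.append(f"batch name {name!r} is already in use")
    # 4. At most 100,000 data rows per batch.
    if len(data_row_ids) > MAX_BATCH_SIZE:
        problems.append(f"more than {MAX_BATCH_SIZE} data rows")
    # 5. Priority must be between 1 (highest) and 5 (lowest).
    if not 1 <= priority <= 5:
        problems.append("priority must be between 1 and 5")
    return problems


ok = validate_batch_request("batch-2", ["a", "b"], 3, {"batch-1"}, {"z"})
bad = validate_batch_request("batch-1", ["a", "a", "z"], 9, {"batch-1"}, {"z"})
```

Here `ok` is an empty list, while `bad` reports four problems: duplicate data rows, an already-queued data row, a reused batch name, and an out-of-range priority.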

📘

Note

  • Counts can take a few seconds to update after a batch is added.
  • If the SDK times out when running long queries, the batch will still be added to the project. Go into the project to get the accurate status.
# Assumes `project` and `dataset` have already been fetched with the SDK client
# Find data rows from a dataset that are not already queued in the project
queued_data_row_ids = {dr['id'] for dr in project.export_queued_data_rows()}
not_queued_data_rows = [dr for dr in dataset.export_data_rows() if dr.uid not in queued_data_row_ids]


# Create a batch
batch = project.create_batch(
    "first-batch",  # Each batch in a project must have a unique name
    not_queued_data_rows,  # List of data row objects not yet queued in the project
    5  # Priority between 1 (highest) and 5 (lowest)
)
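
If you have more data rows than fit in a single batch, one option is to split them into several batches, each with its own unique name. The sketch below shows only the splitting step in plain Python. `chunk_ids` and the batch naming scheme are hypothetical, the IDs are placeholders, and the commented-out loop assumes `project.create_batch` is called as in the example above.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100_000  # maximum data rows per batch


def chunk_ids(ids: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Hypothetical helper: yield consecutive slices of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


all_ids = [f"data-row-{i}" for i in range(250_000)]  # placeholder IDs
chunks = list(chunk_ids(all_ids))
# Each batch in a project needs a unique name, so number them.
batch_names = [f"first-batch-{i + 1}" for i in range(len(chunks))]

# With a real project, each chunk could then be queued, for example:
# for name, chunk in zip(batch_names, chunks):
#     project.create_batch(name, chunk, 5)
```

With 250,000 placeholder IDs this produces three chunks, the last one smaller than the others.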

List batches in a project

List all batches associated with a project.

# list batches in a project
for batch in project.batches():
    print(batch.name)

# List project that batch is part of
batch.project()

Archive batch (remove queued data from a project)

You can remove queued data rows from a project by archiving a batch associated with the respective project.

# archiving batch removes all queued data rows from the project
batch.remove_queued_data_rows()

Delete batch

You need to first delete the labels in the batch before deleting the batch itself.

# set_labels_as_template=True will set the deleted labels as template for future re-labeling. 
batch.delete_labels(set_labels_as_template=False)
batch.delete()

FAQs

Can I submit the same data row multiple times?

A data row cannot be part of more than one batch in a project at a time.

Can a batch be shared between projects?

A batch cannot be shared between projects. However, you can create a new batch using the same data rows.

Can I append data rows to a batch?

Once a batch has been submitted you cannot add more data rows. We are considering adding this functionality in the future.

How many data rows can be in a batch?

100,000 data rows. See Limits for more information.

How many batches can be added to a project?

There is no limit on the number of labeled batches in a project. However, a project can have at most 1,500 unlabeled batches. See Limits for more information.