Developer guide for creating and modifying batches via the Python SDK.
Project class.
When creating a batch to send to a project, you must supply either global_keys or data_rows as an argument. If you use the data_rows argument, you can supply either a list of data row IDs or a list of DataRow class objects.
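The either/or rule above can be checked before any request is sent. The following is a minimal sketch, not the SDK's own validation: build_batch_kwargs is a hypothetical helper, and rejecting calls that supply both arguments is an assumption based on the wording above.

```python
from typing import Any, Dict, Optional, Sequence


def build_batch_kwargs(
    name: str,
    data_rows: Optional[Sequence[Any]] = None,
    global_keys: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for a batch-creation call.

    Hypothetical helper: enforces that exactly one of data_rows or
    global_keys is supplied.
    """
    # True when both are missing or both are present.
    if (data_rows is None) == (global_keys is None):
        raise ValueError("Supply exactly one of data_rows or global_keys.")
    if data_rows is not None:
        # data_rows may hold data row IDs or DataRow objects.
        return {"name": name, "data_rows": list(data_rows)}
    return {"name": name, "global_keys": list(global_keys)}
```

With a project object from the SDK, the result could then be unpacked into the batch-creation call, for example project.create_batch(**build_batch_kwargs("my-batch", global_keys=keys)), assuming that method accepts these keyword names in your SDK version.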
Optionally, you can supply a priority field to control the labeling priority of the batch. The priority field accepts 32-bit integer values, which means you can set the priority to any integer between -2,147,483,648 and 2,147,483,647. (For practical purposes, we recommend against using large priority values.)
Setting the priority determines the order in which the included data rows appear in the labeling queue relative to other batches. If no value is provided, the batch assumes the lowest priority.
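The 32-bit range described above can be checked locally before a batch is submitted. This is a sketch only: validate_priority is a hypothetical helper and is not part of the SDK, which may apply its own checks.

```python
INT32_MIN = -(2**31)    # -2,147,483,648
INT32_MAX = 2**31 - 1   #  2,147,483,647


def validate_priority(priority: int) -> int:
    """Return priority unchanged if it is a signed 32-bit integer."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(priority, bool) or not isinstance(priority, int):
        raise TypeError("priority must be an integer")
    if not INT32_MIN <= priority <= INT32_MAX:
        raise ValueError(
            f"priority must be between {INT32_MIN} and {INT32_MAX}"
        )
    return priority
```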
Note: You can use the SDK to set the priority of individual data rows (see the Modify data row priority section in the Project overview).
The project.create_batches() method accepts up to 1 million data rows. If necessary, the data rows are chunked into batches of up to 100k data rows each, which is the maximum batch size.
This method takes either a list of data row IDs or DataRow objects in the data_rows argument, or a list of global keys in the global_keys argument; the two arguments cannot be combined in the same call. Batches are created with the specified name_prefix argument plus a suffix that keeps batch names unique. The suffix is a 4-digit number starting at 0000.
For example, if the name prefix is demo-create-batches- and three batches are created, the names will be demo-create-batches-0000, demo-create-batches-0001, and demo-create-batches-0002. This method will throw an error if a batch with the same name already exists.
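To preview how a request might be split, the chunking and naming rules above can be reproduced locally. This is a minimal sketch under stated assumptions: plan_batches is a hypothetical helper, and it assumes each batch is filled to the 100k maximum before the next one starts, which the text above does not spell out.

```python
from typing import List, Tuple

MAX_ROWS_PER_CALL = 1_000_000  # limit stated for project.create_batches()
MAX_BATCH_SIZE = 100_000       # maximum batch size stated in this guide


def plan_batches(name_prefix: str, row_count: int) -> List[Tuple[str, int]]:
    """Return (batch_name, batch_size) pairs for a given number of data rows."""
    if not 0 < row_count <= MAX_ROWS_PER_CALL:
        raise ValueError(
            f"row_count must be between 1 and {MAX_ROWS_PER_CALL}"
        )
    plan = []
    for index, start in enumerate(range(0, row_count, MAX_BATCH_SIZE)):
        # The last chunk holds whatever remains.
        size = min(MAX_BATCH_SIZE, row_count - start)
        # Suffix is a zero-padded 4-digit counter starting at 0000.
        plan.append((f"{name_prefix}{index:04d}", size))
    return plan
```

For instance, plan_batches("demo-create-batches-", 250_000) yields three entries named with the suffixes 0000, 0001, and 0002, with the last batch holding the remaining 50,000 data rows.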
The project.create_batches_from_dataset() method takes a dataset ID and creates a batch (or several batches, if there are more than 100k data rows) containing all data rows in the dataset that are not already in the project. The name_prefix logic and batch naming described for project.create_batches() apply here as well.
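A call to this method might be wrapped as follows. This is a sketch, not a definitive implementation: queue_dataset is a hypothetical helper, project is assumed to be a Project object obtained from the SDK client, and the keyword names name_prefix, dataset_id, and priority are assumed to match your SDK version.

```python
from typing import Any, Optional


def queue_dataset(
    project: Any,
    dataset_id: str,
    name_prefix: str,
    priority: Optional[int] = None,
) -> Any:
    """Create batches from a dataset's data rows not already in the project.

    Hypothetical wrapper: forwards to project.create_batches_from_dataset()
    and returns whatever that call returns.
    """
    kwargs = {"name_prefix": name_prefix, "dataset_id": dataset_id}
    if priority is not None:
        # Leave priority out entirely so the default (lowest) applies.
        kwargs["priority"] = priority
    return project.create_batches_from_dataset(**kwargs)
```

Omitting priority here relies on the behavior described earlier, where a batch without an explicit value assumes the lowest priority.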