Developer guide for creating and modifying projects using the Python SDK.
Specify media_type using one of the following values:
lb.MediaType.Audio
lb.MediaType.Conversational
lb.MediaType.Document
lb.MediaType.Geospatial_Tile
lb.MediaType.Html
lb.MediaType.Image
lb.MediaType.Simple_Tile
lb.MediaType.Text
lb.MediaType.Video
Either global_keys or data_rows must be supplied as an argument. If using the data_rows argument, you can supply either a list of data row IDs or a list of DataRow class objects.
Optionally, you can supply a priority, ranging from 1 (highest) to 5 (lowest), at which the batch should be labeled. This determines the order in which the included data rows appear in the labeling queue relative to other batches. If no value is provided, the batch is assigned the lowest priority.
For more details, see Batch.
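The argument rules above can be checked before a batch is submitted. The following is a minimal sketch, not part of the SDK: build_batch_kwargs is a hypothetical helper, the name keyword is an assumption, and rejecting calls that pass both data_rows and global_keys is assumed by analogy with the rule stated for create_batches.

```python
from typing import Any, Dict, List, Optional


def build_batch_kwargs(
    name: str,
    data_rows: Optional[List[Any]] = None,
    global_keys: Optional[List[str]] = None,
    priority: int = 5,
) -> Dict[str, Any]:
    """Validate batch arguments and return them as keyword arguments.

    Hypothetical helper: it only prepares arguments and never calls the SDK.
    """
    # Assumption: exactly one of data_rows / global_keys may be supplied.
    if (data_rows is None) == (global_keys is None):
        raise ValueError("Supply either data_rows or global_keys, but not both.")
    # Priority runs from 1 (highest) to 5 (lowest); 5 is the default.
    if not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 and 5.")
    kwargs: Dict[str, Any] = {"name": name, "priority": priority}
    if data_rows is not None:
        kwargs["data_rows"] = list(data_rows)
    else:
        kwargs["global_keys"] = list(global_keys)
    return kwargs
```

The returned dictionary could then be unpacked into the batch creation call on a project, assuming the keyword names match those of the installed SDK version.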
The project.create_batches() method accepts up to 1 million data rows. Batches are chunked into groups of 100k data rows (if necessary), which is the maximum batch size.
This method takes a list of either data row IDs or DataRow objects in the data_rows argument, or a list of global keys in the global_keys argument, but both approaches cannot be used in the same call. Batches are created with the specified name_prefix argument and a unique suffix to ensure unique batch names. The suffix is a 4-digit number starting at 0000.
For example, if the name prefix is demo-create-batches- and three batches are created, the names will be demo-create-batches-0000, demo-create-batches-0001, and demo-create-batches-0002. This method will throw an error if a batch with the same name already exists.
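The chunking and naming rules can be illustrated without calling the SDK. The sketch below is a hypothetical planner, not the library's implementation: plan_batches is an invented name, and filling each batch to the 100k maximum before starting the next is an assumption about how rows are distributed.

```python
from typing import List, Tuple

MAX_BATCH_SIZE = 100_000    # maximum data rows per batch, as stated above
MAX_TOTAL_ROWS = 1_000_000  # maximum data rows per create_batches() call


def plan_batches(name_prefix: str, row_count: int) -> List[Tuple[str, int]]:
    """Return (batch_name, batch_size) pairs for a given number of data rows."""
    if not 0 < row_count <= MAX_TOTAL_ROWS:
        raise ValueError("row_count must be between 1 and 1,000,000.")
    plan = []
    for index, start in enumerate(range(0, row_count, MAX_BATCH_SIZE)):
        # Assumption: every batch is filled to the maximum except the last.
        size = min(MAX_BATCH_SIZE, row_count - start)
        # The suffix is a zero-padded 4-digit counter starting at 0000.
        plan.append((f"{name_prefix}{index:04d}", size))
    return plan
```

With the prefix demo-create-batches- and 250,000 data rows, this plan yields three batches named with the suffixes 0000, 0001, and 0002.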
Use the project.create_batches_from_dataset() method to create batches from a dataset.
This method takes a dataset ID and creates a batch (or several batches if there are more than 100k data rows) containing all data rows not already in the project. The name_prefix argument and the naming of batches follow the same logic as project.create_batches().
client.get_batch().
access_from from the class ProjectMember. It can have one of the following values:
DataRow class or a DataRowIdentifier object, the new priority. All values must be integers that match the range of the list.
The priority is an integer between -2,147,483,648 and 2,147,483,647. The lowest value has the highest priority.
Override lists are limited to 1,000 items; larger lists trigger an error.
Once the override list is defined, pass it to project.set_labeling_parameter_overrides to change the priority of the corresponding data rows. Use project.labeling_parameter_overrides to get a list of data row priorities and project.update_data_row_labeling_priority to update the priority of existing data rows.
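The limits on override lists can be checked locally before the list is submitted. This is a minimal sketch under stated assumptions: validate_overrides is a hypothetical helper, and plain strings stand in for the DataRow or DataRowIdentifier objects that the SDK expects.

```python
from typing import List, Tuple

PRIORITY_MIN = -2_147_483_648
PRIORITY_MAX = 2_147_483_647
MAX_OVERRIDES = 1_000  # larger lists trigger an error


def validate_overrides(overrides: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Check an override list and return it ordered from highest to lowest priority.

    Each item is (identifier, priority); the string identifier is a stand-in
    for a DataRow or DataRowIdentifier object.
    """
    if len(overrides) > MAX_OVERRIDES:
        raise ValueError("Override lists are limited to 1,000 items.")
    for identifier, priority in overrides:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(priority, bool) or not isinstance(priority, int):
            raise TypeError(f"Priority for {identifier!r} must be an integer.")
        if not PRIORITY_MIN <= priority <= PRIORITY_MAX:
            raise ValueError(f"Priority for {identifier!r} is out of range.")
    # The lowest value has the highest priority, so sort ascending.
    return sorted(overrides, key=lambda item: item[1])
```

Sorting is only for inspection; it shows the order in which the rows would be prioritized and is not required before submitting the list.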
The tags variable is a list where each element is an object of type ResourceTag with the attributes uid, color (e.g., “008856”), and text.
With project.get_overview(details), you can obtain some of the data from the Project Overview tab.
The details argument changes the output to display the distribution of data rows between the queues.
When details is set to false:
Attribute | Description | Name in the Overview tab |
---|---|---|
to_label | Number of data rows that are yet to be labeled | To Label |
in_review | Number of data rows to be reviewed | In Review |
in_rework | Number of data rows to be reworked | In Rework |
skipped | Number of skipped data rows | Skipped |
done | Number of data rows marked as Done | Done |
issues | Number of data rows with associated issues | Issues |
labeled | Number of data rows with one or more labels | - |
total_data_rows | Total number of data rows in the project | - |
When details is set to true, the output will be the same as before, except for the following:
Attribute | Description |
---|---|
in_review | data: List of task queues in review with the associated number of data rows; total: Number of data rows to be reviewed |
in_rework | data: List of task queues in rework with the associated number of data rows; total: Number of data rows to be reworked |
Attribute | Sum of attributes |
---|---|
overview.labeled | overview.in_review + overview.in_rework + overview.done |
overview.total_data_rows | overview.to_label + overview.in_review + overview.in_rework + overview.done |
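The two sums in the table above can be expressed as derived properties. The sketch below uses a hypothetical Overview dataclass that only mirrors the attribute names from the tables; it is not the object returned by the SDK, and it assumes details is false so that every attribute is a plain integer.

```python
from dataclasses import dataclass


@dataclass
class Overview:
    """Hypothetical stand-in for the overview counts (details set to false)."""

    to_label: int
    in_review: int
    in_rework: int
    skipped: int
    done: int
    issues: int

    @property
    def labeled(self) -> int:
        # labeled = in_review + in_rework + done
        return self.in_review + self.in_rework + self.done

    @property
    def total_data_rows(self) -> int:
        # total_data_rows = to_label + in_review + in_rework + done
        return self.to_label + self.labeled


overview = Overview(to_label=10, in_review=3, in_rework=2, skipped=1, done=5, issues=0)
print(overview.labeled, overview.total_data_rows)
```

A check like this can serve as a sanity test on values read from a project overview, since both sums should hold for any consistent set of counts.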
Use project.get_mal_prediction_imports() to retrieve the list of MAL import jobs.
Use project.get_label_imports() to retrieve the list of ground-truth import jobs.
Use MALPredictionImport.delete() to delete a MAL import.
The MALPredictionImport.delete() method can only delete MAL imports. To delete a ground-truth label, use the Data Rows tab on the web platform. Deleting an import is permanent and can’t be undone.
Use project.labeling_parameter_overrides to get a list of labeling parameter overrides (LPOs), which define the priority for each label in the override list. Use set_labeling_parameter_overrides and update_data_row_labeling_priority to modify data row priority.
Use project.get_label_count() to return the sum of labels in the different task queues of a project.
Use the client.send_to_annotate_from_catalog method with the Labelbox client.
Send to Annotate does not currently support consensus projects.
source_project_id will need to be provided:
source_project_id
annotation_ontology_mapping
{"<source_feature_schema_id>" : "<destination_feature_schema_id>"}
exclude_data_rows_in_project
override_existing_annotations_rule
ConflictResolutionStrategy.KeepExisting
ConflictResolutionStrategy.KeepExisting
ConflictResolutionStrategy.OverrideWithPredictions
ConflictResolutionStrategy.OverrideWithAnnotations
param batch_priority
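The ontology mapping shown above pairs each source feature schema ID with a destination feature schema ID. One way to build such a dictionary is sketched below. Matching features by name is an assumption made for illustration, map_feature_schemas is a hypothetical helper, and the IDs in the usage lines are placeholders.

```python
from typing import Dict


def map_feature_schemas(
    source: Dict[str, str], destination: Dict[str, str]
) -> Dict[str, str]:
    """Build {source_feature_schema_id: destination_feature_schema_id}.

    Both inputs map a feature name to its feature schema ID. Features that
    exist only in the source ontology are left out of the mapping.
    """
    return {
        source_id: destination[name]
        for name, source_id in source.items()
        if name in destination
    }


# Placeholder IDs for illustration only.
source_features = {"car": "src-1", "person": "src-2", "tree": "src-3"}
destination_features = {"car": "dst-9", "person": "dst-8"}
mapping = map_feature_schemas(source_features, destination_features)
print(mapping)
```

Features without a counterpart in the destination ontology are skipped here; whether that is acceptable depends on how unmapped annotations should be handled in the destination project.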