Complete list of definitions for the fields included in exports.
Labelbox exports give you more flexibility and control to retrieve the most valuable information from your projects. In the Data Rows tab, you can select and export the subset of data rows that interests you most, based on predefined or new parameters. You can also export more detailed information from these data rows and include or exclude relevant attributes in your export. The annotation formats have also been simplified and standardized.
For more details on how to create exports, along with complete samples, please view the following pages:
Below is a glossary of the fields that can appear in an export.
Export fields
Labelbox exports align with the data-row-centric paradigm, so every line, regardless of the export type, includes information on the data row it represents.
Field | Included | Description |
---|---|---|
data_row | Always | A dictionary containing the fields explained in the data_row table below. |
media_attributes | Optional | See Media attributes |
attachments | Optional | See Attachments |
metadata_fields | Optional | See Metadata |
embeddings | Optional | A list of dictionaries containing precomputed and custom embeddings |
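As a minimal sketch of how these top-level fields might be read, the example below assumes the export has been saved as newline-delimited JSON, with one data row per line. The sample lines, IDs, URLs, and the helper names `read_export_lines` and `summarize_row` are hypothetical and not part of the Labelbox SDK.

```python
import json

# Hypothetical sample export: two lines, one JSON object per data row.
SAMPLE_EXPORT = "\n".join([
    json.dumps({
        "data_row": {"id": "dr-1", "row_data": "https://example.com/a.jpg"},
        "metadata_fields": [],
    }),
    json.dumps({
        "data_row": {
            "id": "dr-2",
            "global_key": "key-2",
            "row_data": "https://example.com/b.jpg",
        },
    }),
])


def read_export_lines(text):
    """Parse each non-empty line as one JSON object."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize_row(row):
    """Collect the always-present fields plus whichever optional ones exist."""
    data_row = row["data_row"]  # documented as always included
    return {
        "id": data_row["id"],
        "row_data": data_row["row_data"],
        # Fields marked "If utilized" or "Optional" may be absent, so use .get().
        "global_key": data_row.get("global_key"),
        "has_metadata": "metadata_fields" in row,
    }


rows = [summarize_row(r) for r in read_export_lines(SAMPLE_EXPORT)]
```

Reading optional fields with `.get()` or a membership check avoids a `KeyError` when an export was created without them.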
data_row
Field | Included | Description |
---|---|---|
id | Always | The ID of the data row. |
global_key | If utilized | The global key of the data row. |
external_id | If utilized | The external ID of the data row. |
row_data | Always | The URL to your cloud-hosted file |
details | Optional | A dictionary containing the fields explained in the details table below. When exporting through the SDK, these fields are included by setting the data_row_details parameter to True. |
details
Field | Included | Description |
---|---|---|
dataset_id | Always | The ID of the dataset to which the data row belongs. |
created_at | Always | A timestamp that indicates when the data row was created. |
updated_at | Always | A timestamp indicating when the data row was most recently updated. |
last_activity_at | Always* | A timestamp indicating the last time any activity was performed on the data row. |
created_by | Always | The email address of the user that created the data row. |
* The last_activity_at field is not included in exports from a model run.
projects
The projects field contains a dictionary in which the keys are project IDs and the values consist of the fields explained below.
In a project-based export, this dictionary contains only a single project ID. When exporting from the Catalog, however, a data row may have been labeled in multiple projects, so the dictionary may have multiple keys.
Project IDs are used in favor of project names in order to enforce uniqueness.
Field | Included | Description |
---|---|---|
project_name | Always | The name of the project in which the data row was labeled. |
labels | Always | Contains a list of dictionaries comprised of the fields explained in the labels table below. |
project_details | Optional | A dictionary containing the fields explained in the project_details table below. |
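The sketch below illustrates one way to walk the projects dictionary and gather label IDs per project. It assumes `projects` sits at the top level of each exported row, next to `data_row`; the sample row and the helper name `labels_by_project` are hypothetical.

```python
# Hypothetical exported row; all IDs and names are placeholders.
row = {
    "data_row": {"id": "dr-1", "row_data": "https://example.com/a.jpg"},
    "projects": {
        "proj-1": {
            "project_name": "Fruit",
            "labels": [
                {"label_kind": "Default", "version": "1.0.0", "id": "lbl-1",
                 "annotations": {"objects": [], "classifications": []}},
                {"label_kind": "Default", "version": "1.0.0", "id": "lbl-2",
                 "annotations": {"objects": [], "classifications": []}},
            ],
        },
        # Two projects may share a name, which is why the keys are project IDs.
        "proj-2": {"project_name": "Fruit", "labels": []},
    },
}


def labels_by_project(exported_row):
    """Map each project ID to its project name and the IDs of its labels."""
    result = {}
    for project_id, project in exported_row.get("projects", {}).items():
        label_ids = [label["id"] for label in project["labels"]]
        result[project_id] = (project["project_name"], label_ids)
    return result


summary = labels_by_project(row)
```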
labels
Field | Included | Description |
---|---|---|
label_kind | Always | For labels made on assets of most media types, the value is Default. For frame-based assets, the value is Video. |
version | Always | Used to track updates made to export formats. At present, the value will always be 1.0.0. |
id | Always | The ID of the label. |
label_details | Optional | A dictionary containing the fields explained in the label_details table below. |
performance_details | Optional | A dictionary containing the fields explained in the performance_details table below. |
annotations | Always | See the annotation export formats broken down by asset type, beginning here with images. |
label_details
Field | Included | Description |
---|---|---|
created_at | Always | A timestamp indicating when the label was created. |
updated_at | Always | A timestamp indicating when the label was most recently updated. |
created_by | Always | The email address of the user that created the label. |
reviews | Always | [Legacy] For projects using Workflows for review, please use workflow_history here. Information on the thumbs up/down reviews created on this label. Contains a list of dictionaries comprised of the fields explained in the reviews table below. |
reviews (for legacy only)
Field | Included | Description |
---|---|---|
reviewed_at | Situational | A timestamp that indicates when the review was created. |
reviewed_by | Situational | The email address of the user that created the review. |
review_action | Situational | The type of review created; either Approve or Reject. |
performance_details
Field | Included | Description |
---|---|---|
seconds_to_create | Always | The number of seconds spent creating the label. After label submission, there could be a 30-minute delay before this field gets updated. |
seconds_to_review | Always | The number of seconds spent reviewing the label. After a review is completed, there could be a 30-minute delay before this field gets updated. |
skipped | Always | A boolean value expressing whether or not the asset was skipped. A value of true indicates the asset was skipped. |
benchmark_reference_label | Situational | The ID of the "gold standard" benchmark label to which this label is compared. |
benchmark_score | Situational | The agreement score between the label and the associated benchmark label. After label submission or updates, there could be a 30-minute delay before this field gets updated. |
consensus_score | Situational | The agreement score between the label and the associated consensus labels made on the same data row. After label submission or updates, there could be a 30-minute delay before this field gets updated. |
consensus_label_count | Situational | The number of labels created on this data row in this project. |
consensus_labels | Situational | The IDs of the labels created on this data row in this project. |
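As an illustration of how these fields might be used, the sketch below flags labels whose benchmark score falls under a threshold. It assumes scores are numeric on a 0 to 1 scale, which the table above does not state; the sample labels and the helper name `low_benchmark_labels` are hypothetical.

```python
# Hypothetical labels; only the fields needed for the example are shown.
labels = [
    {"id": "lbl-1", "performance_details": {
        "seconds_to_create": 42, "seconds_to_review": 5, "skipped": False,
        "benchmark_score": 0.55}},
    {"id": "lbl-2", "performance_details": {
        "seconds_to_create": 30, "seconds_to_review": 0, "skipped": False,
        "benchmark_score": 0.95}},
    {"id": "lbl-3", "performance_details": {
        "seconds_to_create": 3, "seconds_to_review": 0, "skipped": True}},
    {"id": "lbl-4"},  # exported without the optional performance_details
]


def low_benchmark_labels(label_list, threshold):
    """Return IDs of non-skipped labels scoring below the threshold."""
    flagged = []
    for label in label_list:
        perf = label.get("performance_details")
        if perf is None or perf["skipped"]:
            continue
        # benchmark_score is situational, so it may be missing.
        score = perf.get("benchmark_score")
        if score is not None and score < threshold:
            flagged.append(label["id"])
    return flagged


flagged = low_benchmark_labels(labels, threshold=0.8)
```

Because the score fields can lag label submission by up to 30 minutes, a missing or stale score right after submission is not necessarily an error.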
annotations
Field | Included | Description |
---|---|---|
objects | Always | Tool annotations (e.g., bounding boxes, masks, polygons) |
classifications | Always | Classification annotations (radio, checklist) |
frames | Videos only | Dictionary with per-frame annotations (objects and classifications) |
segments | Videos only | A dictionary where each key is a feature_id, and the corresponding value is a list of frame numbers representing the range of frames where that feature exists. |
key_frame_feature_map | Videos only | A dictionary where each key is a feature_id, and the corresponding value is a list of frame numbers representing the frames where that feature exists. |
feature_id | Always | Unique identifier for an annotation in a label; this id is also present on all classification answers. |
feature_schema_id | Always | Unique identifier for an ontology tool or classification; this id is also present on all classification answers. |
annotation_kind | Situational | The kind of tool utilized by the annotation (e.g. ImageBoundingBox, ImagePolyline) |
name | Always | Name given to the tool or classification |
value | Always | Normalized name of the tool or classification: if a user creates a tool named "Apple Pear," this field will show the normalized version "apple_pear." |
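A short sketch of how the annotation fields could be consumed: it counts tool annotations by their normalized `value`. It assumes each entry in `objects` carries the per-annotation fields listed above (`feature_id`, `feature_schema_id`, `name`, `value`); the sample data and the helper name `count_objects_by_value` are hypothetical.

```python
from collections import Counter

# Hypothetical annotations dictionary for a non-video asset.
annotations = {
    "objects": [
        {"feature_id": "f1", "feature_schema_id": "s1",
         "name": "Apple Pear", "value": "apple_pear"},
        {"feature_id": "f2", "feature_schema_id": "s1",
         "name": "Apple Pear", "value": "apple_pear"},
        {"feature_id": "f3", "feature_schema_id": "s2",
         "name": "Car", "value": "car"},
    ],
    "classifications": [],
}


def count_objects_by_value(annotation_dict):
    """Count tool annotations, grouped by their normalized name."""
    return Counter(obj["value"] for obj in annotation_dict["objects"])


counts = count_objects_by_value(annotations)
```

Grouping on `value` or `feature_schema_id` rather than `name` keeps the counts stable if display names differ only in casing or spacing.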
project_details
Field | Included | Description |
---|---|---|
ontology_id | Always | The ID of the ontology connected to the project. |
task_id | Always | The ID of the task that the data row is currently in within the project. |
task_name | Always | The name of the task that the data row is currently in within the project. |
batch_id | Always | The ID of the batch in which the data row was sent to the project. |
batch_name | Always | The name of the batch in which the data row was sent to the project. |
workflow_status | Always | The status of the data row in the project (either TO_LABEL, IN_REWORK, IN_REVIEW, or DONE) |
priority | Always | The priority assigned to the batch. |
selected_label_id | If utilized | The ID of the label that was selected as the "winner" amongst the labels made on the data row. |
consensus_expected_label_count | Always | The number of labels that were expected to be created on this data row according to the consensus settings. |
workflow_history | Always | Information on the progression of the labeled data row through the project's workflow. Contains a list of dictionaries comprised of the fields explained in the workflow_history table below. |
workflow_history
Field | Included | Description |
---|---|---|
action | Always | The action that was performed on the data row in this specific step. Typical actions are: Move (change of task queue); Accept or Reject (selection of a review score); Rework (follows Reject). |
created_at | Always | A timestamp that indicates when this action on the data row occurred. |
created_by | Always | The email address of the user that performed this action. |
previous_task_name | Situational | The name of the workflow task in which the data row began this action. |
previous_task_id | Situational | The ID of the workflow task in which the data row began this action. |
next_task_name | Situational | The name of the workflow task in which the data row concluded this action. |
next_task_id | Situational | The ID of the workflow task in which the data row concluded this action. |
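The following sketch finds the most recent step in a workflow history. It assumes `created_at` is an ISO 8601 string with an explicit UTC offset, which the table above does not specify; the sample history, task names, and the helper name `latest_action` are hypothetical.

```python
from datetime import datetime

# Hypothetical workflow history for one data row.
history = [
    {"action": "Move", "created_at": "2024-01-01T10:00:00+00:00",
     "created_by": "labeler@example.com",
     "previous_task_name": "Initial labeling task",
     "next_task_name": "Initial review task"},
    {"action": "Rework", "created_at": "2024-01-02T09:31:00+00:00",
     "created_by": "reviewer@example.com",
     "previous_task_name": "Initial review task",
     "next_task_name": "Rework"},
    {"action": "Reject", "created_at": "2024-01-02T09:30:00+00:00",
     "created_by": "reviewer@example.com"},
]


def latest_action(workflow_history):
    """Return the step with the newest created_at, or None if there are none."""
    if not workflow_history:
        return None
    return max(
        workflow_history,
        # Parse the timestamp so ordering does not depend on list order.
        key=lambda step: datetime.fromisoformat(step["created_at"]),
    )


latest = latest_action(history)
```

The task name and ID fields are situational, so they are read only when present; here the Reject step carries none of them.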
experiments
The experiments field contains a dictionary in which the keys are model experiment IDs, and the values consist of the name and a dictionary of model runs.
In a model-based export, this dictionary contains only a single model ID. When exporting from the Catalog, however, a data row may be included in multiple models, so the dictionary may have multiple keys.
Model experiment IDs are used in favor of model names to enforce uniqueness.
Field | Included | Description |
---|---|---|
name | Always | The name of the model in which the data row appears. |
runs | Always | A dictionary where the keys are the IDs of the model runs in which the data row appears. |
runs
The model runs field contains a dictionary in which the keys are model run IDs, and the values consist of the fields explained below.
Model run IDs are used in favor of model run names to enforce uniqueness.
Field | Included | Description |
---|---|---|
name | Always | The name of the model run. |
run_data_row_id | Always | A unique ID for the data row in the context of the model run. |
labels | Situational | Ground truth annotations can optionally be sent to a model run as labels. If present, the labels will appear in the same format as in a project-based export, which is detailed in the labels table above, though the optional label_details and performance_details sections are always excluded in the context of a model run. |
predictions | Optional | A dictionary containing the fields explained in the predictions table below. |
split | Situational | The value will be included if the data row is assigned to a split. The potential outputs are Training, Validation, or Test. When exporting through the SDK, these fields are included by setting the model_run_details parameter to True. |
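To show how the nested experiments and runs dictionaries fit together, the sketch below groups model run IDs by split for a single exported row. It assumes `experiments` sits at the top level of the row; the sample data, the `"unassigned"` placeholder, and the helper name `runs_by_split` are hypothetical.

```python
from collections import defaultdict

# Hypothetical exported row that appears in two model runs.
row = {
    "data_row": {"id": "dr-1", "row_data": "https://example.com/a.jpg"},
    "experiments": {
        "exp-1": {
            "name": "Fruit detector",
            "runs": {
                "run-1": {"name": "baseline", "run_data_row_id": "rdr-1",
                          "split": "Training"},
                "run-2": {"name": "retrained", "run_data_row_id": "rdr-2"},
            },
        },
    },
}


def runs_by_split(exported_row):
    """Group model run IDs by split; runs with no split go under 'unassigned'."""
    grouped = defaultdict(list)
    for experiment in exported_row.get("experiments", {}).values():
        for run_id, run in experiment["runs"].items():
            # split is situational: present only if the row was assigned one.
            grouped[run.get("split", "unassigned")].append(run_id)
    return dict(grouped)


splits = runs_by_split(row)
```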
predictions
Field | Included | Description |
---|---|---|
label_kind | Always | For labels made on assets of most media types, the value is Default. For frame-based assets, the value is Video. |
version | Always | Used to track updates made to export formats. At present, the value will always be 1.0.0. |
id | Always | The ID of the set of predictions. |
annotations | Always | See the annotation export formats broken down by asset type, beginning here with images. |