Data Row

A developer guide for creating and modifying data rows via the Python SDK.

Client

import labelbox as lb
client = lb.Client(api_key="<YOUR_API_KEY>")

Get a data row

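# get a data row by its ID
data_row = client.get_data_row("<data_row_id>")

# get a data row by its global key
data_row = client.get_data_row_by_global_key("key1")

# look up data row IDs for a list of global keys
data_row_ids = client.get_data_row_ids_for_global_keys(["key1", "key2"])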

Assign global keys

global_key_data_row_inputs = [
  {"data_row_id": "<data_row_id>", "global_key": "key1"},
  {"data_row_id": "<data_row_id>", "global_key": "key2"}
]

client.assign_global_keys_to_data_rows(global_key_data_row_inputs)
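
When the mapping from data row IDs to global keys already lives in a dictionary, the input list can be built programmatically. The following is a minimal sketch: `build_global_key_inputs` is a hypothetical helper (not part of the SDK), and the IDs are placeholders. It rejects duplicate global keys up front, on the assumption that each global key should be unique.

```python
from typing import Dict, List


def build_global_key_inputs(id_to_key: Dict[str, str]) -> List[Dict[str, str]]:
    """Hypothetical helper: convert {data_row_id: global_key} into the
    list-of-dicts shape used above, refusing duplicate global keys."""
    seen = set()
    inputs = []
    for data_row_id, global_key in id_to_key.items():
        if global_key in seen:
            raise ValueError(f"duplicate global key: {global_key!r}")
        seen.add(global_key)
        inputs.append({"data_row_id": data_row_id, "global_key": global_key})
    return inputs


# Placeholder IDs, for illustration only
global_key_data_row_inputs = build_global_key_inputs(
    {"<data_row_id_1>": "key1", "<data_row_id_2>": "key2"}
)
print(global_key_data_row_inputs)
```

The resulting list has the same shape as the hand-written one and could be passed to client.assign_global_keys_to_data_rows() in the same way.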

Clear global keys

client.clear_global_keys(["key1", "key2"])

Fundamentals

Create data rows

Data rows are created via methods of the Dataset class. For complete details and additional examples of creating data rows, see Dataset.

The only required argument when creating a data row is row_data. However, Labelbox strongly recommends supplying a global key for each data row upon creation.

# this example uses the uuid package to generate unique global keys
from uuid import uuid4

dataset.create_data_rows(
  [
    {
      "row_data": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg",
      "global_key": str(uuid4())
    },
    {
      "row_data": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg",
      "global_key": str(uuid4())
    }
  ]
)
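
For longer lists of assets, the payload can be generated instead of written out by hand. This is a minimal sketch: `build_data_row_payloads` is a hypothetical helper name, and it simply pairs each URL with a freshly generated UUID as its global key, as in the example above.

```python
from typing import Dict, Iterable, List
from uuid import uuid4


def build_data_row_payloads(urls: Iterable[str]) -> List[Dict[str, str]]:
    """Hypothetical helper: one payload dict per URL, each with a unique global key."""
    return [{"row_data": url, "global_key": str(uuid4())} for url in urls]


sample_url = "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg"
payloads = build_data_row_payloads([sample_url, sample_url])

# Every payload carries its own global key, even when the URL repeats
print(len({p["global_key"] for p in payloads}))
```

The list in `payloads` has the same shape as the argument to dataset.create_data_rows() shown above.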

Export data rows

data_rows = dataset.export_data_rows()

# optionally, you can include metadata in the export
data_rows = dataset.export_data_rows(include_metadata=True)

Methods

Create an attachment

| Type | Value | Description |
| --- | --- | --- |
| IMAGE | URL of an image (PNG/JPG) (HTTPS or IAM delegated access path) | Labelers can see the attached image(s) while labeling the primary data row. |
| VIDEO | URL of a video (MP4) (HTTPS or IAM delegated access path) | Labelers can see the attached video(s) while labeling the primary data row. |
| RAW_TEXT | Text string or hyperlink | Labelers can see the attached text or hyperlink while labeling the primary data row. If passing a hyperlink, it will be clickable. If you want to display text from a URL endpoint, please use the TEXT_URL type. |
| TEXT_URL | URL of a text file (HTTPS or IAM delegated access path) | Labelers can see the attached text from the linked text URL while labeling the primary data row. |
| HTML | URL of an HTML file (HTTPS or IAM delegated access path) | Renders HTML in an iframe as an attachment. Labelers can see and interact with the attached HTML widget while labeling the primary data row. |
| IMAGE_OVERLAY | URL of the image layer (PNG/JPG) (HTTPS or IAM delegated access path) | A visualization tool designed to help you view images in different ways by adding layers over the asset. Image overlays can only be attached to image assets. |

For details on creating attachments in the same step as creating data rows, see Dataset.

data_row.create_attachment(
  attachment_type="<attachment_type>",    # specify a type from the table above
  attachment_value="<attachment_value>",  # provide a value of the appropriate type
  attachment_name="<attachment_name>"     # name the attachment for reference
)
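
Since the attachment type is passed as a plain string, it can help to check it locally before making the call. The sketch below is an illustration only: `ATTACHMENT_TYPES` and `attachment_kwargs` are hypothetical names, and the set of types is copied from the table in this guide, so it may not match everything a given SDK version accepts.

```python
from typing import Dict

# Types listed in the table in this guide (assumption: the SDK may accept others)
ATTACHMENT_TYPES = {"IMAGE", "VIDEO", "RAW_TEXT", "TEXT_URL", "HTML", "IMAGE_OVERLAY"}


def attachment_kwargs(attachment_type: str, attachment_value: str,
                      attachment_name: str) -> Dict[str, str]:
    """Hypothetical helper: normalize and validate the type, then return
    keyword arguments matching the create_attachment() call shown above."""
    normalized = attachment_type.strip().upper()
    if normalized not in ATTACHMENT_TYPES:
        raise ValueError(f"unknown attachment type: {attachment_type!r}")
    return {
        "attachment_type": normalized,
        "attachment_value": attachment_value,
        "attachment_name": attachment_name,
    }


kwargs = attachment_kwargs("raw_text", "Reviewer notes for this asset", "notes")
print(kwargs["attachment_type"])
```

The returned dictionary could then be unpacked into the call, as in data_row.create_attachment(**kwargs).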

Get the winning label ID

For more details on what a "winning" label is and how it is chosen, see Consensus.

data_row.get_winning_label_id(project_id="<project_id>")

Update a data row

You can update any of the row_data, global_key, or external_id for an existing data row.

data_row.update(
  row_data="<new_row_data>",
  global_key="<new_unique_global_key>",
  # external IDs are soon to be deprecated; use global keys instead
  external_id="new_external_id"
)

Delete data rows

❗️

Deleting data rows cannot be undone

These methods delete data rows along with all labels made on each data row. This action cannot be reverted without the assistance of Labelbox support.

# delete one data row
data_row.delete()

# bulk delete data rows -- takes a list of DataRow objects
# (data_row_1 and data_row_2 are placeholders for DataRow objects you have already fetched)
lb.DataRow.bulk_delete(data_rows=[data_row_1, data_row_2])

# for example, delete the data rows in a dataset, but not the Dataset object
dataset = client.get_dataset("<dataset_id>")
lb.DataRow.bulk_delete(data_rows=list(dataset.data_rows()))

📘

Limit on bulk deleting data rows

The lb.DataRow.bulk_delete() method can delete a maximum of 4,000 data rows per call.
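
One way to stay under this limit is to split the list into batches before deleting. The following is a minimal sketch, not an SDK feature: `chunked` and `delete_in_batches` are hypothetical helpers, and the delete call is injected as a function so the batching logic can be tried out without touching real data.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
MAX_BULK_DELETE = 4000  # per-call limit stated above


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def delete_in_batches(data_rows: Sequence[T],
                      delete_fn: Callable[[List[T]], object],
                      batch_size: int = MAX_BULK_DELETE) -> int:
    """Call `delete_fn` once per batch and return the number of calls made."""
    calls = 0
    for batch in chunked(data_rows, batch_size):
        delete_fn(batch)
        calls += 1
    return calls


# Dry run with a stand-in for the real delete call
deleted = []
num_calls = delete_in_batches(list(range(9000)), deleted.extend)
print(num_calls)  # 3 batches: 4000 + 4000 + 1000
```

With real data, the stand-in could be replaced by something like `lambda batch: lb.DataRow.bulk_delete(data_rows=batch)`, applied to `list(dataset.data_rows())`. The same warning applies: each batch is deleted permanently.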


Attributes

Get the basics

# global key (str)
data_row.global_key

# external ID (str) -- soon to be deprecated; use global keys instead
data_row.external_id

# row data (str)
data_row.row_data

# media attributes (dict)
data_row.media_attributes

# updated at (datetime)
data_row.updated_at

# created at (datetime)
data_row.created_at

# created by (relationship to User object)
user = data_row.created_by()

# organization (relationship to Organization object)
organization = data_row.organization()

# dataset (relationship to Dataset object)
dataset = data_row.dataset()
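
For logging or debugging, the plain attributes above can be collected into a single dictionary. This is a rough sketch: `summarize_data_row` and `BASIC_FIELDS` are hypothetical names, and a stand-in object is used in place of a real data row so the snippet runs without an API key.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Attribute names taken from the list above (relationships are left out)
BASIC_FIELDS = ("global_key", "external_id", "row_data",
                "media_attributes", "created_at", "updated_at")


def summarize_data_row(data_row: Any) -> Dict[str, Any]:
    """Hypothetical helper: read each basic attribute, using None if it is absent."""
    return {name: getattr(data_row, name, None) for name in BASIC_FIELDS}


# Stand-in object for illustration; a real data row would come from the client
fake_row = SimpleNamespace(global_key="key1",
                           row_data="https://example.com/image.jpg")
summary = summarize_data_row(fake_row)
print(summary["global_key"])
```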

Get the attachments

# relationship to many AssetAttachment objects
attachments = data_row.attachments()

# inspect one attachment
next(attachments)

# inspect all attachments
for attachment in attachments:
  print(attachment)

# for ease of use, you can convert the paginated collection to a list
list(attachments)

Get the metadata

# get the metadata fields associated with the data row (list)
data_row.metadata_fields

# get the metadata fields as DataRowMetadataField objects (list)
data_row.metadata

Get the labels

# relationship to many Label objects
labels = data_row.labels()

# inspect one label made on the data row
next(labels)

# inspect all labels made on the data row 
for label in labels:
  print(label)

# for ease of use, you can convert the paginated collection to a list
list(labels)