Create Metadata fields

You can create metadata fields from the metadata schemas available in your organization. See the Metadata in Schema section for details.

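import datetime
from uuid import uuid4
from labelbox.schema.data_row_metadata import DataRowMetadata, DataRowMetadataField, DeleteDataRowMetadata

# Fetch the metadata schema ontology. A Labelbox workspace has a single metadata ontology.
metadata_ontology = client.get_data_row_metadata_ontology()
# List all available fields
metadata_ontology.fields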

To construct a metadata field, provide the Schema Id of the field and the value to upload. You can either construct a DataRowMetadataField object or specify the Schema Id and value as a dictionary.

# Construct a metadata field of string kind
tag_schema = metadata_ontology.reserved_by_name["tag"]
tag_metadata_field = DataRowMetadataField(
    schema_id=tag_schema.uid,  # specify the schema id
    value="tag_string", # typed inputs
)

# Construct a metadata field of datetime kind
datetime_schema = metadata_ontology.reserved_by_name["captureDateTime"]
capture_datetime_field = DataRowMetadataField(
    schema_id=datetime_schema.uid,  # specify the schema id
    value=datetime.datetime.utcnow(), # typed inputs
)

# Construct a metadata field from an enum option
train_schema = metadata_ontology.reserved_by_name["split"]["train"]
split_metadta_field = DataRowMetadataField(
    schema_id=train_schema.parent,  # schema id of the enum field ("split")
    value=train_schema.uid,  # schema id of the selected option ("train")
)

# Custom fields must be created in the UI before they can be referenced here
custom_field = metadata_ontology.custom_by_name["my-custom-field"]
custom_metadata_field = DataRowMetadataField(
    schema_id=custom_field.uid,  # specify the schema id
    value="custom_field_value",  # typed inputs
)
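
The enum case is the least obvious one: the field's `schema_id` is the Schema Id of the enum itself (the option's `parent`), and the value is the Schema Id of the chosen option. The sketch below illustrates that mapping in the dictionary format, using a stand-in dataclass and made-up Ids instead of a live ontology. `OptionSchema`, `enum_field_dict`, and the Id strings are hypothetical and only mirror the `uid` and `parent` attributes used in the listing above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OptionSchema:
    """Stand-in for an ontology schema entry (hypothetical, not the SDK class)."""
    uid: str                      # Schema Id of this entry
    name: str
    parent: Optional[str] = None  # Schema Id of the enum this option belongs to


def enum_field_dict(option: OptionSchema) -> dict:
    """Build the dictionary-format field for an enum option."""
    if option.parent is None:
        raise ValueError(f"{option.name!r} is not an enum option (no parent)")
    # The enum's Schema Id identifies the field; the option's Id is the value.
    return {"schema_id": option.parent, "value": option.uid}


# Made-up Ids for illustration only.
train = OptionSchema(uid="option-train-id", name="train", parent="enum-split-id")
field = enum_field_dict(train)
print(field)
```

Passing a schema without a `parent` raises a `ValueError` in this sketch, which is one way to catch the mistake of using the enum's own Id as the value.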

Upload Data Rows with metadata

dataset = client.create_dataset(name="Bulk import example")

data_row = {"row_data": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg", "external_id": str(uuid4())}

# Option 1: Specify metadata with a list of DataRowMetadataField.
# This is the recommended option since it comes with validation for metadata fields.
data_row['metadata_fields'] = [tag_metadata_field, capture_datetime_field, split_metadta_field]

# Option 2: Alternatively, specify the metadata fields in dictionary format
# without declaring DataRowMetadataField objects.
data_row['metadata_fields'] = [
    {"schema_id": tag_schema.uid, "value": "tag_string"},
    {"schema_id": datetime_schema.uid, "value": datetime.datetime.utcnow()},
    {"schema_id": train_schema.parent, "value": train_schema.uid},
]

task = dataset.create_data_rows([data_row])
task.wait_till_done()
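
Because `create_data_rows` takes a list, the same pattern extends to many Data Rows. The following is a minimal sketch of assembling such a list in the dictionary format before uploading. `build_data_rows`, the example URLs, and the Schema Id string are hypothetical, and the upload call itself is left out so the snippet runs without a Labelbox client.

```python
from uuid import uuid4


def build_data_rows(urls, tag_schema_id, tag_value):
    """Return one data row dict per URL, each carrying the same tag metadata."""
    rows = []
    for url in urls:
        rows.append({
            "row_data": url,
            "external_id": str(uuid4()),
            # Dictionary format: one entry per metadata field.
            "metadata_fields": [
                {"schema_id": tag_schema_id, "value": tag_value},
            ],
        })
    return rows


# Placeholder URLs and Schema Id, for illustration only.
urls = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
rows = build_data_rows(urls, tag_schema_id="tag-schema-id", tag_value="batch-1")
print(len(rows), rows[0]["metadata_fields"])
```

The resulting list has the same shape as the single `data_row` dictionary in the listing above, so it could be handed to `dataset.create_data_rows(rows)` in the same way.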

Get metadata

data_row = next(dataset.data_rows())
print(data_row.metadata_fields)

Export metadata

# Export metadata from a dataset
data_rows = dataset.export_data_rows(include_metadata=True)


# Export metadata from a list of data row ids.
metadata_ontology = client.get_data_row_metadata_ontology()
metadata = metadata_ontology.bulk_export([data_row.uid])
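
Exported metadata is often easier to work with once it is indexed by Data Row. The sketch below shows one way to do that, assuming each exported item exposes `data_row_id` and a list of `fields` with `schema_id` and `value`, as the `DataRowMetadata` and `DataRowMetadataField` objects elsewhere on this page do. `FieldStub`, `MetadataStub`, `index_by_data_row`, and the Ids are hypothetical stand-ins so the snippet runs without a Labelbox client.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FieldStub:
    """Stand-in mirroring the schema_id/value pair of a metadata field."""
    schema_id: str
    value: Any


@dataclass
class MetadataStub:
    """Stand-in mirroring an exported metadata item for one data row."""
    data_row_id: str
    fields: List[FieldStub] = field(default_factory=list)


def index_by_data_row(exported):
    """Map data row id -> {schema id: value} for quick lookups."""
    return {
        md.data_row_id: {f.schema_id: f.value for f in md.fields}
        for md in exported
    }


# Made-up export result for illustration only.
exported = [
    MetadataStub("row-1", [FieldStub("tag-schema-id", "tag_string")]),
    MetadataStub("row-2", []),
]
lookup = index_by_data_row(exported)
print(lookup["row-1"]["tag-schema-id"])
```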

Update or add metadata of existing Data Rows

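Labelbox supports upserting metadata for a single Data Row or for many Data Rows in bulk. Existing metadata is overwritten on a per-field basis: each field included in the upsert replaces the current value of that field.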

tag_schema = metadata_ontology.reserved_by_name["tag"]

# Construct a string field
field = DataRowMetadataField(
    schema_id=tag_schema.uid,  # specify the schema id
    value="updated", # typed inputs
)

# Completed object ready for import
metadata_payload = DataRowMetadata(
    data_row_id="DATAROW_ID",  # DataRow Id not ExternalId
    fields=[field]
)

# Provide a list of DataRowMetadata objects to upload
metadata_ontology.bulk_upsert([metadata_payload])
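
To make the per-field overwrite concrete, here is a small model of the behavior in plain Python. It is not how Labelbox implements upserts. `apply_upsert` and the Schema Id keys are hypothetical, and the sketch assumes that fields absent from the upsert keep their current values.

```python
def apply_upsert(existing, upsert_fields):
    """Model a per-field upsert; `existing` maps schema id -> value."""
    merged = dict(existing)  # copy so the input is not mutated
    for field in upsert_fields:
        # A field in the upsert replaces the stored value or adds a new one.
        merged[field["schema_id"]] = field["value"]
    return merged


# Made-up Ids and values for illustration only.
current = {"tag-schema-id": "original", "split-schema-id": "train-option-id"}
updated = apply_upsert(current, [{"schema_id": "tag-schema-id", "value": "updated"}])
print(updated)
```

In this model only the tag value changes; the split entry is carried over untouched because the upsert did not mention it.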

Delete metadata

# Specify the schemas to delete
schemas = [tag_schema]  # ...plus any other schemas to delete

# Create a delete object
deletes = DeleteDataRowMetadata(
    data_row_id=data_row.uid,  # Id of the Data Row to delete fields from
    fields=[s.uid for s in schemas]
)

# Pass a list of delete objects
metadata_ontology.bulk_delete([deletes])