
# Metadata

> Developer guide for creating, importing, exporting, and modifying metadata fields via the Python SDK.

<CardGroup cols={2}>
  <Card title="Open in Colab" icon="infinity" iconType="solid" horizontal href="https://colab.research.google.com/github/Labelbox/labelbox-notebooks/blob/main/basics/data_row_metadata.ipynb" />

  <Card title="GitHub" icon="github" iconType="solid" horizontal href="https://github.com/Labelbox/labelbox-notebooks/blob/main/basics/data_row_metadata.ipynb" />
</CardGroup>

## Metadata types

All metadata needs to be one of the following types:

| Type            | Description                                                                                                                                          | Enum value                       |
| --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- |
| `DateTime`      | An ISO 8601 datetime field. All times must be in UTC.                                                                                                | `DataRowMetadataKind.datetime`   |
| `Number`        | Floating-point value (max: 64-bit float).                                                                                                            | `DataRowMetadataKind.number`     |
| `String`        | Free text field. Max 4,096 characters.                                                                                                               | `DataRowMetadataKind.string`     |
| `Enum`          | Enum field with options. Multiple options can be imported.                                                                                           | `DataRowMetadataKind.enum`       |
| `Option (Enum)` | An option of an enum. Up to 64 options can be created per Enum type (128 for enterprise customers; this limit can be increased further upon request). | `DataRowMetadataKind.enumoption` |
| `Embedding`     | A 128-dimensional float32 vector used for similarity.                                                                                                | `DataRowMetadataKind.embedding`  |
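
The limits in the table can be checked locally before an upload. The following is a minimal sketch, not part of the Labelbox SDK: `check_metadata_value` is a hypothetical helper, and the kind names are plain strings chosen for illustration. It assumes the 4,096-character string limit, the 128-dimension embedding size, and the UTC requirement stated above, and it treats naive datetimes as already being in UTC.

```python
import datetime


def check_metadata_value(kind: str, value) -> list:
    """Return a list of problems; an empty list means no problem was found.

    Hypothetical client-side pre-check based on the limits in the table above.
    """
    problems = []
    if value is None:
        problems.append("value must not be None")
    elif kind == "string":
        if not isinstance(value, str):
            problems.append("string value must be a str")
        elif len(value) > 4096:
            problems.append("string value exceeds 4,096 characters")
    elif kind == "number":
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append("number value must be an int or float")
    elif kind == "datetime":
        if not isinstance(value, datetime.datetime):
            problems.append("datetime value must be a datetime.datetime")
        else:
            offset = value.utcoffset()
            # Naive datetimes (offset is None) are assumed to be UTC in this sketch
            if offset is not None and offset != datetime.timedelta(0):
                problems.append("datetime value must be in UTC")
    elif kind == "embedding":
        if not isinstance(value, (list, tuple)):
            problems.append("embedding value must be a list or tuple")
        elif len(value) != 128:
            problems.append("embedding value must contain exactly 128 numbers")
    else:
        problems.append(f"unknown kind: {kind}")
    return problems


print(check_metadata_value("string", "x" * 5000))
```

A check like this only catches obvious mistakes early; the server remains the authority on what is accepted.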

## Reserved fields

The following field names are reserved and cannot be used as custom metadata field names:

| Name               | Type       | Description                                                                               |
| ------------------ | ---------- | ----------------------------------------------------------------------------------------- |
| `tag`              | `String`   | The tags of the data row                                                                  |
| `split`            | `Enum`     | The split of the dataset that the data row belongs to, including `train`, `valid`, and `test` |
| `captureDateTime`  | `DateTime` | The timestamp when the data is captured                                                   |
| `skipNFrames`      | `Number`   | (Video data only) The number of frames to skip                                            |
| `turnInstructions` | `String`   | JSON string that contains instructions for each turn in a Multi-modal chat conversation.  |
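
Before creating custom schemas, it can help to screen proposed names against the reserved list. This is a minimal sketch using only the names in the table above; `find_reserved_names` is a hypothetical helper, and exact, case-sensitive matching is an assumption rather than documented behavior.

```python
# Reserved names copied from the table above
RESERVED_FIELD_NAMES = frozenset(
    {"tag", "split", "captureDateTime", "skipNFrames", "turnInstructions"}
)


def find_reserved_names(proposed_names):
    """Return the proposed names that collide with a reserved field name."""
    return [name for name in proposed_names if name in RESERVED_FIELD_NAMES]


print(find_reserved_names(["camera_id", "split", "weather"]))  # ['split']
```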

## Construct metadata fields

To construct a metadata field, you must provide the schema ID (or schema name) for the field and the value to upload. You can do this in two ways:

* Option 1: Specify the metadata using the *DataRowMetadataField* object (comes with validation for metadata fields)
* Option 2: Specify the metadata fields in *dictionary* format without declaring the DataRowMetadataField objects

The `value` attribute of a metadata field must not be `null` (`None` in Python).

<CodeGroup>
  ```python DataRowMetadataField object theme={null}
  import datetime

  from labelbox.schema.data_row_metadata import DataRowMetadataField

  # Assumes `metadata_ontology` was fetched with client.get_data_row_metadata_ontology()
  metadata_fields = []

  # Construct a metadata field of string kind
  tag_schema = metadata_ontology.get_by_name("tag")
  tag_metadata_field = DataRowMetadataField(
      schema_id=tag_schema.uid,  # specify the schema ID
      value="tag_string",  # typed input
  )
  metadata_fields.append(tag_metadata_field)

  # Construct a metadata field of datetime kind
  datetime_schema = metadata_ontology.get_by_name("captureDateTime")
  capture_datetime_field = DataRowMetadataField(
      name=datetime_schema.name,  # alternatively, specify the schema name
      value=datetime.datetime.utcnow(),  # typed input
  )
  metadata_fields.append(capture_datetime_field)

  # Construct metadata fields of Enum options. You can import multiple options.
  test_schema = metadata_ontology.get_by_name("split")["test"]
  test_schema_field = DataRowMetadataField(
      schema_id=test_schema.parent,  # schema ID of the parent Enum
      value=test_schema.uid,  # schema ID of the selected option
  )
  metadata_fields.append(test_schema_field)

  valid_schema = metadata_ontology.get_by_name("split")["valid"]
  valid_schema_field = DataRowMetadataField(
      schema_id=valid_schema.parent,  # schema ID of the parent Enum
      value=valid_schema.uid,  # schema ID of the selected option
  )
  metadata_fields.append(valid_schema_field)
  ```

  ```python Dictionary theme={null}
  import datetime

  # Assumes `metadata_ontology` was fetched with client.get_data_row_metadata_ontology()
  metadata_fields = []

  # Construct a metadata field of string kind
  tag_schema = metadata_ontology.get_by_name("tag")
  metadata_fields.append({"name": tag_schema.name, "value": "tag_value"})

  # Construct a metadata field of datetime kind
  datetime_schema = metadata_ontology.get_by_name("captureDateTime")
  metadata_fields.append({"name": datetime_schema.name, "value": datetime.datetime.utcnow()})

  # Construct metadata fields of Enum options. You can import multiple options.
  valid_schema = metadata_ontology.get_by_name("split")["valid"]
  metadata_fields.append({"schema_id": valid_schema.parent, "value": valid_schema.uid})

  test_schema = metadata_ontology.get_by_name("split")["test"]
  metadata_fields.append({"schema_id": test_schema.parent, "value": test_schema.uid})
  ```
</CodeGroup>
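
Because a metadata `value` should not be `null`, dictionary-format fields built from incomplete source records may need filtering before upload. The sketch below is one possible approach and is not part of the SDK: `split_null_fields` is a hypothetical helper that separates fields whose `value` is missing or `None` so they can be logged or repaired.

```python
def split_null_fields(metadata_fields):
    """Split dictionary-format fields into (kept, dropped) by whether value is set."""
    kept, dropped = [], []
    for field in metadata_fields:
        if field.get("value") is None:
            dropped.append(field)  # missing or None value: do not upload
        else:
            kept.append(field)
    return kept, dropped


fields = [
    {"name": "tag", "value": "tag_value"},
    {"name": "captureDateTime", "value": None},
]
kept, dropped = split_null_fields(fields)
print(kept)     # [{'name': 'tag', 'value': 'tag_value'}]
print(dropped)  # [{'name': 'captureDateTime', 'value': None}]
```

Values such as `0` or an empty string are kept, since only `None` is treated as null here.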

## Create custom metadata schema

<CodeGroup>
  ```python Python theme={null}
  import labelbox
  from labelbox.schema.data_row_metadata import DataRowMetadataKind

  client = labelbox.Client(api_key="LABELBOX_API_KEY")
  metadata_ontology = client.get_data_row_metadata_ontology()

  # create a custom metadata schema (set the kind value to a supported data type)

  metadata_schema = metadata_ontology.create_schema(name="metadata_name", kind=DataRowMetadataKind.string)

  # get the schema id

  schema_id = metadata_schema.uid
  ```
</CodeGroup>

## Upload data rows with metadata

Custom metadata field limits vary according to your subscription; for details, see [Limits](/docs/limits).

<CodeGroup>
  ```python Python theme={null}
  data_row = {
    "row_data": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg",
    "global_key": "metadata_tutorial",
    "metadata_fields": metadata_fields
  }

  dataset = client.create_dataset(name="Create data row with metadata")
  task = dataset.create_data_rows([data_row])
  task.wait_till_done()
  ```
</CodeGroup>

## Get metadata schema

<CodeGroup>
  ```python Python theme={null}
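  # Look up a schema by name
  metadata_schema = metadata_ontology.get_by_name("tag")

  # For an Enum schema, the lookup returns its options keyed by option name
  # (assumes an Enum schema named "enum_metadata_name" exists)
  enum_options = metadata_ontology.get_by_name("enum_metadata_name")

  # Check the schema and read its ID
  print(metadata_schema)
  schema_id = metadata_schema.uid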
  ```
</CodeGroup>

## Get metadata fields (ontology)

<CodeGroup>
  ```python Python theme={null}
  # Fetch the metadata schema ontology. A Labelbox workspace has a single metadata ontology.
  metadata_ontology = client.get_data_row_metadata_ontology()

  # List all available fields
  print(metadata_ontology.fields)
  ```
</CodeGroup>

## Get metadata fields

<CodeGroup>
  ```python Python theme={null}
  datarow = next(dataset.data_rows())
  for metadata_field in datarow.metadata_fields:
    print(metadata_field['name'], ":", metadata_field['value'])
  ```
</CodeGroup>

Result:

<CodeGroup>
  ```text Output theme={null}
  tag : custom_tag
  split : train
  captureDateTime : 2023-04-04T15:24:37.229417Z
  ```
</CodeGroup>

## Bulk export data rows with metadata

<CodeGroup>
  ```python Python theme={null}
  export_params = {
    "performance_details": True,
    "label_details": True,
    "metadata_fields": True
  }

  export_task = dataset.export(params=export_params)
  export_task.wait_till_done()

  # Stream the export using a callback function

  def json_stream_handler(output: labelbox.BufferedJsonConverterOutput):
    print(output.json)

  export_task.get_buffered_stream(stream_type=labelbox.StreamType.RESULT).start(stream_handler=json_stream_handler)

  # Collect all exported data into a list

  export_json = [data_row.json for data_row in export_task.get_buffered_stream()]
  ```
</CodeGroup>

Result:

<CodeGroup>
  ```json JSON theme={null}
  {
    "data_row": {
      "id": "clflxqzty07fj077qa3dd4v27",
      "global_key": "metadata_tutorial",
      "row_data": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/basic.jpg"
    },
    "media_attributes": {
      "height": 1285,
      "width": 2258,
      "mime_type": "image/jpeg"
    },
    "metadata_fields": [{
      "schema_id": "cko8s9r5v0001h2dk9elqdidh",
      "schema_name": "tag",
      "value": "tag_string"
    }, {
      "schema_id": "cko8sbczn0002h2dkdaxb5kal",
      "schema_name": "split",
      "value": [{
        "schema_id": "cko8sc2yr0004h2dk69aj5x63",
        "schema_name": "valid"
      }, {
        "schema_id": "cko8scbz70005h2dkastwhgqt",
        "schema_name": "test"
      }]
    }, {
      "schema_id": "cko8sdzv70006h2dk8jg64zvb",
      "schema_name": "captureDateTime",
      "value": "2023-03-24T02:40:40.832576+00:00"
    }]
  }
  ```
</CodeGroup>
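
To work with exported metadata, it is often convenient to flatten `metadata_fields` into a name-to-value mapping. The following is a minimal sketch based only on the JSON shape shown above; `flatten_metadata_fields` is a hypothetical helper, and the assumption that Enum values always arrive as a list of objects with a `schema_name` key comes from this single sample.

```python
import json

# A trimmed copy of the sample export row shown above
sample_row = json.loads("""
{
  "metadata_fields": [
    {"schema_id": "cko8s9r5v0001h2dk9elqdidh", "schema_name": "tag", "value": "tag_string"},
    {"schema_id": "cko8sbczn0002h2dkdaxb5kal", "schema_name": "split", "value": [
      {"schema_id": "cko8sc2yr0004h2dk69aj5x63", "schema_name": "valid"},
      {"schema_id": "cko8scbz70005h2dkastwhgqt", "schema_name": "test"}
    ]},
    {"schema_id": "cko8sdzv70006h2dk8jg64zvb", "schema_name": "captureDateTime",
     "value": "2023-03-24T02:40:40.832576+00:00"}
  ]
}
""")


def flatten_metadata_fields(exported_row):
    """Map each metadata field name to its value; Enum options become a list of names."""
    flat = {}
    for field in exported_row.get("metadata_fields", []):
        value = field["value"]
        # Enum fields in the sample carry a list of option objects
        if isinstance(value, list) and all(isinstance(item, dict) for item in value):
            value = [item["schema_name"] for item in value]
        flat[field["schema_name"]] = value
    return flat


print(flatten_metadata_fields(sample_row))
```

For the sample row, this yields `tag` mapped to `"tag_string"`, `split` mapped to `["valid", "test"]`, and `captureDateTime` mapped to its ISO 8601 string.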

## Export metadata by data row ID

You can bulk export metadata for specific data rows, identified by data row ID or global key, with the SDK.

<CodeGroup>
  ```python Python theme={null}
  import labelbox as lb

  data_row_ids = ["<data_row_id>"]
  global_keys = ["<global_key>"]

  # The data row identifier classes (lb.DataRowIds and lb.GlobalKeys) validate whether
  # the provided ID is a global key or a data row ID.
  # They also ensure that all IDs in the provided list are unique.
  datarow_identifiers = lb.DataRowIds(data_row_ids)
  global_key_identifiers = lb.GlobalKeys(global_keys)

  # Use one of the identifiers
  # (`metadata_ontology` comes from client.get_data_row_metadata_ontology())
  metadata_ontology.bulk_export(data_row_ids=global_key_identifiers)

  # metadata_ontology.bulk_export(data_row_ids=datarow_identifiers)
  ```
</CodeGroup>
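
Since the identifier classes expect unique IDs, a list assembled from several sources can be de-duplicated first. This is a small sketch, not SDK behavior: `unique_in_order` is a hypothetical helper that removes repeats while preserving the original order.

```python
def unique_in_order(identifiers):
    """Remove duplicate identifiers, keeping the first occurrence of each."""
    # dict preserves insertion order, so fromkeys() de-duplicates in order
    return list(dict.fromkeys(identifiers))


global_keys = unique_in_order(["key-a", "key-b", "key-a", "key-c", "key-b"])
print(global_keys)  # ['key-a', 'key-b', 'key-c']
```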

## Delete metadata fields from a data row

<CodeGroup>
  ```python Python theme={null}
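  import labelbox as lb

  global_key = "<global_key>"
  schema_ids_to_delete = ["<metadata_schema_id>"]

  # Identify the data row by global key. To use a data row ID instead,
  # pass lb.UniqueId("<data_row_id>") as data_row_id.
  deletions = [
      lb.DeleteDataRowMetadata(
          data_row_id=lb.GlobalKey(global_key),
          fields=schema_ids_to_delete,
      )
  ]

  # Delete the specified metadata on the data row
  # (`metadata_ontology` comes from client.get_data_row_metadata_ontology())
  metadata_ontology.bulk_delete(deletes=deletions)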
  ```
</CodeGroup>

## Upsert metadata to existing data rows

Labelbox supports individual or bulk metadata upsert of data rows. Metadata overwrites occur on a per-field basis.

<CodeGroup>
  ```python Python theme={null}
  from labelbox.schema.data_row_metadata import DataRowMetadata, DataRowMetadataField

  tag_schema = metadata_ontology.get_by_name("tag")

  # Construct a string field
  field = DataRowMetadataField(
      schema_id=tag_schema.uid,  # specify the schema ID
      value="updated",  # typed input
  )

  # Completed object ready for import
  metadata_payload = DataRowMetadata(
      global_key="<global key>",  # optionally, set the argument to data_row_id to use a data row ID
      fields=[field],
  )

  # Provide a list of DataRowMetadata objects to upload
  metadata_ontology.bulk_upsert([metadata_payload])
  ```
</CodeGroup>
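
One way to read "overwrites occur on a per-field basis" is that a field included in the upsert replaces the existing field with the same schema, while fields not included are left untouched. The sketch below models that reading locally with plain dictionaries keyed by schema ID. It is an illustration under that assumption, not the server's implementation, and `apply_upsert` and the placeholder IDs are hypothetical.

```python
def apply_upsert(existing_fields, upsert_fields):
    """Model a per-field upsert: keys are schema IDs, values are field values."""
    merged = dict(existing_fields)  # copy so the input is not mutated
    merged.update(upsert_fields)    # fields in the upsert replace matching ones
    return merged


existing = {"<tag_schema_id>": "original", "<number_schema_id>": 3.0}
result = apply_upsert(existing, {"<tag_schema_id>": "updated"})
print(result)  # {'<tag_schema_id>': 'updated', '<number_schema_id>': 3.0}
```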

## Update metadata schema

You can update any custom metadata schema's name. However, the type cannot be modified. You also cannot modify the names of reserved fields.

<CodeGroup>
  ```python Python theme={null}
  # update a metadata schema's name
  metadata_schema = metadata_ontology.update_schema(name="metadata_name", new_name="metadata_name_updated")

  # Enum metadata schema is a bit different since it contains options.
  # create an Enum metadata with options
  enum_schema = metadata_ontology.create_schema(name="enum_metadata_name", kind=DataRowMetadataKind.enum,
                                              options=["option 1", "option 2"])

  # update an Enum metadata schema's name, similar to other metadata schema types
  enum_schema = metadata_ontology.update_schema(name="enum_metadata_name", new_name="enum_metadata_name_updated")

  # update the name of an Enum option; this only applies to Enum metadata schemas.
  # `name` is the Enum schema's current name (it was renamed in the previous step)
  enum_schema = metadata_ontology.update_enum_option(name="enum_metadata_name_updated", option="option 1",
                                                   new_option="option 3")
  ```
</CodeGroup>

## Delete metadata schema

You can delete a metadata schema by name.

<CodeGroup>
  ```python Python theme={null}
  status = metadata_ontology.delete_schema(name=metadata_schema.name)
  # returns True if successfully deleted
  ```
</CodeGroup>
