# Import audio annotations

> Developer guide for importing annotations on audio data and sample import formats.

<CardGroup cols={2}>
  <Card title="Open In Colab" icon="infinity" iconType="solid" href="https://colab.research.google.com/github/Labelbox/labelbox-notebooks/blob/main/annotation_import/audio.ipynb" horizontal />

  <Card title="GitHub" icon="github" iconType="solid" href="https://github.com/Labelbox/labelbox-notebooks/blob/main/annotation_import/audio.ipynb" horizontal />
</CardGroup>

## Overview

To import annotations into Labelbox, you need to create an annotations payload. This guide provides sample payloads for every supported annotation type.

### Annotation payload types

Labelbox supports two formats for the annotations payload:

* Python annotation types (recommended)

  * Provides a seamless transition between third-party platforms, machine learning pipelines, and Labelbox.
  * Allows you to build annotations locally with local file paths, NumPy arrays, or URLs.
  * Converts easily to NDJSON format for quick import into Labelbox.
  * Supports one-level nested classification (radio, checklist, or free-form text) under a tool or classification annotation.

* JSON

  * Skips the step of formatting the annotation payload as Labelbox Python annotation types.
  * Supports any number of levels of nested classification (radio, checklist, or free-form text) under a tool or classification annotation.
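
As a rough illustration of what "NDJSON" means here, the sketch below serializes a list of annotation dictionaries into newline-delimited JSON using only the standard library. The helper name `to_ndjson` is hypothetical and is not part of the Labelbox SDK; the import steps later in this guide work with Python lists rather than a serialized string.

```python
import json


def to_ndjson(annotations):
    """Serialize a list of annotation dicts as one JSON object per line."""
    return "\n".join(json.dumps(annotation) for annotation in annotations)


# Illustrative payloads that reuse names from the examples in this guide
sample = [
    {"name": "radio_question", "answer": {"name": "first_radio_answer"}},
    {"name": "free_text", "answer": "sample text"},
]

ndjson_text = to_ndjson(sample)
print(ndjson_text)
```

Each line of the result parses back into the original dictionary with `json.loads`.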

### Label import types

Labelbox additionally supports two types of label imports:

* [Model-assisted labeling (MAL)](/docs/model-assisted-labeling)
  * This workflow allows you to import computer-generated predictions (or simply annotations created outside of Labelbox) as pre-labels on an asset.
* [Ground truth](/docs/import-ground-truth)
  * This workflow allows you to bulk import ground truth annotations from an external or third-party labeling system into Labelbox *Annotate*. Using the label import API to import external data is a useful way to consolidate and migrate all annotations into Labelbox as a single source of truth.

## Supported Annotations

The following annotations are supported for an audio data row:

* Radio
* Checklist
* Free-form text
* Temporal classifications (milliseconds): text, radio, checklist

## Classifications

### Radio (single choice)

<CodeGroup>
  ```python Python annotation theme={null}
  radio_annotation = lb_types.ClassificationAnnotation(
      name="radio_question",
      value=lb_types.Radio(answer =
          lb_types.ClassificationAnswer(name = "first_radio_answer")
      )
  )
  ```

  ```python NDJSON theme={null}
  radio_annotation_ndjson = {
    "name": "radio_question",
    "answer": {"name": "first_radio_answer"}
  }
  ```
</CodeGroup>

### Checklist (multiple choice)

<CodeGroup>
  ```python Python annotation theme={null}
  checklist_annotation = lb_types.ClassificationAnnotation(
      name="checklist_question",
      value=lb_types.Checklist(answer = [
          lb_types.ClassificationAnswer(name = "first_checklist_answer"),
          lb_types.ClassificationAnswer(name = "second_checklist_answer"),
          lb_types.ClassificationAnswer(name = "third_checklist_answer")
      ])
    )
  ```

  ```python NDJSON theme={null}
  checklist_annotation_ndjson = {
    "name": "checklist_question",
    "answer": [
      {"name": "first_checklist_answer"},
      {"name": "second_checklist_answer"},
      {"name": "third_checklist_answer"},
    ]
  }
  ```
</CodeGroup>

### Free-form text

<CodeGroup>
  ```python Python annotation theme={null}
  text_annotation = lb_types.ClassificationAnnotation(
      name = "free_text",
      value = lb_types.Text(answer="sample text")
  )
  ```

  ```python NDJSON theme={null}
  text_annotation_ndjson = {
    "name": "free_text",
    "answer": "sample text",
  }
  ```
</CodeGroup>
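
The three NDJSON shapes above differ only in the type of `answer`: an object for radio, a list of objects for checklist, and a plain string for free-form text. A minimal sketch of builder functions that makes this explicit follows; the function names are hypothetical helpers, not SDK calls.

```python
def radio_ndjson(name, answer):
    """Radio: `answer` is a single object."""
    return {"name": name, "answer": {"name": answer}}


def checklist_ndjson(name, answers):
    """Checklist: `answer` is a list of objects, one per selected option."""
    return {"name": name, "answer": [{"name": a} for a in answers]}


def text_ndjson(name, text):
    """Free-form text: `answer` is a plain string."""
    return {"name": name, "answer": text}


radio = radio_ndjson("radio_question", "first_radio_answer")
checklist = checklist_ndjson(
    "checklist_question",
    ["first_checklist_answer", "second_checklist_answer"],
)
text = text_ndjson("free_text", "sample text")
print(radio, checklist, text, sep="\n")
```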

### Nested classifications

<CodeGroup>
  ```python Python annotation theme={null}
  nested_radio_annotation = lb_types.ClassificationAnnotation(
    name="nested_radio_question",
    value=lb_types.Radio(
      answer=lb_types.ClassificationAnswer(
        name="first_radio_answer",
        classifications=[
          lb_types.ClassificationAnnotation(
            name="sub_radio_question",
            value=lb_types.Radio(
              answer=lb_types.ClassificationAnswer(
                name="first_sub_radio_answer"
              )
            )
          )
        ]
      )
    )
  )

  nested_checklist_annotation = lb_types.ClassificationAnnotation(
    name="nested_checklist_question",
    value=lb_types.Checklist(
      answer=[lb_types.ClassificationAnswer(
        name="first_checklist_answer",
        classifications=[
          lb_types.ClassificationAnnotation(
            name="sub_checklist_question",
            value=lb_types.Checklist(
              answer=[lb_types.ClassificationAnswer(
              name="first_sub_checklist_answer"
            )]
          ))
        ]
      )]
    )
  )
  ```

  ```python NDJSON theme={null}
  nested_radio_annotation_ndjson = {
    "name": "nested_radio_question",
    "answer": {
        "name": "first_radio_answer",
        "classifications": [
          {
            "name": "sub_radio_question",
            "answer": {"name": "first_sub_radio_answer"}
          }
        ]
      }
  }

  nested_checklist_annotation_ndjson = {
    "name": "nested_checklist_question",
    "answer": [{
        "name": "first_checklist_answer",
        "classifications" : [
          {
            "name": "sub_checklist_question",
            "answer": {"name": "first_sub_checklist_answer"}
          }
        ]
    }]
  }
  ```
</CodeGroup>

## Temporal classifications (milliseconds)

Use temporal classifications to attach **time-based** classification labels to an audio asset. Temporal ranges are represented in **milliseconds** from the start of the asset.

When using NDJSON, temporal classifications use a **unified** `answer: [...]` list structure:

* For **temporal radio**, `answer` is a **single-item list** (unlike global radio, which uses an object).
* For **temporal checklist**, `answer` is a **multi-item list**.
* For **temporal text**, `answer` is a list of `{ "value": <text>, "frames": [{"start": <ms>, "end": <ms>}] }`.

Temporal classifications support nested hierarchies (for example: Text > Text > Text, or Radio > Radio > Radio).
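
To make the temporal text shape concrete, the sketch below converts `(start_ms, end_ms, text)` tuples into the `answer` list described above. The helper name and the range check are assumptions added for illustration; this guide does not state which validations Labelbox applies on import.

```python
def temporal_text_ndjson(name, segments):
    """Convert (start_ms, end_ms, text) tuples into the temporal text NDJSON shape."""
    answers = []
    for start, end, text in segments:
        # Assumed sanity check: ranges are non-negative and not reversed
        if start < 0 or end < start:
            raise ValueError(f"invalid range: {start}-{end}")
        answers.append({"value": text, "frames": [{"start": start, "end": end}]})
    return {"name": name, "answer": answers}


transcription = temporal_text_ndjson(
    "transcription",
    [(1000, 1100, "Hello"), (1500, 2400, "How can I help you?")],
)
print(transcription)
```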

### Free-form text (temporal)

<CodeGroup>
  ```python Python annotation theme={null}
  temporal_text_annotation = lb_types.TemporalClassificationText(
    name="transcription",
    value=[
      (1000, 1100, "Hello"),
      (1500, 2400, "How can I help you?"),
    ],
  )
  ```

  ```python NDJSON theme={null}
  temporal_text_annotation_ndjson = {
    "name": "transcription",
    "answer": [
      { "value": "Hello", "frames": [{"start": 1000, "end": 1100}] },
      { "value": "How can I help you?", "frames": [{"start": 1500, "end": 2400}] }
    ]
  }
  ```
</CodeGroup>

### Radio (temporal)

<CodeGroup>
  ```python Python annotation theme={null}
  temporal_radio_annotation = lb_types.TemporalClassificationQuestion(
    name="speaker",
    value=[
      lb_types.TemporalClassificationAnswer(
        name="user",
        frames=[(200, 1600)],
      )
    ],
  )
  ```

  ```python NDJSON theme={null}
  temporal_radio_annotation_ndjson = {
    "name": "speaker",
    "answer": [
      { "name": "user", "frames": [{"start": 200, "end": 1600}] }
    ]
  }
  ```
</CodeGroup>

### Checklist (temporal)

<CodeGroup>
  ```python Python annotation theme={null}
  temporal_checklist_annotation = lb_types.TemporalClassificationQuestion(
    name="audio_quality",
    value=[
      lb_types.TemporalClassificationAnswer(
        name="background_noise",
        frames=[(0, 1500), (2000, 3000)],
      ),
      lb_types.TemporalClassificationAnswer(
        name="echo",
        frames=[(2200, 2900)],
      ),
    ],
  )
  ```

  ```python NDJSON theme={null}
  temporal_checklist_annotation_ndjson = {
    "name": "audio_quality",
    "answer": [
      {
        "name": "background_noise",
        "frames": [{"start": 0, "end": 1500}, {"start": 2000, "end": 3000}]
      },
      {
        "name": "echo",
        "frames": [{"start": 2200, "end": 2900}]
      }
    ]
  }
  ```
</CodeGroup>
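
Temporal radio and checklist payloads share the same list structure, so one helper can produce both. The sketch below builds the NDJSON shape from a mapping of answer names to millisecond ranges; `temporal_answers_ndjson` is a hypothetical name, not an SDK function.

```python
def temporal_answers_ndjson(name, ranges_by_answer):
    """Build the temporal radio/checklist shape from {answer: [(start_ms, end_ms), ...]}."""
    return {
        "name": name,
        "answer": [
            {
                "name": answer,
                "frames": [{"start": start, "end": end} for start, end in ranges],
            }
            for answer, ranges in ranges_by_answer.items()
        ],
    }


# Checklist: several answers, each with one or more ranges
audio_quality = temporal_answers_ndjson(
    "audio_quality",
    {"background_noise": [(0, 1500), (2000, 3000)], "echo": [(2200, 2900)]},
)

# Radio: the same structure with a single answer
speaker = temporal_answers_ndjson("speaker", {"user": [(200, 1600)]})
print(audio_quality)
print(speaker)
```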

### Nested classifications (temporal)

<CodeGroup>
  ```python Python annotation theme={null}
  nested_temporal_radio_annotation = lb_types.TemporalClassificationQuestion(
    name="speaker",
    value=[
      lb_types.TemporalClassificationAnswer(
        name="user",
        frames=[(200, 1600)],
        classifications=[
          lb_types.TemporalClassificationQuestion(
            name="tone",
            value=[
              lb_types.TemporalClassificationAnswer(
                name="professional",
                frames=[(1000, 1600)],
                classifications=[
                  lb_types.TemporalClassificationQuestion(
                    name="clarity",
                    value=[
                      lb_types.TemporalClassificationAnswer(
                        name="clear",
                        frames=[(1300, 1600)],
                      )
                    ],
                  )
                ],
              )
            ],
          )
        ],
      )
    ],
  )
  ```

  ```python NDJSON theme={null}
  nested_temporal_radio_annotation_ndjson = {
    "name": "speaker",
    "answer": [
      {
        "name": "user",
        "frames": [{"start": 200, "end": 1600}],
        "classifications": [
          {
            "name": "tone",
            "answer": [
              {
                "name": "professional",
                "frames": [{"start": 1000, "end": 1600}],
                "classifications": [
                  {
                    "name": "clarity",
                    "answer": [
                      { "name": "clear", "frames": [{"start": 1300, "end": 1600}] }
                    ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
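
In the nested example above, each nested answer's frames fall inside the frames of its parent answer. This guide does not say whether Labelbox enforces that property, so the following is only an optional pre-import check under that assumption; both function names are hypothetical.

```python
def frames_contained(child_frames, parent_frames):
    """True if every child range lies inside at least one parent range."""
    return all(
        any(p["start"] <= c["start"] and c["end"] <= p["end"] for p in parent_frames)
        for c in child_frames
    )


def nested_frames_consistent(answer):
    """Recursively check that nested answers stay within their parent's frames."""
    for question in answer.get("classifications", []):
        for child in question["answer"]:
            if not frames_contained(child["frames"], answer["frames"]):
                return False
            if not nested_frames_consistent(child):
                return False
    return True


# Shortened version of the nested temporal radio payload
sample = {
    "name": "speaker",
    "answer": [{
        "name": "user",
        "frames": [{"start": 200, "end": 1600}],
        "classifications": [{
            "name": "tone",
            "answer": [
                {"name": "professional", "frames": [{"start": 1000, "end": 1600}]}
            ],
        }],
    }],
}

is_consistent = all(nested_frames_consistent(a) for a in sample["answer"])
print(is_consistent)
```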

## Example: Import pre-labels or ground truths

The steps for importing annotations as pre-labels (model-assisted labeling) are similar to those for importing annotations as ground truth labels. The differences are noted where they apply.

### Before you start

The following imports are required by the code examples in this section.

<CodeGroup>
  ```python Python theme={null}
  import labelbox as lb
  import uuid
  import labelbox.types as lb_types
  ```
</CodeGroup>

Replace the value of `API_KEY` with a valid [API key](/reference/create-api-key) to connect to the Labelbox client.

<CodeGroup>
  ```python Python theme={null}
  API_KEY = None
  client = lb.Client(API_KEY)
  ```
</CodeGroup>

### Step 1: Import data rows

Data rows must be uploaded to **Catalog** before annotations can be attached to them.

This example shows how to create a data row in **Catalog** by attaching it to a [dataset](/reference/dataset).

<CodeGroup>
  ```python Python theme={null}
  global_key = "sample-audio-1.mp3"

  asset = {
    "row_data": "https://storage.googleapis.com/labelbox-datasets/audio-sample-data/sample-audio-1.mp3",
    "global_key": global_key
  }

  dataset = client.create_dataset(name="audio_annotation_import_demo_dataset")
  task = dataset.create_data_rows([asset])

  task.wait_till_done()

  print("Errors:", task.errors)
  print("Failed data rows: ", task.failed_data_rows)

  ```
</CodeGroup>
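
When uploading several audio files, the asset dictionaries can be generated from a list of URLs. The sketch below derives each global key from the file name, as the single-row example above does by hand. The `build_assets` helper and the duplicate check are assumptions for illustration; the check simply guards against two URLs that would produce the same key.

```python
import posixpath
from urllib.parse import urlparse


def build_assets(urls):
    """Create one data row dict per URL, using the file name as the global key."""
    assets = []
    seen = set()
    for url in urls:
        key = posixpath.basename(urlparse(url).path)
        if key in seen:
            raise ValueError(f"duplicate global key: {key}")
        seen.add(key)
        assets.append({"row_data": url, "global_key": key})
    return assets


assets = build_assets([
    "https://storage.googleapis.com/labelbox-datasets/audio-sample-data/sample-audio-1.mp3",
])
print(assets)
```

The resulting list has the same shape as the `[asset]` argument passed to `dataset.create_data_rows` above.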

### Step 2: Set up ontology

Your project ontology should support the tools and classifications required by your annotations. To ensure accurate schema feature mapping, the value used as the `name` parameter should match the value of the `name` field in your annotation.

For example, the radio annotation created above uses the name `radio_question`, so the ontology must contain a radio classification that is also named `radio_question`. The same alignment must hold for every other classification in the ontology.

This example shows how to create an ontology containing all supported [annotation types](#supported-annotations).

<CodeGroup>
  ```python Python theme={null}
  ontology_builder = lb.OntologyBuilder(
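    classifications=[
      # Names and option values must match the annotation payloads above
      lb.Classification(
        class_type=lb.Classification.Type.TEXT,
        name="free_text"),
      lb.Classification(
        class_type=lb.Classification.Type.CHECKLIST,
        name="checklist_question",
        options=[
          lb.Option(value="first_checklist_answer"),
          lb.Option(value="second_checklist_answer"),
          lb.Option(value="third_checklist_answer")
        ]
      ),
      lb.Classification(
        class_type=lb.Classification.Type.RADIO,
        name="radio_question",
        options=[
          lb.Option(value="first_radio_answer"),
          lb.Option(value="second_radio_answer")
        ]
      ),
      # Temporal classifications for audio require INDEX scope
      lb.Classification(
        class_type=lb.Classification.Type.TEXT,
        name="transcription",
        scope=lb.Classification.Scope.INDEX
      ),
      lb.Classification(
        class_type=lb.Classification.Type.RADIO,
        name="speaker",
        scope=lb.Classification.Scope.INDEX,
        options=[
          lb.Option(value="user")
        ]
      ),
      lb.Classification(
        class_type=lb.Classification.Type.CHECKLIST,
        name="audio_quality",
        scope=lb.Classification.Scope.INDEX,
        options=[
          lb.Option(value="background_noise"),
          lb.Option(value="echo")
        ]
      ),
    ]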
  )

  ontology = client.create_ontology("Ontology Audio Annotations",
                                    ontology_builder.asdict(),
                                    media_type=lb.MediaType.Audio)
  ```
</CodeGroup>
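
Because annotation names must match ontology names, a quick set comparison before importing can catch mismatches early. The sketch below works on plain lists of names; the helper and the sample names are illustrative and do not read from the Labelbox API.

```python
def missing_from_ontology(annotation_names, ontology_names):
    """Return annotation names that have no matching ontology classification."""
    return sorted(set(annotation_names) - set(ontology_names))


# Illustrative names only: the ontology list is missing "free_text"
annotation_names = ["radio_question", "checklist_question", "free_text", "transcription"]
ontology_names = ["radio_question", "checklist_question", "transcription"]

missing = missing_from_ontology(annotation_names, ontology_names)
print(missing)
```

An empty list means every annotation name has a counterpart in the ontology.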

### Step 3: Set up a labeling project

<CodeGroup>
  ```python Python theme={null}
  # Create Labelbox project
  project = client.create_project(name="audio_project",
                                      media_type=lb.MediaType.Audio)

  # Connect the ontology to the project
  project.connect_ontology(ontology)
  ```
</CodeGroup>

### Step 4: Send data rows to project

<CodeGroup>
  ```python Python theme={null}
  # Create a batch to send to your project
  batch = project.create_batch(
    "first-batch-audio-demo", # Each batch in a project must have a unique name
    global_keys=[global_key], # Global keys of the data rows to include
    priority=5 # Priority from 1 (highest) to 5 (lowest)
  )

  print("Batch: ", batch)
  ```
</CodeGroup>

### Step 5: Create annotation payloads

For help understanding annotation payloads, see [overview](#overview). To declare payloads, you can use Python annotation types (*preferred*) or NDJSON objects. For annotations that you want to import as ground truth labels, you can also specify [benchmarks](/docs/benchmark) using the `is_benchmark_reference` flag.

These examples demonstrate each format and how to compose annotations into labels attached to data rows.

<CodeGroup>
  ```python Python Annotation Payload theme={null}
  label = []
  label.append(
    lb_types.Label(
      data={"global_key" : global_key },
      annotations=[
        text_annotation,
        checklist_annotation,
        radio_annotation,
        temporal_text_annotation,
        temporal_radio_annotation,
        temporal_checklist_annotation
      ],
      # Optional: set the label as a benchmark
      # Only supported for ground truth imports
      is_benchmark_reference = True
    )
  )
  ```

  ```python NDJSON Payload theme={null}
  label_ndjson = []
  for annotations in [text_annotation_ndjson,
                      checklist_annotation_ndjson,
                      radio_annotation_ndjson,
                      temporal_text_annotation_ndjson,
                      temporal_radio_annotation_ndjson,
                      temporal_checklist_annotation_ndjson]:
    annotations.update({
        'dataRow': {
            'globalKey': global_key
        }
    })
    label_ndjson.append(annotations)
  ```
</CodeGroup>
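
Note that the NDJSON loop above calls `update` on each annotation dictionary, so the original dictionaries are modified in place. If you want to reuse the same annotation definitions for several data rows, a copy-based variant along these lines avoids that side effect. The `attach_data_row` helper is hypothetical; only the `dataRow` and `globalKey` keys come from the example above.

```python
import copy


def attach_data_row(annotations, global_key):
    """Return copies of the annotations with a dataRow reference added."""
    payload = []
    for annotation in annotations:
        entry = copy.deepcopy(annotation)  # leave the original dict untouched
        entry["dataRow"] = {"globalKey": global_key}
        payload.append(entry)
    return payload


templates = [
    {"name": "free_text", "answer": "sample text"},
    {"name": "radio_question", "answer": {"name": "first_radio_answer"}},
]

payload_a = attach_data_row(templates, "sample-audio-1.mp3")
payload_b = attach_data_row(templates, "sample-audio-2.mp3")
print(payload_a[0]["dataRow"], payload_b[0]["dataRow"])
```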

### Step 6: Import annotation payload

For pre-labels (model-assisted labeling), pass your payload as the value of the `predictions` parameter. For ground truths, pass the payload to the `labels` parameter.

#### Option A: Upload as [pre-labels (model-assisted labeling)](/docs/model-assisted-labeling)

This option is helpful for speeding up the initial labeling process and reducing the manual labeling workload for high-volume datasets.

<CodeGroup>
  ```python MAL import theme={null}
  # Upload MAL label for this data row in project
  upload_job = lb.MALPredictionImport.create_from_objects(
      client = client,
      project_id = project.uid,
      name="mal_job"+str(uuid.uuid4()),
      predictions=label
  )

  print(f"Errors: {upload_job.errors}")
  print(f"Status of uploads: {upload_job.statuses}")
  ```
</CodeGroup>

#### Option B: Upload to a labeling project as [ground truth](/docs/import-ground-truth)

This option is helpful for loading high-confidence labels from another platform or previous projects that just need review rather than manual labeling effort.

<CodeGroup>
  ```python Label import theme={null}
  # Upload label for this data row in project
  upload_job = lb.LabelImport.create_from_objects(
      client = client,
      project_id = project.uid,
      name="label_import_job"+str(uuid.uuid4()),
      labels=label
  )

  print(f"Errors: {upload_job.errors}")
  print(f"Status of uploads: {upload_job.statuses}")
  ```
</CodeGroup>
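
For larger imports, printing every status entry can be noisy. The sketch below counts entries by status instead. It assumes each entry in `upload_job.statuses` is a dictionary with a `status` key; that shape is an assumption not confirmed by this guide, so the example runs on hand-written sample data rather than a real job.

```python
from collections import Counter


def summarize_statuses(statuses):
    """Count status entries by their `status` value (assumed key)."""
    return Counter(entry.get("status", "UNKNOWN") for entry in statuses)


# Hand-written sample data standing in for upload_job.statuses
example_statuses = [
    {"uuid": "a", "status": "SUCCESS"},
    {"uuid": "b", "status": "FAILURE"},
    {"uuid": "c", "status": "SUCCESS"},
]

summary = summarize_statuses(example_statuses)
print(dict(summary))
```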
