Audio

Guide for labeling audio data.

With the audio editor, you can annotate audio files, for example by classifying natural language conversations and music, to train conversational AI and audio-based ML models.

Set up audio annotation projects

To set up an audio annotation project:

  1. Create an audio dataset.
  2. On the Annotate projects page, click the + New project button.
  3. Select Audio. Provide a name and an optional description for your project.
  4. Click Save. The system then creates the project and redirects you to the project overview page.
  5. Click Add data, then select your audio dataset. Click Sample to sample your dataset, or manually select data rows and click Queue batch.

📘

Data row size limit

To view the maximum size allowed for a data row, see limits.

Set up ontology

After setting up an audio annotation project, you can add an ontology based on how you want to label the data. The audio editor supports the following ontology features, listed along with the payloads for importing and exporting audio annotations using the SDK:

Feature                       | Import annotations | Export annotations
Radio classification          | See payload        | See payload
Checklist classification      | See payload        | See payload
Free-form text classification | See payload        | See payload

You can apply these classifications as global classifications at the file level, temporal classifications at the frame level, or nested classifications under other annotations.
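To make the three scopes concrete, here is a minimal sketch of how global, temporal, and nested classifications relate to one another. The `Classification` class, its field names, and the example feature names are all hypothetical and exist only for illustration; this is not the SDK's import or export payload format, which is described in the payload references above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Classification:
    """Hypothetical in-memory model of one classification value."""
    name: str                       # ontology feature name
    value: Union[str, List[str]]    # radio or free-form text: str; checklist: list of str
    start_ms: Optional[int] = None  # None = global (file level); a number = temporal
    children: List["Classification"] = field(default_factory=list)  # nested classifications


def walk(item: Classification, depth: int = 0):
    """Yield (depth, classification) pairs, each parent before its children."""
    yield depth, item
    for child in item.children:
        yield from walk(child, depth + 1)


# Global radio classification: applies to the whole file.
language = Classification("language", "english")

# Temporal checklist classification starting at 1.5 s,
# with a free-form text classification nested under it.
topics = Classification(
    "topics",
    ["billing", "refund"],
    start_ms=1500,
    children=[Classification("notes", "customer asks about a refund")],
)

for depth, item in walk(topics):
    print("  " * depth + f"{item.name}: {item.value}")
```

In this sketch, a missing start time marks a classification as global, and nesting is expressed by attaching child classifications to a parent.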

Using the audio editor

After adding data and setting up an ontology for your audio annotation project, you can add labels to data rows using the audio editor. Each data row displays in the editor with:

  • A waveform visualizing the pattern of sound pressure variation.
  • A spectrogram showing the range of sound frequencies and their strengths over time.
  • A timeline that splits the audio into 500-millisecond intervals by default, with a Timeline Resolution slider for adjusting the interval length.
  • Basic player controls, such as the play/pause button, back/forward 10-second buttons, and the playback speed. You can also click anywhere on the waveform to jump directly to that position.
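As a rough illustration of how the timeline interval length relates to the audio duration, the sketch below lists the start time of each interval. The function name `timeline_ticks` is hypothetical, and the way the editor computes its timeline internally is an assumption; only the 500-millisecond default comes from this guide.

```python
from typing import List


def timeline_ticks(duration_ms: int, interval_ms: int = 500) -> List[int]:
    """Return the start time (in ms) of every interval on the timeline."""
    if duration_ms < 0 or interval_ms <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # The last interval may be shorter than interval_ms if the
    # duration is not an exact multiple of the interval length.
    return list(range(0, duration_ms, interval_ms))


# A 2-second clip at the default 500 ms resolution has four intervals.
print(timeline_ticks(2000))
# The same clip at a lower resolution (1000 ms intervals) has two.
print(timeline_ticks(2000, interval_ms=1000))
```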

To add a global classification, select the classification and enter the value.

To add a temporal classification, select the classification, choose the interval on the timeline or waveform where the classification starts, and add the classification value. A circle representing the classification value appears on the timeline.

📘

Timeline resolution differences for labels

If you set a lower resolution with the Timeline Resolution slider, a classification label may not align exactly with the intervals shown on the timeline. This indicates that the classification was placed at a timestamp with a higher resolution than the one currently in use. Increase the timeline resolution to see the exact position of the classification.
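The arithmetic behind this mismatch can be sketched as follows. The function names and the list of candidate interval lengths are hypothetical, and the actual values offered by the slider are not specified here; the sketch only shows why a label placed at a fine-grained timestamp falls between the marks of a coarser timeline.

```python
from typing import Iterable, Optional


def is_aligned(timestamp_ms: int, interval_ms: int) -> bool:
    """True if the timestamp falls exactly on an interval boundary."""
    return timestamp_ms % interval_ms == 0


def interval_start(timestamp_ms: int, interval_ms: int) -> int:
    """Start of the interval that contains the timestamp."""
    return (timestamp_ms // interval_ms) * interval_ms


def coarsest_exact_interval(
    timestamp_ms: int, candidates_ms: Iterable[int]
) -> Optional[int]:
    """Largest candidate interval whose boundaries include the timestamp."""
    for interval_ms in sorted(candidates_ms, reverse=True):
        if is_aligned(timestamp_ms, interval_ms):
            return interval_ms
    return None


label_ms = 1250  # a label placed while the timeline used 250 ms intervals

print(is_aligned(label_ms, 500))      # False: 1250 ms is between 1000 and 1500
print(interval_start(label_ms, 500))  # 1000
print(coarsest_exact_interval(label_ms, [1000, 500, 250, 100]))  # 250
```

In this example, the label only lines up with the timeline once the interval length is 250 milliseconds or a divisor of it.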

Keyboard shortcuts

Function                         | Hotkey        | Description
Play/Pause                       | Space         | Play or pause the audio playback
Move backward one frame          |               | Move backward one frame
Move forward one frame           |               | Move forward one frame
Select frames                    | Shift + Mouse | Select frames for adding temporal classifications
Advance to the previous keyframe | +             | Advance to the previous keyframe
Advance to the next keyframe     | +             | Advance to the next keyframe
Jump to objects                  | Down          | Jump to objects
Next object                      | Down          | Move to the next object
Previous object                  | Up            | Move to the previous object
Toggle                           | + /           | Toggle the keyboard shortcuts menu