Overview

Labelbox has three ways of representing annotations for importing and exporting data:

  • Python types (import and export)
  • JSON (only for exports)
  • New Line Delimited JSON (NDJSON) (only for imports)

Labelbox Python types

Labelbox is developing a typed Python format for representing human- and machine-generated annotations. It is designed to standardize and simplify your machine learning import/export workflows.

Here are a few advantages of using Python types:

  • Validation upon import
  • Easy format conversion (e.g. MS COCO)
  • Easy visualization
  • Native support for importing ground truth and model predictions into Labelbox

Design

  • A LabelCollection is a list or generator for working with a collection of Labels.
  • A Label is constructed from Data and Annotations, for example an image and its bounding boxes.
  • An Annotation is either an ObjectAnnotation or a ClassificationAnnotation.
  • Annotations have a name and a Geometry, a Classification, or some Text data.
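
The relationships in the list above can be pictured with a small, self-contained sketch. The classes below (SketchLabel, SketchAnnotation, SketchRectangle, SketchRadio) are hypothetical stand-ins built from standard-library dataclasses, not the Labelbox types themselves; they only illustrate how a Label groups one piece of data with named annotations.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Hypothetical stand-ins for illustration only; these are NOT the classes
# from labelbox.data.annotation_types.

@dataclass
class SketchRectangle:
    """A geometry: an axis-aligned box given by two corner points."""
    start: Tuple[float, float]
    end: Tuple[float, float]

@dataclass
class SketchRadio:
    """A classification value: a single selected answer."""
    answer: str

@dataclass
class SketchAnnotation:
    """A name plus a value (a geometry, a classification, or text)."""
    name: str
    value: Union[SketchRectangle, SketchRadio, str]

@dataclass
class SketchLabel:
    """One piece of data (here, a Data Row ID) and its annotations."""
    data_uid: str
    annotations: List[SketchAnnotation] = field(default_factory=list)

label = SketchLabel(
    data_uid="<data-row-id>",
    annotations=[
        SketchAnnotation(name="car", value=SketchRectangle((10, 20), (110, 70))),
        SketchAnnotation(name="weather", value=SketchRadio(answer="sunny")),
    ],
)

# A label collection is then just a list (or generator) of labels.
label_collection = [label]
```

In this sketch the object annotation carries a geometry and the classification annotation carries an answer, mirroring the split described in the design list.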

Installation

Labelbox Python types are part of the data extra of the Python SDK.

pip install "labelbox[data]"

Import

# For working with images, videos, text and documents
from labelbox.data.annotation_types import (
    Label, ImageData, MaskData, LabelList, TextData, VideoData,
    ObjectAnnotation, ClassificationAnnotation, Polygon, Rectangle, Line, Mask,
    Point, Checklist, Radio, Text, TextEntity, ClassificationAnswer)

# For working with geospatial data
from labelbox.data.annotation_types.data.tiled_image import (
    TiledBounds, TiledImageData, TileLayer, EPSG, EPSGTransformer)

ObjectAnnotation

An ObjectAnnotation is the feature category for object annotation kinds such as Rectangle, Polygon, and Point. It can also carry nested classifications.

# annotation_kind can be any ObjectAnnotation kind, such as Rectangle, Polygon, Point, etc.
annotation = ObjectAnnotation(value=annotation_kind, name="Feature name")

Parameter                  Description
name                       Feature name
value                      Annotation kind
feature_schema_id          Optionally specify the feature schema ID
ClassificationAnnotation   Optionally specify nested ClassificationAnnotations

ClassificationAnnotation

A ClassificationAnnotation is the feature category for classification annotation kinds such as Radio, Checklist, and Text, including nested classifications.

# annotation_kind can be a ClassificationAnnotation kind such as Radio, Checklist, Text, etc...
annotation = ClassificationAnnotation(value=annotation_kind, name="Feature name")

Parameter   Description
name        Classification feature name
value       - Text string to import free-form text
            - ClassificationAnnotation for checklist / radio. This can be deeply nested as defined in your ontology

Supported annotation kinds

Annotation kinds are the most atomic type of feature. You construct a feature within an ontology from an annotation kind.

Annotation kind   Feature category           Description
Mask              ObjectAnnotation           Segmentation mask in raster format
Polygon           ObjectAnnotation           List of unique points forming a closed polygon
Rectangle         ObjectAnnotation           Bounding box
Line              ObjectAnnotation           A list of unique points forming a polyline
Point             ObjectAnnotation           A single point
Checklist         ClassificationAnnotation   Multi-choice checklist
Radio             ClassificationAnnotation   Single-choice radio
Text              ClassificationAnnotation   Free-form text

Creating labels and annotations

Below is an example that creates typed bounding box labels for the queued Data Rows in your project.

from labelbox import Client

client = Client(api_key="<api-key>")

# Get the queued (unlabeled) Data Rows that will receive ground truth annotations
project = client.get_project("<project-id>")
queued_data_rows = project.export_queued_data_rows()

ground_truth_list = LabelList()

for datarow in queued_data_rows:
    annotations_list = []
    # Replace get_ground_truth_function with your own function that returns
    # the ground truth annotations for this Data Row
    ground_truth_label = get_ground_truth_function(datarow)
    external_id = datarow['externalId']  # your own identifier for this Data Row

    for annotation in ground_truth_label:
        # The class name must exactly match a feature name in the ontology
        class_name = annotation.class_name
        bbox = annotation.bbox

        # Create an annotation type
        annotations_list.append(ObjectAnnotation(
            name=class_name,
            value=Rectangle.from_xyhw(*bbox),
        ))

    # Create a label type from the data type and the annotation types
    data = ImageData(uid=datarow['id'])
    ground_truth_list.append(Label(data=data, annotations=annotations_list))
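
As a rough illustration of what a bounding box in x, y, height, width form describes, the sketch below converts such a tuple into corner coordinates. It assumes that (x, y) is the top-left corner and that the argument order follows the method name (x, y, h, w). Corners and corners_from_xyhw are hypothetical helpers written for this example, not part of the SDK.

```python
from typing import NamedTuple

class Corners(NamedTuple):
    """Top-left and bottom-right corners of a box, in pixels."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

def corners_from_xyhw(x: float, y: float, h: float, w: float) -> Corners:
    """Convert (x, y, height, width) to corner form.

    Assumption: (x, y) is the top-left corner, y grows downward,
    and height comes before width in the argument order.
    """
    return Corners(x_min=x, y_min=y, x_max=x + w, y_max=y + h)

bbox = (34, 153, 204, 67)  # x, y, height, width
corners = corners_from_xyhw(*bbox)
```

Checking your own bounding boxes against a conversion like this before import is an easy way to catch swapped height and width values.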

Assigning feature schema IDs

To import annotations, Labelbox must know how to assign the annotation to an existing feature. There are two ways you can assign a feature schema ID to an annotation.

Use assign_feature_schema_ids to automatically assign features

Continuing the example from above, you can use the assign_feature_schema_ids method of LabelList. You must pass the ontology in which the feature schema IDs are looked up.

from labelbox import OntologyBuilder

# Two ways to get the ontology object
ontology_object = project.ontology()
ontology_object = client.get_ontology(ONTOLOGY_ID)

# Two ways to build the OntologyBuilder used for the lookup
ontology = OntologyBuilder.from_ontology(ontology_object)
ontology = OntologyBuilder.from_project(project)

# Assign feature schema IDs to the annotations in the label list
ground_truth_list.assign_feature_schema_ids(ontology)
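
Conceptually, automatic assignment is a lookup from feature name to feature schema ID. A minimal sketch of that idea with plain dictionaries follows. The function assign_schema_ids, the dictionary layout, and the placeholder IDs are assumptions for illustration and do not reproduce how the SDK implements assign_feature_schema_ids.

```python
from typing import Dict, List

def assign_schema_ids(annotations: List[dict], name_to_id: Dict[str, str]) -> None:
    """Fill in feature_schema_id on each annotation, in place, by name.

    Hypothetical helper: raises ValueError when a name has no match,
    since an unmatched annotation could not be attached to a feature.
    """
    for annotation in annotations:
        name = annotation["name"]
        if name not in name_to_id:
            raise ValueError(f"No feature named {name!r} in the ontology")
        annotation["feature_schema_id"] = name_to_id[name]

# Placeholder mapping; in practice this would come from your ontology.
name_to_id = {"car": "<car-schema-id>", "person": "<person-schema-id>"}

annotations = [{"name": "car"}, {"name": "person"}]
assign_schema_ids(annotations, name_to_id)
```

The sketch also shows why annotation names need to match the feature names in the ontology exactly: a lookup by name has nothing to fall back on when the spelling differs.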

Specify feature_schema_id at the time of creating an annotation

Label(
    data=ImageData(uid=datarow['id']),
    annotations=[
        ObjectAnnotation(
            value=Rectangle.from_xyhw(34, 153, 204, 67),
            feature_schema_id="ck67grts29n7x0890atmeiahw"
        )
    ]
)

Convert label list to NDJSON for import

To import annotations into Labelbox, you must convert the Python types to NDJSON format. You can do so by using NDJsonConverter as shown below.

from labelbox.data.serialization import NDJsonConverter
ground_truth_ndjson = list(NDJsonConverter.serialize(ground_truth_list))

NDJSON

The NDJSON format serves as a normalized interface between the Python SDK (or any other external client) and the Labelbox backend service.

For most annotation kinds, you will not need to use the NDJSON format at all. Usually, annotation kinds that are in beta are first available for import in the NDJSON format.
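
For readers unfamiliar with the format, NDJSON is simply one JSON object per line. The sketch below round-trips a list of dictionaries using only the standard library. The helpers to_ndjson and from_ndjson are hypothetical, and the record keys are made up for illustration; they are not the Labelbox import schema.

```python
import json
from typing import Iterable, List

def to_ndjson(records: Iterable[dict]) -> str:
    """Serialize each record as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)

def from_ndjson(text: str) -> List[dict]:
    """Parse NDJSON text back into a list of records, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Illustrative records only; the keys are not the Labelbox schema.
records = [
    {"name": "car", "box": [34, 153, 204, 67]},
    {"name": "weather", "answer": "sunny"},
]

ndjson_text = to_ndjson(records)
round_tripped = from_ndjson(ndjson_text)
```

Because every line is an independent JSON document, a file in this format can be produced and consumed one record at a time.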

JSON

You can export data from Labelbox projects in JSON format.