Labelbox has three ways of representing annotations for importing and exporting data:
- Python types (import and export)
- JSON (only for exports)
- New Line Delimited JSON (NDJSON) (only for imports)
Labelbox python types
Labelbox is developing a typed Python format for representing human and machine-generated annotations. It is designed to standardize and simplify your machine learning import/export workflows.
Here are a few advantages of using python types:
- Validation upon import
- Easy format conversion (e.g. MS COCO)
- Easy visualization
- Native support for importing ground truth and model predictions into Labelbox
Design
- A LabelCollection is a list or generator for working with a collection of Labels.
- A Label is constructed from Data and Annotations, e.g. an image and bounding boxes.
- An Annotation is either an ObjectAnnotation or a ClassificationAnnotation. Annotations have a name and a Geometry, Classification, or some Text data.
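The relationships above can be pictured with a few stand-in dataclasses. This is only an illustrative sketch using the standard library: the class and field names below (SketchLabel, SketchAnnotation, and so on) are hypothetical and are not the SDK's actual classes, which live in labelbox.data.annotation_types.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SketchRectangle:
    """Stand-in for a geometry value (hypothetical, not the SDK class)."""
    left: float
    top: float
    width: float
    height: float


@dataclass
class SketchRadio:
    """Stand-in for a classification value (hypothetical, not the SDK class)."""
    answer: str


@dataclass
class SketchAnnotation:
    """An annotation has a name and a geometry, classification, or text value."""
    name: str
    value: Union[SketchRectangle, SketchRadio, str]


@dataclass
class SketchLabel:
    """A label pairs one piece of data with its annotations."""
    data_uid: str
    annotations: List[SketchAnnotation] = field(default_factory=list)


# A label collection is simply a list (or generator) of labels.
label_collection = [
    SketchLabel(
        data_uid="data-row-1",
        annotations=[
            SketchAnnotation(name="car", value=SketchRectangle(10, 20, 30, 40)),
            SketchAnnotation(name="weather", value=SketchRadio(answer="sunny")),
        ],
    )
]
```

Each label here carries one object-style annotation (a rectangle) and one classification-style annotation (a radio answer), mirroring the two annotation categories described above.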
Installation
Labelbox Python types are part of the data extra of the Python SDK.
pip install "labelbox[data]"
Import
# For working with images, videos, text and documents
from labelbox.data.annotation_types import (
Label, ImageData, MaskData, LabelList, TextData, VideoData,
ObjectAnnotation, ClassificationAnnotation, Polygon, Rectangle, Line, Mask,
Point, Checklist, Radio, Text, TextEntity, ClassificationAnswer)
## For working with geospatial data
from labelbox.data.annotation_types.data.tiled_image import TiledBounds, TiledImageData, TileLayer, EPSG, EPSGTransformer
ObjectAnnotation
An object annotation is a feature category that covers the supported object kinds (such as Rectangle, Polygon, and Mask) and can include nested classifications.
# annotation_kind can be any ObjectAnnotation kind, such as Rectangle, Polygon, Point, etc.
annotation = ObjectAnnotation(value=annotation_kind, name="Feature name")
Parameter | Description |
---|---|
name | Feature name |
value | Annotation kinds |
feature_schema_id | Optionally specify the feature schema ID |
ClassificationAnnotation | Optionally specify nested ClassificationAnnotation |
ClassificationAnnotation
A classification annotation is a feature category that covers the supported classification kinds (Radio, Checklist, and Text), including nested classifications.
# annotation_kind can be any ClassificationAnnotation kind, such as Radio, Checklist, Text, etc.
annotation = ClassificationAnnotation(value=annotation_kind, name="Feature name")
Parameter | Description |
---|---|
name | Classification feature name |
value | Either a text string (to import free-form text) or a ClassificationAnnotation (for checklist / radio). This can be deeply nested as defined in your ontology |
Supported annotation kinds
Annotation kinds are the most atomic type of Features. You can construct a feature within an ontology by using annotation kinds.
Annotation kind | Feature category | Description |
---|---|---|
Mask | ObjectAnnotation | Segmentation mask in raster format |
Polygon | ObjectAnnotation | List of unique points forming a closed polygon |
Rectangle | ObjectAnnotation | Bounding box |
Line | ObjectAnnotation | A list of unique points forming a poly line |
Point | ObjectAnnotation | A single point |
Checklist | ClassificationAnnotation | Multi-choice checklist |
Radio | ClassificationAnnotation | Single-choice radio |
Text | ClassificationAnnotation | Free form text |
Creating labels and annotations
Below is an example that creates typed bounding box labels for queued Data Rows in your project.
# Get a list of unlabeled Data Rows to import ground truth annotations for
project = client.get_project("<project-id>")
queued_data_rows = project.export_queued_data_rows()
ground_truth_list = LabelList()

for datarow in queued_data_rows:
    annotations_list = []
    # Replace this with your own function
    ground_truth_label = get_ground_truth_function(datarow)
    for annotation in ground_truth_label:
        # Specify the annotation class name. This must exactly match a feature name in the ontology
        class_name = annotation.class_name
        bbox = annotation.bbox
        # Create an annotation type
        annotations_list.append(ObjectAnnotation(
            name=class_name,
            value=Rectangle.from_xyhw(*bbox),
        ))
    # Create a label type with data type and annotation types
    data = ImageData(uid=datarow['id'])
    ground_truth_list.append(Label(data=data, annotations=annotations_list))
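The call Rectangle.from_xyhw(*bbox) in the example takes the box as x, y, height, width. As a rough sketch of the arithmetic, assuming (x, y) is the top-left corner in pixel coordinates and that the argument order follows the method name, the two corners can be worked out as below. The helper xyhw_to_corners is hypothetical and is not part of the SDK.

```python
from typing import Tuple

Corner = Tuple[float, float]


def xyhw_to_corners(x: float, y: float, h: float, w: float) -> Tuple[Corner, Corner]:
    """Return (top-left, bottom-right) corners for a box given as x, y, height, width."""
    if h < 0 or w < 0:
        raise ValueError("height and width must be non-negative")
    # Width extends along the x axis, height along the y axis.
    return (x, y), (x + w, y + h)


# Same numbers as a box with x=34, y=153, height=204, width=67.
start, end = xyhw_to_corners(34, 153, 204, 67)
```

Note that height comes before width in this ordering, so a box stored as (x, y, w, h) would need its last two values swapped before being unpacked.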
Assigning feature schema IDs
To import annotations, Labelbox must know how to assign the annotation to an existing feature. There are two ways you can assign a feature schema ID to an annotation.
Use assign_feature_schema_ids to automatically assign features
Continuing the example from above, you can use the assign_feature_schema_ids method of LabelList. You must specify the ontology in which to look up and assign feature schema IDs.
# Various ways to get the ontology; use whichever fits your setup
ontology_object = project.ontology()
ontology_object = client.get_ontology(ONTOLOGY_ID)
ontology = OntologyBuilder.from_ontology(ontology_object)
ontology = OntologyBuilder.from_project(project)
# Assign feature schema IDs to the label list annotations
ground_truth_list.assign_feature_schema_ids(ontology)
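Conceptually, automatic assignment can be thought of as looking up each annotation's name in the ontology and copying over the matching feature schema ID. The sketch below illustrates that idea with plain dictionaries. It is not the SDK's implementation, and the helper assign_ids_by_name, the dictionary layout, and the IDs are all made up for illustration.

```python
from typing import Dict, List


def assign_ids_by_name(annotations: List[dict], name_to_schema_id: Dict[str, str]) -> List[dict]:
    """Return copies of the annotations with feature_schema_id filled in from their name."""
    assigned = []
    for annotation in annotations:
        name = annotation["name"]
        if name not in name_to_schema_id:
            # An annotation whose name is missing from the ontology cannot be matched.
            raise KeyError(f"No feature named {name!r} in the ontology")
        assigned.append({**annotation, "feature_schema_id": name_to_schema_id[name]})
    return assigned


# Hypothetical ontology lookup table and annotations.
ontology_lookup = {"car": "schema-id-car", "person": "schema-id-person"}
annotations = [{"name": "car"}, {"name": "person"}]
assigned_annotations = assign_ids_by_name(annotations, ontology_lookup)
```

This is why the annotation name has to match a feature name in the ontology exactly: a name with no match leaves nothing to look up.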
Specify feature_schema_id at the time of creating an annotation
Label(
    data=ImageData(uid=datarow['id']),
    annotations=[
        ObjectAnnotation(
            value=Rectangle.from_xyhw(34, 153, 204, 67),
            feature_schema_id="ck67grts29n7x0890atmeiahw"
        )
    ]
)
Convert label list to NDJSON for import
To import annotations into Labelbox, you must convert the Python types to NDJSON format. You can do so with NDJsonConverter, as shown below.
from labelbox.data.serialization import NDJsonConverter
ground_truth_ndjson = list(NDJsonConverter.serialize(ground_truth_list))
NDJSON
The NDJSON format serves as a normalized interface between the Python SDK (or any other external tool) and the Labelbox backend service.
For most annotation kinds, you will not need to use the NDJSON format at all. Usually, annotation kinds that are in beta are first available for import in the NDJSON format.
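NDJSON itself is simply one JSON object per line. The sketch below, using only the standard library, shows how a list of dictionaries can be written as NDJSON and read back. The helper names and the field names in the sample rows are illustrative placeholders and are not guaranteed to match the exact payload Labelbox expects.

```python
import json
from typing import Iterable, List


def to_ndjson(rows: Iterable[dict]) -> str:
    """Serialize each row as a compact JSON object on its own line."""
    return "\n".join(json.dumps(row, separators=(",", ":")) for row in rows)


def from_ndjson(text: str) -> List[dict]:
    """Parse NDJSON text back into a list of dictionaries, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Placeholder rows; real import rows would come from the converter.
rows = [
    {"name": "car", "bbox": {"top": 153, "left": 34, "height": 204, "width": 67}},
    {"name": "weather", "answer": "sunny"},
]
ndjson_text = to_ndjson(rows)
round_tripped = from_ndjson(ndjson_text)
```

Because every line is an independent JSON document, a file in this format can be streamed and processed one annotation at a time.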
JSON
You can export data from Labelbox projects in JSON format.