Model-assisted labeling allows you to upload predicted labels to a data row. It works very similarly to Label import; the major difference is that you use the MALPredictionImport module instead of the LabelImport module.

Option 1: Create Model-assisted Labels with Annotation Types (Recommended)

When you bulk upload Model-assisted Labels via annotation types, you create a LabelList containing Labels, each of which is constructed from a Data object (built from a Data Row id) and a list of Annotations.

Here are the Python annotation types supported for Label and Model-assisted Label creation.

| Python annotation type | Image | Video | Text | Tiled imagery |
| --- | --- | --- | --- | --- |
| Bounding box | ✓ | ✓ | N/A | ✓ |
| Polygon | ✓ | ✓ | N/A | ✓ |
| Point | ✓ | ✓ | N/A | ✓ |
| Polyline | ✓ | ✓ | N/A | ✓ |
| Segmentation mask | ✓ | ✓ | N/A | ✓ |
| Entity | N/A | N/A | ✓ | N/A |
| Relationship | ✓ | ✓ | ✓ | ✓ |
| Radio | ✓ | ✓ | ✓ | ✓ |
| Checklist | ✓ | ✓ | ✓ | ✓ |
| Free-form text | ✓ | ✓ | ✓ | ✓ |

Import relevant modules for your data type and annotation types

from labelbox import Client, MALPredictionImport
from labelbox.data.serialization import NDJsonConverter
# OntologyBuilder and Tool are needed for the ontology setup below
from labelbox.schema.ontology import OntologyBuilder, Tool
# For working with images, videos, text, and documents
from labelbox.data.annotation_types import (
    Label, ImageData, MaskData, LabelList, TextData, VideoData,
    ObjectAnnotation, ClassificationAnnotation, Polygon, Rectangle, Line, Mask,
    Point, Checklist, Radio, Text, TextEntity, ClassificationAnswer)

# For working with geospatial data
from labelbox.data.annotation_types.data.tiled_image import TiledBounds, TiledImageData, TileLayer, EPSG, EPSGTransformer

Create a Model-assisted Label and upload it to a project

Here is a simple example of creating a Model-assisted Label from an ImageData and an Annotation.

client = Client(api_key="<YOUR_API_KEY>")

# 1. Make sure the project has the right ontology for the Label's annotations.
# Here we create a new project to show ontology creation; you can also do this in the app.
project = client.create_project(name="test_label_import_project")
dataset = client.create_dataset(name="image_annotation_import_demo_dataset")
test_img_url = "https://raw.githubusercontent.com/Labelbox/labelbox-python/develop/examples/assets/2560px-Kitano_Street_Kobe01s5s4110.jpg"
data_row = dataset.create_data_row(row_data=test_img_url)
project.datasets.connect(dataset)
# Create an ontology that matches the Label's annotations; in this example, we only need a bounding box.
ontology_builder = OntologyBuilder(tools=[
    Tool(tool=Tool.Type.BBOX, name="box")
])
ontology = client.create_ontology("bbox ontology", ontology_builder.asdict())
# Attach ontology to project
project.setup_editor(ontology)

# 2. Create annotation(s)
rectangle = Rectangle(start=Point(x=30,y=30), end=Point(x=200,y=200))
# Note: this annotation is matched to the ontology feature "box" by name
rectangle_annotation = ObjectAnnotation(value=rectangle, name="box")

# 3. Create a Label with a list of annotations associated with the data row.
annotations_list = [rectangle_annotation]
data = ImageData(uid=data_row.uid)
label = Label(data=data, annotations=annotations_list)

# 4. Upload the Label to the project as a Model-assisted Label
label_list = LabelList()
label_list.append(label)
labels_ndjson = list(NDJsonConverter.serialize(label_list))
upload_job = MALPredictionImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name="upload_mal_import_job",
    predictions=labels_ndjson)
upload_job.wait_until_done()
print("Errors:", upload_job.errors)

Bulk import Labels

This example creates a bounding box label on each of the queued Data Rows in your project.

Configure the ontology for your project

Each Annotation of your Model-assisted Label must correspond to a Feature inside the ontology of your project. You can configure project ontology in the app, or via SDK.
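If you configure the ontology via the SDK, the payload produced by OntologyBuilder.asdict() has roughly the following shape. This is a hand-built sketch using only the standard library; the exact field set varies by SDK version, and the "rectangle" wire name for Tool.Type.BBOX is an assumption here:

```python
import json

# Hand-built sketch of an ontology payload with a single bounding-box tool.
ontology_payload = {
    "tools": [
        {
            "tool": "rectangle",   # wire name assumed for Tool.Type.BBOX
            "name": "box",         # annotations are matched to this name
            "required": False,
            "classifications": [],
        }
    ],
    "classifications": [],
}

print(json.dumps(ontology_payload, indent=2))
```

The key point is the matching rule: every `name` in your annotations must appear as a feature `name` in this payload.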

Construct a LabelList

# Get a list of unlabeled Data Rows to import Labels for
project = client.get_project("<YOUR_PROJECT_ID>")
queued_data_rows = project.export_queued_data_rows()

label_list = LabelList()

for datarow in queued_data_rows:
  annotations_list = []
  # Replace this with your own inference or ground-truth function
  ground_truth_label = get_ground_truth_function(datarow)
  
  for annotation in ground_truth_label:
    # Specify the annotation class name. It must exactly match a feature name in the ontology
    class_name = annotation.class_name
    bbox = annotation.bbox

    # Create an annotation type; bbox is assumed to be (x, y, height, width)
    annotations_list.append(ObjectAnnotation(
        name = class_name,
        value = Rectangle.from_xyhw(*bbox),
    ))
  
  # Create a Label from the data and the list of annotations
  data = ImageData(uid=datarow['id'])
  label_list.append(Label(data=data, annotations=annotations_list))

Convert label list to NDJSON for import

To import Model-assisted Labels into Labelbox, you need to convert the Python annotation types to NDJSON format. NDJSON serves as a normalized interface between the Python SDK (or any other external tool) and the Labelbox backend service.

labels_ndjson = list(NDJsonConverter.serialize(label_list))

upload_job = MALPredictionImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name="upload_label_import_job",
    predictions=labels_ndjson)
upload_job.wait_until_done()

print("Errors:", upload_job.errors)
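Under the hood, NDJSON is just one JSON object per line. A minimal illustration of the format itself using only the standard library (the field names are modeled on the NDJSON examples later on this page; this is not the SDK serializer):

```python
import json

# Two hypothetical serialized annotations (placeholder ids).
records = [
    {"uuid": "uuid-1", "dataRow": {"id": "datarow-1"}, "name": "box",
     "bbox": {"top": 30, "left": 30, "height": 170, "width": 170}},
    {"uuid": "uuid-2", "dataRow": {"id": "datarow-2"}, "name": "box",
     "bbox": {"top": 10, "left": 10, "height": 50, "width": 50}},
]

# Serialize: one compact JSON object per line, newline-delimited.
ndjson = "\n".join(json.dumps(r) for r in records)

# Parse it back line by line.
parsed = [json.loads(line) for line in ndjson.splitlines()]
assert parsed == records
```

Because each line is independent, NDJSON streams well for bulk imports: the backend can validate and ingest annotations one line at a time.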

Option 2: Create Model-assisted Labels with NDJSON

Alternatively, you can create and upload Model-assisted Labels directly as NDJSON. Here are the annotation kinds that NDJSON supports for Label and Model-assisted Label creation.

| Annotation | Image | Video | Text | Audio | Document | Tiled imagery |
| --- | --- | --- | --- | --- | --- | --- |
| Bounding box | ✓ | ✓ | N/A | N/A | ✓ | ✓ |
| Polygon | ✓ | ✓ | N/A | N/A | N/A | ✓ |
| Point | ✓ | ✓ | N/A | N/A | N/A | ✓ |
| Polyline | ✓ | ✓ | N/A | N/A | N/A | ✓ |
| Segmentation mask | ✓ | ✓ | N/A | N/A | N/A | ✓ |
| Entity | N/A | N/A | ✓ | N/A | coming soon | N/A |
| Relationship | ✓ | ✓ | ✓ | N/A | coming soon | ✓ |
| Radio | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Checklist | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Free-form text | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

Check out this tutorial notebook for an example of video MAL import via NDJSON. Open In Colab

Video

import uuid
from labelbox import Client, MALPredictionImport, OntologyBuilder, Option, Classification

client = Client()

project = client.create_project(name = "video-frame-based-classifications-project")
dataset = client.create_dataset(name = 'video-frame-based-classifications-dataset')
data_row = dataset.create_data_row(row_data = "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ElephantsDream.mp4")
project.datasets.connect(dataset)

ontology_builder = OntologyBuilder(
    classifications=[
        Classification(
            class_type=Classification.Type.RADIO,
            scope=Classification.Scope.INDEX,
            instructions="radio_classification",
            options=[Option(value="radio_option_1"),
                     Option(value="radio_option_2")]),
        Classification(
            class_type=Classification.Type.CHECKLIST,
            scope=Classification.Scope.INDEX,
            instructions="checklist_classification",
            options=[Option(value="checklist_option_1"),
                     Option(value="checklist_option_2")])
    ])
ontology = client.create_ontology("video-frame-based-classification-ontology", ontology_builder.asdict())
project.setup_editor(ontology)

schema_id_lookup = {}
for classification in ontology.classifications():
    options = {}
    for option in classification.options:
        options[option.value] = option.feature_schema_id
    schema_id_lookup[classification.instructions] = {'schema_id' : classification.feature_schema_id, 'options' : options}


radio_annotation = {
   "schemaId":  schema_id_lookup['radio_classification']['schema_id'],
   "uuid": str(uuid.uuid4()),
   "dataRow": {
       "id": data_row.uid
    },
    "answer": [
        {"schemaId": schema_id_lookup['radio_classification']['options']['radio_option_1'], "frames" : [{"start": 7, "end": 13}, { "start": 19,"end": 20}]},
        {"schemaId": schema_id_lookup['radio_classification']['options']['radio_option_2'], "frames" : [{"start": 14, "end": 18}]}
        ]
}

checklist_annotation = {
    "schemaId":  schema_id_lookup['checklist_classification']['schema_id'],
    "uuid": str(uuid.uuid4()),
    "dataRow": {
        "id": data_row.uid
     },
     "answer": [
         {"schemaId": schema_id_lookup['checklist_classification']['options']['checklist_option_1'], "frames" : [{"start": 7, "end": 13}, { "start": 18,"end": 19}]},
         {"schemaId": schema_id_lookup['checklist_classification']['options']['checklist_option_2'], "frames" : [{"start": 1, "end": 18}]}
     ]
}

annotations = [radio_annotation, checklist_annotation]
job = MALPredictionImport.create_from_objects(
            client, project.uid, str(uuid.uuid4()), annotations)
print(job.errors)
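Frame-based classifications are easy to get subtly wrong, for instance a start frame greater than the end frame. A small hypothetical sanity check, using only the standard library, that you could run over annotations shaped like the ones above before importing:

```python
def validate_frames(annotation: dict) -> list:
    """Return a list of problems found in an annotation's frame ranges."""
    problems = []
    for answer in annotation.get("answer", []):
        for rng in answer.get("frames", []):
            if rng["start"] < 1:
                problems.append(f"start {rng['start']} is before frame 1")
            if rng["start"] > rng["end"]:
                problems.append(f"start {rng['start']} is after end {rng['end']}")
    return problems

# A malformed range: start after end.
bad = {"answer": [{"schemaId": "<SCHEMA_ID>", "frames": [{"start": 20, "end": 14}]}]}
print(validate_frames(bad))  # one problem reported
```

Running such a check locally is cheaper than waiting for the import job to surface errors.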
The following example imports frame-based bounding boxes for a video using NDJSON segments.

import uuid
from typing import Any, Dict, List

from labelbox import Client
from labelbox.schema.ontology import OntologyBuilder, Tool

API_KEY = None
client = Client(api_key=API_KEY)

ontology_builder = OntologyBuilder(
    tools=[Tool(tool=Tool.Type.BBOX, name="jellyfish")])
ontology = client.create_ontology("video_bbox_ontology", ontology_builder.asdict())

project = client.create_project(name="video_mal_project")
dataset = client.create_dataset(name="video_mal_dataset")
dataset.create_data_row(
    row_data=
    "https://storage.labelbox.com/cjhfn5y6s0pk507024nz1ocys%2Fb8837f3b-b071-98d9-645e-2e2c0302393b-jellyfish2-100-110.mp4"
)
project.setup_editor(ontology)
project.datasets.connect(dataset)
# Re-read the ontology from the project so feature schema ids are populated.
ontology = OntologyBuilder.from_project(project)
# We want all of the feature schemas to be easily accessible by name.
schema_lookup = {tool.name: tool.feature_schema_id for tool in ontology.tools}
print(schema_lookup)

segments = [{
    "keyframes": [{
        "frame": 1,
        "bbox": {
            "top": 80,
            "left": 80,
            "height": 80,
            "width": 80
        }
    }, {
        "frame": 20,
        "bbox": {
            "top": 125,
            "left": 125,
            "height": 200,
            "width": 300
        }
    }]
}, {
    "keyframes": [{
        "frame": 27,
        "bbox": {
            "top": 80,
            "left": 50,
            "height": 80,
            "width": 50
        }
    }]
}]

from typing import Any, Dict, List

def create_video_bbox_ndjson(datarow_id: str, schema_id: str,
                             segments: List[Dict[str, Any]]) -> Dict[str, Any]:
    return {
        "uuid": str(uuid.uuid4()),
        "schemaId": schema_id,
        "dataRow": {
            "id": datarow_id
        },
        "segments": segments
    }
uploads = []

for data_row in dataset.data_rows():
    uploads.append(
        create_video_bbox_ndjson(data_row.uid, schema_lookup['jellyfish'],
                                 segments))
upload_task = project.upload_annotations(name=f"upload-job-{uuid.uuid4()}",
                                         annotations=uploads,
                                         validate=False)
# Wait for upload to finish (Will take up to five minutes)
upload_task.wait_until_done()
# Review the upload status
print(upload_task.errors)

Document

# MAL for bounding boxes in Documents
annotations = []

for row in project.export_queued_data_rows():
    print("row: ",row['id'], row['externalId'])
    annotations.append({
        "uuid": str(uuid.uuid4()),  # must be unique per annotation
        "name": "box",
        "dataRow": {"id": row['id']},
        "bbox": {"top": 50.0, "left": 200.7, "height": 150.8, "width": 200.0},
        "unit": "POINTS",
        "page": 4
    })

import_annotations = MALPredictionImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name=f"import {str(uuid.uuid4())}",
    predictions=annotations)
import_annotations.wait_until_done()
print("Errors:", import_annotations.errors)
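The document bounding boxes above use `"unit": "POINTS"`. If your model emits pixel coordinates, convert them first: a PDF point is 1/72 inch, so at a known render DPI the conversion is pixels × 72 / dpi. A small sketch (the 72-points-per-inch constant is standard; the DPI value is an assumption, whatever you rendered the page at):

```python
def px_to_points(value_px: float, dpi: float) -> float:
    """Convert a pixel measurement to PDF points (72 points per inch)."""
    return value_px * 72.0 / dpi

# A bbox predicted on a page rendered at 144 DPI.
bbox_px = {"top": 100.0, "left": 401.4, "height": 301.6, "width": 400.0}
bbox_points = {k: px_to_points(v, dpi=144.0) for k, v in bbox_px.items()}
print(bbox_points)  # top becomes 50.0 points at 144 DPI
```

Getting the unit wrong typically shows up as boxes that are misplaced or scaled by a constant factor in the editor, so it is worth verifying on one page before a bulk import.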