Model Run

A developer guide for creating and managing model training experiments.

A model run is a model training experiment within a Model directory. Each model run versions its own data snapshot (data rows, annotations, and data splits). You can upload predictions to a model run and compare its performance against other model runs in the Model directory.
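The snippets in this guide assume an authenticated client and an existing Model. A minimal setup sketch (the API key and model ID are placeholders you must fill in):

import labelbox as lb

# Authenticate the client with your API key
client = lb.Client(api_key="<your_api_key>")

# Fetch the model that owns the model runs
model = client.get_model("<your_model_id>")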

Get all model runs inside a Model

model_runs = model.model_runs()

Create a model run

Creates a model run belonging to this model.

model_run_name = "<your_model_run_name>"
example_config = {
    "learning_rate": 0.001, 
    "batch_size": 32, 
}
model_run = model.create_model_run(name=model_run_name, config=example_config)

Get model run

model_run_id = "<your_model_run_id>"
model_run = client.get_model_run(model_run_id=model_run_id)

model_run_data = model_run.model_run_data_rows()
model_run_config = model_run.get_config()
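model_run_data_rows() returns a paginated collection of ModelRunDataRow objects. A quick sketch to inspect it (uid here is the ID of the model run data row itself, not the underlying data row):

# Iterate the paginated collection and print each model run data row ID
for model_run_data_row in model_run_data:
    print(model_run_data_row.uid)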

Add data rows to a model run

Add data rows to a model run without any associated labels. You can use either data_row_id or global_key to specify the data rows.

# Using data row ids 
data_row_ids = ["<data_row_id_1>", "<data_row_id_2>"]
model_run.upsert_data_rows(data_row_ids=data_row_ids)

# Using global keys 
global_keys = ["<global_key_1>", "<global_key_2>"]
model_run.upsert_data_rows(global_keys=global_keys)

Assign data rows to training, validation, and test splits

Note that assign_data_rows_to_split only works on data rows or labels that are already in a model run. You can assign them to one of the "TRAINING", "VALIDATION", or "TEST" splits.

# Enable experimental features for split assignment
client.enable_experimental = True

dataset = client.get_dataset("<dataset_id>")  # Your training dataset

# using data row ids 
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[:100],
  split="TRAINING",
)
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[100:150],
  split="VALIDATION",
)
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[150:200],
  split="TEST",
)

# using global keys 
model_run.assign_data_rows_to_split(
  global_keys=global_keys[:100],
  split="TRAINING",
)
model_run.assign_data_rows_to_split(
  global_keys=global_keys[100:150],
  split="VALIDATION",
)
model_run.assign_data_rows_to_split(
  global_keys=global_keys[150:200],
  split="TEST",
)
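If you prefer to compute the split boundaries from fractions instead of hard-coding indices, a minimal sketch in plain Python (split_boundaries is a hypothetical helper, and the 80/10/10 fractions are an example):

# Hypothetical helper: derive contiguous index boundaries from split fractions
def split_boundaries(n, fractions=(0.8, 0.1, 0.1)):
    train_end = int(n * fractions[0])
    validation_end = train_end + int(n * fractions[1])
    return train_end, validation_end

train_end, validation_end = split_boundaries(len(global_keys))
model_run.assign_data_rows_to_split(global_keys=global_keys[:train_end], split="TRAINING")
model_run.assign_data_rows_to_split(global_keys=global_keys[train_end:validation_end], split="VALIDATION")
model_run.assign_data_rows_to_split(global_keys=global_keys[validation_end:], split="TEST")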


Upload custom metrics

If the auto-generated metrics are not sufficient for your use case, you can upload custom metrics to your model run. This lets you evaluate your model performance in Labelbox even more precisely.

Upload custom metrics to individual prediction annotations

To upload custom metrics to individual predictions, append the following list of dictionaries to the respective prediction. Custom metric fields are supported for all annotation types except raster segmentation.

# Include the following list of dictionaries in your prediction annotation built with python classes
custom_metrics = [
  { 'name': 'iou', 'value': 0.5 },
  { 'name': 'f1', 'value': 0.33 },
  { 'name': 'precision', 'value': 0.55 },
  { 'name': 'recall', 'value': 0.33 },
  { 'name': 'tagsCount', 'value': 43 },
  { 'name': 'metric_with_a_very_long_name', 'value': 0.334332 }
]

# Include the following list of dictionaries in your prediction annotation built with ndjson
'customMetrics': [
  { 'name': 'iou', 'value': 0.5 },
  { 'name': 'f1', 'value': 0.33 },
  { 'name': 'precision', 'value': 0.55 },
  { 'name': 'recall', 'value': 0.33 },
  { 'name': 'tagsCount', 'value': 43 },
  { 'name': 'metric_with_a_very_long_name', 'value': 0.334332 }
]
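For context, here is a sketch of one complete ndjson prediction (a bounding box) carrying the customMetrics field; the feature name, global key, and coordinates are placeholder values:

import uuid

bbox_prediction_ndjson = {
    "uuid": str(uuid.uuid4()),
    "name": "<bounding_box_feature_name>",
    "dataRow": {"globalKey": "<global_key>"},
    "bbox": {"top": 100, "left": 100, "height": 50, "width": 75},
    "customMetrics": [
        {"name": "iou", "value": 0.5},
        {"name": "f1", "value": 0.33},
    ],
}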

Open this Colab for an interactive tutorial on how to import predictions with custom metrics in a model.

Scalar custom metrics

A ScalarMetric is a custom metric with a single scalar value. It can be uploaded at the following levels of granularity:
1. Data rows
2. Features
3. Nested features

from labelbox.data.annotation_types import (ScalarMetric,
                                            ScalarMetricAggregation)
# custom metric on a data row 
data_row_metric = ScalarMetric(metric_name="iou_custom", value=0.5)

# custom metric on a feature
feature_metric = ScalarMetric(metric_name="iou_custom", feature_name="cat", value=0.5)

# custom metric on a nested feature
subclass_metric = ScalarMetric(metric_name="iou_custom",
                               feature_name="cat",
                               subclass_name="orange",
                               value=0.5)
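Custom metrics are imported the same way as predictions: include them in a Label's annotations and upload the Label to the model run. A hedged sketch (the exact Label data constructor varies across SDK versions, so check yours):

from labelbox.data.annotation_types import Label

# Attach the metrics to the data row they describe (here via global key)
label = Label(
    data={"global_key": "<global_key>"},
    annotations=[data_row_metric, feature_metric, subclass_metric],
)

# "custom_metrics_upload" is just a name for this import job
upload_job = model_run.add_predictions(
    name="custom_metrics_upload",
    predictions=[label],
)
upload_job.wait_until_done()
print(upload_job.errors)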

Aggregation of custom metrics

This is an optional field on the ScalarMetric object that controls how custom metrics are aggregated. By default, the aggregation uses ARITHMETIC_MEAN; ScalarMetricAggregation also offers SUM (used below), GEOMETRIC_MEAN, and HARMONIC_MEAN.

Aggregations occur in the following cases:

  • When you provide a feature or nested-feature metric, Labelbox automatically aggregates the metric across features and nested features on the data row.
    For example, say you provide a custom metric Bounding Box Width (BBW) on the features "cat" and "dog". The data row-level metric for BBW is the average of these two values.
  • When you create slices, the custom metric is aggregated across data rows of the Slice.
  • When you filter data inside a Model Run, the custom metric is aggregated across the filtered data rows.
"""
If the following metrics are uploaded then
in the Labelbox App, users will see:
true positives dog = 4
true positives cat = 3
true positives = 7
"""

cat_metric = ScalarMetric(metric_name="true_positives",
                          feature_name="cat",
                          value=3,
                          aggregation=ScalarMetricAggregation.SUM)

dog_metric = ScalarMetric(metric_name="true_positives",
                          feature_name="dog",
                          value=4,
                          aggregation=ScalarMetricAggregation.SUM)

Add labels to a model run

Adds data rows and labels to a model run. When you add labels, the associated data rows are also upserted to the model run.

# upsert using label ids 
label_ids = ["<label_id_1>","<label_id_2>", ...]
model_run.upsert_labels(label_ids)

Alternatively, you can add all labels from a project directly to a model run. This also adds all data rows from that project to the model run.

# upsert using project id
model_run.upsert_labels(project_id="<project_id>")

Export labels from a Model Run

This example uses export v2. See Export v2 for Model Runs for more details and the export v2 JSON format.

# Set the export params to include/exclude certain fields in the export
export_params= {
    "attachments": True,
    "metadata_fields": True,
    "data_row_details": True,
}

export_task = model_run.export_v2(params=export_params)
export_task.wait_till_done()
print(export_task.errors)
export_json = export_task.result
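The result is a list of dictionaries, one per data row in the model run. A quick sketch to inspect it (the field layout follows the export v2 format):

# Print the ID of each exported data row
for row in export_json:
    print(row["data_row"]["id"])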

Create a model run config

Create, modify, and delete a model run config to track your hyperparameters.

example_config = {
    "learning_rate": 0.001, 
    "checkpoint_path": "/path/to/checkpoint/file",
    "early_stopping": False,
    "batch_size": 32, 
    "optimizer": { 
      "adam": {
        "beta1": 0.899999976158,
        "beta2": 0.999000012875,
        "epsilon": 9.99999993923e-9
      }
    }, 
    "ngpu": 1, 
}

model_run_1 = model.create_model_run(name="run 1", config=example_config)
# You can also create a model run with the config specified at creation, as above.
# Alternatively, create a model run first, then update its config:
model_run_2 = model.create_model_run(name="run 2")
# update_config replaces the previous model run config with the new JSON input
model_run_2.update_config(example_config)

Get model run config

model_run_parameters = model_run.get_config()

Delete the model run config

model_run.reset_config()

Delete data rows from a model run

data_row_ids = ["<data_row_id_1>","<data_row_id_2>", ...]
model_run.delete_model_run_data_rows(data_row_ids=data_row_ids)

Delete model run

model_run.delete()