Experiments

A developer guide for creating and managing model experiments.

Model

An experiment (a Model object in the SDK) is a container in Labelbox that houses all of the information related to the iterative development of a specific model. It contains the data rows for training, model error analysis metrics, model versioning, and versioned snapshots (model runs) of the data rows, predictions, and other artifacts associated with a model’s development.

Experiments are designed to help you track and compare all of the iterations associated with your model development.

Create a model

model = client.create_model(
  name="<model_name>",
  ontology_id="<ontology_id>"
)

Get a model

# get a single model by its ID
model = client.get_model("<model_id>")

# get models by name (requires: from labelbox import Model)
models = client.get_models(where=(Model.name == "<model_name>"))

Methods

Create a model run

model.create_model_run(name="<model_run_name>")

# optionally, you can supply details of the model run config
model.create_model_run(
  name="<model_run_with_config>",
  config={
    "learning_rate": 0.001,
    "batch_size": 32
  }
)

Delete a model, and its model runs

Deleting a model also deletes its model runs.

This action is permanent; it cannot be undone or rolled back.

model.delete()

Attributes

Get the basics

# name (str)
model.name

Get the model runs

# get the model runs (relationship to ModelRun objects)
model_runs = model.model_runs()

# inspect one model run
next(model_runs)

# inspect all model runs
for model_run in model_runs:
  print(model_run)
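To pick one run out of this collection, a small lookup helper can be handy. The following is a minimal sketch, not part of the SDK: find_model_run_by_name is a hypothetical helper, and it assumes each model run exposes a name attribute. The SimpleNamespace objects stand in for real model runs so the example runs without a Labelbox connection.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_model_run_by_name(model_runs: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first model run whose name matches, or None if there is none."""
    for model_run in model_runs:
        if getattr(model_run, "name", None) == name:
            return model_run
    return None


# Stand-ins for model run objects (hypothetical values, for illustration only).
fake_runs = [SimpleNamespace(name="run 1"), SimpleNamespace(name="run 2")]
match = find_model_run_by_name(fake_runs, "run 2")
missing = find_model_run_by_name(fake_runs, "run 3")
```

With a real client, model.model_runs() would be passed as the first argument.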

Model run

A model run represents a single iteration within a model training experiment. Each model run contains a versioned snapshot of the data rows, annotations (predictions and/or ground truth), and data splits for each iteration within a model training experiment.

Model runs make it easy for you to reproduce a model training experiment using different parameters. You can also use them to track and compare iterations trained on different data versions.

Get all model runs inside a Model

model_runs = model.model_runs()

Create a model run

Creates a model run belonging to this model.

model_run_name = "<your_model_run_name>"
example_config = {
    "learning_rate": 0.001, 
    "batch_size": 32, 
}
model_run = model.create_model_run(name=model_run_name, config=example_config)
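Because each model run carries its own config, one way to compare hyperparameters is to create one model run per parameter combination. The sketch below only builds the config dictionaries; config_grid is a hypothetical helper, and the parameter names and values are illustrative assumptions.

```python
import itertools
from typing import Any, Dict, List, Sequence


def config_grid(base: Dict[str, Any], grid: Dict[str, Sequence[Any]]) -> List[Dict[str, Any]]:
    """Build one config per combination of the values in grid, layered over base."""
    keys = list(grid)
    configs = []
    for values in itertools.product(*(grid[key] for key in keys)):
        config = dict(base)               # shallow copy so base is not modified
        config.update(zip(keys, values))  # grid values override base values
        configs.append(config)
    return configs


configs = config_grid(
    {"batch_size": 32},
    {"learning_rate": [0.001, 0.0001], "batch_size": [32, 64]},
)

# With a real model object, each config could become its own model run:
# for i, config in enumerate(configs):
#     model.create_model_run(name=f"run {i}", config=config)
```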

Get model run

model_run_id = "<your_model_run_id>"
model_run = client.get_model_run(model_run_id=model_run_id)

model_run_data = model_run.model_run_data_rows()
model_run_config = model_run.get_config()

Add data rows to a model run

Add data rows to a model run without any associated labels. You can use either data_row_id or global_key to specify the data rows.

# Using data row ids 
data_row_ids = ["<data_row_id_1>", "<data_row_id_2>"]
model_run.upsert_data_rows(data_row_ids=data_row_ids)

# Using global keys 
global_keys = ["<global_key_1>", "<global_key_2>"]
model_run.upsert_data_rows(global_keys=global_keys)
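For long ID lists, it may be convenient to send the upsert in smaller batches. This is a sketch under assumptions: the source does not state that batching is required, the batch size of 1000 is arbitrary, and chunked and upsert_in_batches are hypothetical helpers. The _RecordingModelRun class is a stand-in that only records calls, so the example runs offline.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each at most size long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def upsert_in_batches(model_run: Any, data_row_ids: Sequence[str], batch_size: int = 1000) -> int:
    """Call upsert_data_rows once per batch and return the number of calls made."""
    calls = 0
    for batch in chunked(data_row_ids, batch_size):
        model_run.upsert_data_rows(data_row_ids=batch)
        calls += 1
    return calls


class _RecordingModelRun:
    """Stand-in for a model run; records each batch instead of calling Labelbox."""

    def __init__(self) -> None:
        self.calls: List[List[str]] = []

    def upsert_data_rows(self, data_row_ids=None, global_keys=None):
        self.calls.append(list(data_row_ids or global_keys or []))


recorder = _RecordingModelRun()
batches_sent = upsert_in_batches(recorder, [f"id_{i}" for i in range(2500)])
```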

Assign data row training, validation, and test split

Note that assign_data_rows_to_split only works on data rows or labels that are already in a model run. You can assign them to one of three splits: "TRAINING", "VALIDATION", or "TEST".

client.enable_experimental = True

dataset = client.get_dataset("<Dataset_id>") # Your training dataset 

# using data row ids 
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[:100],
  split="TRAINING",
)
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[100:150],
  split="VALIDATION",
)
model_run.assign_data_rows_to_split(
  data_row_ids=data_row_ids[150:200],
  split="TEST",
)

# using global keys 
model_run.assign_data_rows_to_split(
  global_keys=global_keys[:100],
  split="TRAINING",
)
model_run.assign_data_rows_to_split(
  global_keys=global_keys[100:150],
  split="VALIDATION",
)
model_run.assign_data_rows_to_split(
  global_keys=global_keys[150:200],
  split="TEST",
)
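The hard-coded slices above can be generalized to ratio-based splits. The following is a minimal sketch: plan_splits and assign_splits are hypothetical helpers, the default ratios are assumptions chosen to reproduce the 100/50/50 split of 200 IDs shown above, and _RecordingModelRun is a stand-in that records calls instead of contacting Labelbox.

```python
from typing import Any, Dict, List, Sequence


def plan_splits(ids: Sequence[str], train: float = 0.5, validation: float = 0.25) -> Dict[str, List[str]]:
    """Partition ids into TRAINING, VALIDATION, and TEST; TEST gets the remainder."""
    if train < 0 or validation < 0 or train + validation > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    train_end = int(len(ids) * train)
    validation_end = train_end + int(len(ids) * validation)
    return {
        "TRAINING": list(ids[:train_end]),
        "VALIDATION": list(ids[train_end:validation_end]),
        "TEST": list(ids[validation_end:]),
    }


def assign_splits(model_run: Any, ids: Sequence[str], **ratios: float) -> Dict[str, List[str]]:
    """Assign every non-empty split on the model run and return the plan used."""
    plan = plan_splits(ids, **ratios)
    for split, split_ids in plan.items():
        if split_ids:
            model_run.assign_data_rows_to_split(data_row_ids=split_ids, split=split)
    return plan


class _RecordingModelRun:
    """Stand-in for a model run; records (split, count) pairs."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def assign_data_rows_to_split(self, data_row_ids=None, global_keys=None, split=None):
        self.calls.append((split, len(data_row_ids or global_keys or [])))


recorder = _RecordingModelRun()
plan = assign_splits(recorder, [f"id_{i}" for i in range(200)])
```

Slicing in order keeps the example deterministic; shuffling the IDs first would be a reasonable variation if the original order is not random.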


Add labels to a model run

Adds labels to a model run. Adding labels also upserts their associated data rows to the model run.

# upsert using label ids 
label_ids = ["<label_id_1>","<label_id_2>", ...]
model_run.upsert_labels(label_ids)

Alternatively, you can add all labels from a project to a model run directly. This also adds all data rows from that project to the model run.

# upsert using project id
model_run.upsert_labels(project_id="<project_id>")

Export labels from a model run

See Export for Model Runs for more details and export JSON format.

import labelbox as lb

# Set the export params to include or exclude optional fields in the export.
export_params = {
    "attachments": True,
    "metadata_fields": True,
    "data_row_details": True,
}

export_task = model_run.export(params=export_params)
export_task.wait_till_done()

# Conditional for errors
if export_task.has_errors():
  export_task.get_buffered_stream(
    stream_type=lb.StreamType.ERRORS
  ).start(stream_handler=lambda error: print(error))

if export_task.has_result():
  stream = export_task.get_buffered_stream()

  # iterate through data rows
  for data_row in stream:
    print(data_row.json)
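Instead of printing each row, the stream can be gathered into a list for later processing. This sketch relies only on the calls shown above (has_result, get_buffered_stream, and the json attribute of each streamed item); collect_export_rows is a hypothetical helper, and _FakeExportTask with its payloads is a stand-in so the example runs without an actual export.

```python
from types import SimpleNamespace
from typing import Any, List


def collect_export_rows(export_task: Any) -> List[Any]:
    """Gather the json payload of every exported data row into a list."""
    rows: List[Any] = []
    if export_task.has_result():
        for data_row in export_task.get_buffered_stream():
            rows.append(data_row.json)
    return rows


class _FakeExportTask:
    """Stand-in for an export task; yields objects with a json attribute."""

    def __init__(self, payloads: List[Any]) -> None:
        self._payloads = payloads

    def has_result(self) -> bool:
        return bool(self._payloads)

    def get_buffered_stream(self):
        return iter(SimpleNamespace(json=payload) for payload in self._payloads)


# Payload contents are made up for illustration; real exports have a richer schema.
rows = collect_export_rows(_FakeExportTask([{"n": 1}, {"n": 2}]))
empty = collect_export_rows(_FakeExportTask([]))
```

Holding every row in memory is only reasonable for small exports; for large ones, processing rows inside the loop is likely the safer choice.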

Create and update model run config

Use the model run config to track hyperparameters.

example_config = {
    "learning_rate": 0.001, 
    "checkpoint_path": "/path/to/checkpoint/file",
    "early_stopping": False,
    "batch_size": 32, 
    "optimizer": { 
      "adam": {
        "beta1": 0.899999976158,
        "beta2": 0.999000012875,
        "epsilon": 9.99999993923e-9
      }
    }, 
    "ngpu": 1, 
}

model_run_1 = model.create_model_run(name="run 1", config=example_config)
# You can also create a model run with the config specified, as above.
# Alternatively, create the model run first and then set its config.
model_run_2 = model.create_model_run(name="run 2")
# The update replaces the previous model run config with the new JSON input.
model_run_2.update_config(example_config)
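Since update_config replaces the stored config rather than merging into it, changing a single hyperparameter means sending the full config again. One approach, sketched here, is to merge the changes into the current config locally first. merge_config is a hypothetical helper, and the sketch assumes get_config returns a plain dictionary.

```python
import copy
from typing import Any, Dict


def merge_config(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config with changes applied; nested dicts are merged recursively."""
    merged = copy.deepcopy(current)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = {
    "learning_rate": 0.001,
    "optimizer": {"adam": {"beta1": 0.9, "beta2": 0.999}},
}
merged = merge_config(base, {"learning_rate": 0.0005, "optimizer": {"adam": {"beta1": 0.8}}})

# With a real model run, the merged result would then be written back:
# model_run.update_config(merge_config(model_run.get_config(), {"learning_rate": 0.0005}))
```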

Get model run config

model_run_parameters = model_run.get_config()

Delete the model run config

model_run.reset_config()

Delete data rows from a model run

data_row_ids = ["<data_row_id_1>","<data_row_id_2>", ...]
model_run.delete_model_run_data_rows(data_row_ids=data_row_ids)

Delete model run

model_run.delete()