Experiments

A developer guide for creating and managing model experiments.

Model

An experiment is a container in Labelbox that houses all of the information related to the iterative development of a specific model. It contains the data rows for training, model error analysis metrics, model versioning, and versioned snapshots (model runs) of the data rows, predictions, and other artifacts associated with a model’s development.

Experiments are designed to help you track and compare all of the iterations associated with your model development.

Create a model

model = client.create_model( name="<model_name>", ontology_id="<ontology_id>" )

Get a model

from labelbox import Model

# Get a model by ID
model = client.get_model("<model_id>")
# Get models by name
models = client.get_models(where=(Model.name == "<model_name>"))

Methods

Create a model run

model.create_model_run(name="<model_run_name>")
# Optionally, supply the details of the model run config
model.create_model_run(
    name="<model_run_with_config>",
    config={"learning_rate": 0.001, "batch_size": 32},
)

Delete a model and its model runs

Deleting a model also deletes its model runs.

This action is permanent; it cannot be undone or rolled back.

model.delete()

Attributes

Get the basics

# name (str)
model.name

Get the model runs

# get the model runs (relationship to ModelRun objects)
model_runs = model.model_runs()
# inspect one model run
next(model_runs)
# inspect all model runs
for model_run in model_runs:
    print(model_run)
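One detail worth keeping in mind when combining next() with a for loop: if the collection returned by model_runs() behaves like a one-shot Python iterator (an assumption here, suggested by the next() call), the item taken by next() is not seen again by the loop. The sketch below illustrates this with a plain Python iterator standing in for the collection; inspect_first_then_rest is a hypothetical helper, not part of the SDK.

```python
def inspect_first_then_rest(items):
    """Return (first, rest) from an iterable, consuming it only once."""
    iterator = iter(items)
    first = next(iterator, None)  # None if the collection is empty
    rest = list(iterator)         # everything after the first item
    return first, rest


# Stand-in values; in practice `items` would be model.model_runs() (assumption)
first, rest = inspect_first_then_rest(iter(["run_a", "run_b", "run_c"]))
print(first)  # the item taken by next()
print(rest)   # the items a following for loop would see
```

If you need every model run in the loop, call model.model_runs() again rather than reusing the partially consumed collection.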

Model run

A model run represents a single iteration within a model training experiment. Each model run contains a versioned snapshot of the data rows, annotations (predictions and/or ground truth), and data splits for each iteration within a model training experiment.

Model runs make it easy for you to reproduce a model training experiment using different parameters. You can also use them to track and compare iterations trained on different data versions.

Get all model runs inside a Model

model_runs = model.model_runs()

Create a model run

Creates a model run belonging to this model.

model_run_name = "<your_model_run_name>"
example_config = {
    "learning_rate": 0.001,
    "batch_size": 32,
}
model_run = model.create_model_run(name=model_run_name, config=example_config)

Get model run

model_run_id = "<your_model_run_id>"
model_run = client.get_model_run(model_run_id=model_run_id)
model_run_data = model_run.model_run_data_rows()
model_run_config = model_run.get_config()

Add data rows to a model run

Add data rows to a model run without any associated labels. You can use either data_row_id or global_key to specify the data rows.

# Using data row ids
data_row_ids = ["<data_row_id_1>", "<data_row_id_2>"]
model_run.upsert_data_rows(data_row_ids=data_row_ids)

# Using global keys
global_keys = ["<global_key_1>", "<global_key_2>"]
model_run.upsert_data_rows(global_keys=global_keys)
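When the list of identifiers is long, one option is to send it in smaller batches. The following is a minimal sketch: chunked is a hypothetical helper, and the batch size of 2 is arbitrary and chosen only for illustration, not a documented SDK limit.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` identifiers."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Placeholder identifiers standing in for real data row ids
example_ids = [f"<data_row_id_{i}>" for i in range(1, 6)]
batches = list(chunked(example_ids, 2))

# Each batch could then be passed to the call shown above, for example:
# for batch in batches:
#     model_run.upsert_data_rows(data_row_ids=batch)
```

The same helper works for global keys by passing each batch as global_keys instead.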

Assign data row training, validation, and test split

Note that assign_data_rows_to_split only works on data rows or labels that are already in a model run. You can assign them to one of three splits: "TRAINING", "VALIDATION", or "TEST".

client.enable_experimental = True
dataset = client.get_dataset("<Dataset_id>")  # Your training dataset

# Using data row ids
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[:100],
    split="TRAINING",
)
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[100:150],
    split="VALIDATION",
)
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[150:200],
    split="TEST",
)

# Using global keys
model_run.assign_data_rows_to_split(
    global_keys=global_keys[:100],
    split="TRAINING",
)
model_run.assign_data_rows_to_split(
    global_keys=global_keys[100:150],
    split="VALIDATION",
)
model_run.assign_data_rows_to_split(
    global_keys=global_keys[150:200],
    split="TEST",
)
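The slice boundaries above are hard-coded for 200 data rows. As a sketch of how they could be derived from ratios instead, the hypothetical helper split_ids below partitions a list of identifiers into the three split names. The default ratios (50% training, 25% validation, remainder test) are assumptions chosen to reproduce the 100/50/50 slices above.

```python
from typing import Dict, List, Sequence


def split_ids(
    ids: Sequence[str], train: float = 0.5, validation: float = 0.25
) -> Dict[str, List[str]]:
    """Partition ids into TRAINING, VALIDATION, and TEST by ratio.

    Whatever remains after the training and validation portions goes to TEST.
    """
    if train < 0 or validation < 0 or train + validation > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * validation)
    return {
        "TRAINING": list(ids[:n_train]),
        "VALIDATION": list(ids[n_train:n_train + n_val]),
        "TEST": list(ids[n_train + n_val:]),
    }


# Placeholder identifiers standing in for real data row ids
example_ids = [f"id_{i}" for i in range(200)]
splits = split_ids(example_ids)

# Each partition could then be assigned with the call shown above, for example:
# for split_name, ids_in_split in splits.items():
#     model_run.assign_data_rows_to_split(data_row_ids=ids_in_split, split=split_name)
```

This keeps the original order of the identifiers; shuffle the list first if you want a random assignment.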

Add labels to a model run

Adds data rows and labels to a model run. By adding labels, the associated data rows will also be upserted to the model run.

# upsert using label ids
label_ids = ["<label_id_1>","<label_id_2>", ...]
model_run.upsert_labels(label_ids)

Alternatively, you can add all labels from a project to a model run directly. This will also add all data rows from that project to the model run.

# upsert using project id
model_run.upsert_labels(project_id="<project_id>")

Export labels from a model run

See Export for Model Runs for more details and export JSON format.

import labelbox

# Set the export params to include or exclude certain fields,
# and check that each requested field appears in the output
export_params = {
    "attachments": True,
    "metadata_fields": True,
    "data_row_details": True,
}
export_task = model_run.export(params=export_params)
export_task.wait_till_done()

# Stream the export using a callback function
def json_stream_handler(output: labelbox.BufferedJsonConverterOutput):
    print(output.json)

export_task.get_buffered_stream(stream_type=labelbox.StreamType.RESULT).start(stream_handler=json_stream_handler)

# Collect all exported data into a list
export_json = [data_row.json for data_row in export_task.get_buffered_stream()]

Create, modify, and delete model run config to track hyperparameters.

example_config = {
    "learning_rate": 0.001,
    "checkpoint_path": "/path/to/checkpoint/file",
    "early_stopping": False,
    "batch_size": 32,
    "optimizer": {
        "adam": {
            "beta1": 0.899999976158,
            "beta2": 0.999000012875,
            "epsilon": 9.99999993923e-9,
        }
    },
    "ngpu": 1,
}
model_run_1 = model.create_model_run(name="run 1", config=example_config)

# As shown above, you can create a model run with its config specified.
# Alternatively, create the model run first and then update its config.
model_run_2 = model.create_model_run(name="run 2")
# The update replaces the previous model run config with the new JSON input.
model_run_2.update_config(example_config)
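Because the update replaces the previous config rather than merging into it, changing a single hyperparameter means sending the full config again. One way to handle this is to merge the changes into the current config locally before calling update_config. The sketch below uses a hypothetical merge_config helper and assumes the config is a plain nested dictionary, as in the example above.

```python
from typing import Any, Dict


def merge_config(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config with `changes` merged recursively into `current`."""
    merged = dict(current)  # shallow copy so `current` is left untouched
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative values only
current = {"learning_rate": 0.001, "batch_size": 32, "optimizer": {"adam": {"beta1": 0.9}}}
changes = {"batch_size": 64, "optimizer": {"adam": {"epsilon": 1e-8}}}
merged = merge_config(current, changes)

# The merged dictionary could then be sent as the full replacement, for example:
# model_run_2.update_config(merged)
```

Merging locally keeps the keys you did not intend to change, which a direct update with only the changed keys would drop.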

Get model run config

model_run_parameters = model_run.get_config()
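Since model runs are meant to be compared, it can help to see which hyperparameters differ between two runs. The following is a minimal sketch, assuming get_config() returns a plain dictionary; diff_configs and the marker string "<missing>" are hypothetical and not part of the SDK.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key that is absent from one config


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each top-level key whose value differs to its (a, b) pair."""
    keys = sorted(set(a) | set(b))
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in keys
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


# Illustrative configs; in practice these could come from
# model_run_1.get_config() and model_run_2.get_config()
config_1 = {"learning_rate": 0.001, "batch_size": 32}
config_2 = {"learning_rate": 0.01, "batch_size": 32, "ngpu": 1}
differences = diff_configs(config_1, config_2)
print(differences)
```

Only top-level keys are compared here, so a nested section such as an optimizer block is reported as a whole when any value inside it differs.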

Delete the model run config

model_run.reset_config()

Delete data rows from a model run

data_row_ids = ["<data_row_id_1>","<data_row_id_2>", ...]
model_run.delete_model_run_data_rows(data_row_ids=data_row_ids)

Delete model run

model_run.delete()