A developer guide for creating and managing model experiments.
Model
An experiment is a container in Labelbox that houses all of the information related to the iterative development of a specific model. It contains the data rows for training, model error analysis metrics, model versioning, and versioned snapshots (model runs) of the data rows, predictions, and other artifacts associated with a model’s development.
Experiments are designed to help you track and compare all of the iterations associated with your model development.
Create a model
model = client.create_model(
    name="<model_name>",
    ontology_id="<ontology_id>"
)
Get a model
model = client.get_model("<model_id>")
from labelbox import Model  # needed for the name filter below
models = client.get_models(where=(Model.name == "<model_name>"))
Methods
Create a model run
model.create_model_run(name="<model_run_name>")
# optionally, you can supply details of the model run config
model.create_model_run(
    name="<model_run_with_config>",
    config={
        "learning_rate": 0.001,
        "batch_size": 32
    }
)
Delete a model, and its model runs
Deleting a model also deletes its model runs.
This action is permanent; it cannot be undone or rolled back.
model.delete()
Attributes
Get the basics
# name (str)
model.name
Get the model runs
# get the model runs (relationship to ModelRun objects)
model_runs = model.model_runs()
# inspect one model run
next(model_runs)
# inspect all model runs
for model_run in model_runs:
    print(model_run)
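Because the model runs are iterated one at a time, looking up a run by name amounts to scanning until the first match. The sketch below shows one way to do that. find_by_name is a hypothetical helper, not part of the Labelbox SDK, and it assumes each model run object exposes a name attribute.

```python
from typing import Any, Iterable, Optional


def find_by_name(items: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first item whose `name` attribute equals `name`, else None.

    Hypothetical helper (not part of the Labelbox SDK). Scanning stops at the
    first match, so the rest of the collection is not walked.
    """
    return next(
        (item for item in items if getattr(item, "name", None) == name),
        None,
    )


# Usage with the objects above (assumes a connected client and an existing model):
# model_run = find_by_name(model.model_runs(), "<model_run_name>")
```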
Model run
A model run represents a single iteration within a model training experiment. Each model run contains a versioned snapshot of the data rows, annotations (predictions and/or ground truth), and data splits for each iteration within a model training experiment.
Model runs make it easy for you to reproduce a model training experiment using different parameters. You can also use them to track and compare runs trained on different data versions.
Get all model runs inside a Model
model_runs = model.model_runs()
Create a model run
Creates a model run belonging to this model.
model_run_name = "<your_model_run_name>"
example_config = {
    "learning_rate": 0.001,
    "batch_size": 32,
}
model_run = model.create_model_run(name=model_run_name, config=example_config)
Get model run
model_run_id = "<your_model_run_id>"
model_run = client.get_model_run(model_run_id=model_run_id)
model_run_data = model_run.model_run_data_rows()
model_run_config = model_run.get_config()
Add data rows to a model run
Add data rows to a model run without any associated labels. You can use either data_row_id or global_key to specify the data rows.
# Using data row ids
data_row_ids = ["<data_row_id_1>", "<data_row_id_2>"]
model_run.upsert_data_rows(data_row_ids=data_row_ids)
# Using global keys
global_keys = ["<global_key_1>", "<global_key_2>"]
model_run.upsert_data_rows(global_keys=global_keys)
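For long ID lists, you may prefer to send the upsert in smaller batches. The sketch below is a plain Python helper for that. chunked is a hypothetical name, and the batch size in the usage comment is an arbitrary choice; this guide does not state any size limit for upsert_data_rows.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids` holding at most `batch_size` items.

    Hypothetical helper, not part of the Labelbox SDK.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


# Usage with the call shown above (1000 is an arbitrary batch size):
# for batch in chunked(data_row_ids, 1000):
#     model_run.upsert_data_rows(data_row_ids=batch)
```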
Assign data row training, validation, and test split
Note that assign_data_rows_to_split only works on data rows or labels that are already in a model run. You can assign them to one of three splits: "TRAINING", "VALIDATION", or "TEST".
client.enable_experimental = True
dataset = client.get_dataset("<Dataset_id>")  # Your training dataset
# using data row ids
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[:100],
    split="TRAINING",
)
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[100:150],
    split="VALIDATION",
)
model_run.assign_data_rows_to_split(
    data_row_ids=data_row_ids[150:200],
    split="TEST",
)
# using global keys
model_run.assign_data_rows_to_split(
    global_keys=global_keys[:100],
    split="TRAINING",
)
model_run.assign_data_rows_to_split(
    global_keys=global_keys[100:150],
    split="VALIDATION",
)
model_run.assign_data_rows_to_split(
    global_keys=global_keys[150:200],
    split="TEST",
)
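The example above slices the ID list at fixed positions. If you would rather derive the three groups from ratios, a minimal sketch follows. split_ids, its default ratios, and the fixed seed are all illustrative assumptions and not part of the Labelbox SDK; only the three split names come from this guide.

```python
import random
from typing import Dict, List, Sequence


def split_ids(
    ids: Sequence[str],
    train: float = 0.8,
    validation: float = 0.1,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle `ids` reproducibly and partition them by split name.

    Hypothetical helper. Whatever is left after the training and validation
    shares goes to the TEST split.
    """
    if train < 0 or validation < 0 or train + validation > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return {
        "TRAINING": shuffled[:n_train],
        "VALIDATION": shuffled[n_train:n_train + n_val],
        "TEST": shuffled[n_train + n_val:],
    }


# Usage with the call shown above (the data rows must already be in the model run):
# for split, ids in split_ids(data_row_ids).items():
#     model_run.assign_data_rows_to_split(data_row_ids=ids, split=split)
```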
Add labels to a model run
Adds data rows and labels to a model run. When you add labels, the associated data rows are also upserted to the model run.
# upsert using label ids
label_ids = ["<label_id_1>","<label_id_2>", ...]
model_run.upsert_labels(label_ids)
Alternatively, you can add all labels from a project to a model run directly. This also adds all data rows from that project to the model run.
# upsert using project id
model_run.upsert_labels(project_id="<project_id>")
Export labels from a model run
See Export for Model Runs for more details and export JSON format.
# Set the export params to include or exclude optional fields.
# Check that each field you request appears in the exported rows.
export_params = {
    "attachments": True,
    "metadata_fields": True,
    "data_row_details": True,
}
export_task = model_run.export(params=export_params)
export_task.wait_till_done()
# Print any errors reported by the export task
# (lb is the labelbox package: import labelbox as lb)
if export_task.has_errors():
    export_task.get_buffered_stream(
        stream_type=lb.StreamType.ERRORS
    ).start(stream_handler=lambda error: print(error))
if export_task.has_result():
    stream = export_task.get_buffered_stream()
    # iterate through data rows
    for data_row in stream:
        print(data_row.json)
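Instead of printing each exported row, you may want to keep the results on disk. The sketch below writes one JSON object per line. write_ndjson is a hypothetical helper, not part of the Labelbox SDK, and it assumes each streamed item exposes a JSON-serializable payload through its json attribute, as the loop above suggests.

```python
import json
from typing import Any, Iterable


def write_ndjson(rows: Iterable[Any], path: str) -> int:
    """Write each row's `.json` payload as one JSON line and return the row count.

    Hypothetical helper. Assumes `row.json` is JSON-serializable (for example
    a dict); rows are written as they arrive, so nothing is buffered in memory.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row.json) + "\n")
            count += 1
    return count


# Usage with the stream above (the file name is arbitrary):
# if export_task.has_result():
#     write_ndjson(export_task.get_buffered_stream(), "model_run_export.ndjson")
```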
Create, modify, and delete model run config to track hyperparameters.
example_config = {
    "learning_rate": 0.001,
    "checkpoint_path": "/path/to/checkpoint/file",
    "early_stopping": False,
    "batch_size": 32,
    "optimizer": {
        "adam": {
            "beta1": 0.899999976158,
            "beta2": 0.999000012875,
            "epsilon": 9.99999993923e-9
        }
    },
    "ngpu": 1,
}
model_run_1 = model.create_model_run(name="run 1", config=example_config)
# You can also create a model run with the config specified, as shown above.
# Here is how to create a model run first and then update its config field.
model_run_2 = model.create_model_run(name="run 2")
# The update replaces the previous model run config with the new JSON input.
model_run_2.update_config(example_config)
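Because an update replaces the whole config, changing a single hyperparameter means sending the complete config again. One way to handle that is to merge the change into the current config locally first. The sketch below does this with a recursive merge. merged_config is a hypothetical helper, not part of the Labelbox SDK, and the usage comment assumes get_config() returns a plain dict.

```python
import copy
from typing import Any, Dict


def merged_config(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict: `current` with `changes` applied on top.

    Hypothetical helper. Nested dicts are merged key by key; any other value
    in `changes` overwrites the one in `current`. Neither input is modified.
    """
    result = copy.deepcopy(current)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merged_config(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Usage with the objects above:
# new_config = merged_config(model_run_2.get_config(), {"learning_rate": 0.01})
# model_run_2.update_config(new_config)
```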
Get model run config
model_run_parameters = model_run.get_config()
Delete the model run config
model_run.reset_config()
Delete data rows from a model run
data_row_ids = ["<data_row_id_1>","<data_row_id_2>", ...]
model_run.delete_model_run_data_rows(data_row_ids=data_row_ids)
Delete model run
model_run.delete()