This guide will walk you through the process of creating your first model run in Labelbox. By the end of this tutorial, you will have a new model run configured with your model’s predictions, ready for analysis.

Before you start

  • Import data rows: You’ll need a set of data rows to attach your predictions to. If you do not already have data rows in Labelbox, you’ll need to import that data first.
  • Create an ontology: To create a model run, you’ll need to specify the ontology (also called taxonomy) that corresponds to the set of predictions. You may want to re-use an ontology that already exists in Labelbox (e.g., an ontology already used for a labeling project). If the ontology you need for your model predictions does not exist in Labelbox yet, you’ll need to create it first.

Step 1: Create an experiment and your first model run

An experiment is the top-level container for a specific modeling task (e.g., “Detecting Defects in Solar Panels”). Within that experiment, each training iteration is tracked as a model run. A model run holds a specific set of predictions and the model configuration used to generate them. You have two primary ways to create an experiment:
  • Option A: Start from Catalog
    1. Go to the Catalog tab.
    2. Use the filters to select the dataset or a subset of data rows you want to use for this experiment.
    3. Click the Manage selection button at the bottom of the screen.
    4. Select New experiment from the action menu.
  • Option B: Start from the Model tab
    1. Navigate to the Model tab.
    2. Click the + Create button in the top right and select Experiment.
    3. Select the batch of data rows you wish to include in this experiment.
Once you create the experiment, you will be immediately prompted to configure its first model run. Give your model run a descriptive name that will help you identify it later, such as YOLOv8-baseline-v1 or ResNet50-initial-training. Then, specify the ontology to use for the model run.
If you are running multiple model iterations on the same machine learning task, the best practice is to keep those model runs under the same experiment. This lets you visualize and compare the performance of the different model runs side by side.
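The UI steps above can also be scripted. Below is a minimal sketch using the Labelbox Python SDK, where an experiment corresponds to a `Model` object. The ontology ID is a placeholder you must supply, and method names should be verified against your installed SDK version:

```python
import os

# Hedged sketch: create an experiment ("Model" in the SDK) and its first
# model run. "<ONTOLOGY_ID>" is a placeholder, not a real ID.
EXPERIMENT_NAME = "Detecting Defects in Solar Panels"
RUN_NAME = "YOLOv8-baseline-v1"

if os.environ.get("LABELBOX_API_KEY"):  # only runs when credentials are set
    import labelbox as lb

    client = lb.Client(api_key=os.environ["LABELBOX_API_KEY"])
    # The SDK's Model is the experiment-level container from the UI.
    model = client.create_model(name=EXPERIMENT_NAME,
                                ontology_id="<ONTOLOGY_ID>")
    model_run = model.create_model_run(name=RUN_NAME)
```

Descriptive run names like the one above make it easy to tell iterations apart when comparing runs later.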
Next, you will see several optional but highly recommended steps for configuring your model run. While you can adjust these settings at any time, configuring them now will significantly streamline your analysis workflow.

Optional step: Include ground truth labels

To automatically calculate performance metrics like precision, recall, and IoU, Labelbox needs to compare your model’s predictions against a “source of truth”. This step allows you to link your model run to a Labelbox Project that contains your ground truth annotations.
  • Why this is important: Without ground truth, you can visualize your model’s predictions, but you cannot quantitatively score its performance or identify where it is correct or incorrect.
  • How to do it: Simply select the project containing the relevant, reviewed labels from the dropdown menu. If you don’t have labeled data yet or if it resides in a different project, you can skip this for now and link it later from the model run settings.
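Linking ground truth can also be done programmatically. The sketch below assumes the SDK's `upsert_labels` method for attaching a project's reviewed labels to a model run; the IDs are placeholders, and the exact retrieval method may vary by SDK version:

```python
import os

# Hedged sketch: attach a project's ground-truth labels to a model run.
# "<PROJECT_ID>" and "<MODEL_RUN_ID>" are placeholders, not real IDs.
GROUND_TRUTH_PROJECT_ID = "<PROJECT_ID>"  # project holding reviewed labels

if os.environ.get("LABELBOX_API_KEY"):  # only runs when credentials are set
    import labelbox as lb

    client = lb.Client(api_key=os.environ["LABELBOX_API_KEY"])
    model_run = client.get_model_run("<MODEL_RUN_ID>")
    model_run.upsert_labels(project_id=GROUND_TRUTH_PROJECT_ID)
```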

Optional step: Include existing model predictions

If you have already generated a set of predictions from your model, you can associate them with this run immediately.
  • Why this is important: This step populates your model run with your model’s output, making it ready for analysis as soon as you finish the setup.
  • How to do it: The primary method for uploading predictions is via our Python SDK. In this initial setup screen, you can select an existing upload or choose to upload them in the next step. The detailed, step-by-step guide for generating and uploading this payload is covered in the next section of this tutorial.

Optional step: Define data splits

A fundamental practice in machine learning is to segment your data into Training, Validation, and Test sets. This helps you evaluate your model’s ability to generalize to new, unseen data.
  • Why this is important: Creating splits allows you to analyze your model’s performance on your validation or test data separately from the data it was trained on. This is critical for diagnosing overfitting and ensuring your model will perform well in the real world.
  • How to do it: You have two flexible options for creating splits:
    1. Split by percentage: Easily divide your data by specifying a percentage for each split (e.g., 80% training, 10% validation, 10% test). Labelbox will handle the random assignment of data rows.
    2. Use existing slices: For more control, you can assign pre-existing Slices to your splits. This is useful if you have specific, curated datasets you want to use for validation or testing.
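To make the split-by-percentage option concrete, here is a small local sketch of what an 80/10/10 random assignment looks like. This is an illustration of the concept, not Labelbox's actual implementation; the function name and seed are our own:

```python
import random

def assign_splits(data_row_ids, train=0.8, val=0.1, test=0.1, seed=42):
    """Randomly partition data row IDs into train/validation/test splits,
    mirroring a split-by-percentage assignment."""
    assert abs(train + val + test - 1.0) < 1e-9, "fractions must sum to 1"
    ids = list(data_row_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for reproducibility
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * val)
    return {
        "training": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }

splits = assign_splits([f"row-{i}" for i in range(100)])
print({k: len(v) for k, v in splits.items()})
# {'training': 80, 'validation': 10, 'test': 10}
```

Because assignment is random, class balance can drift between splits on small datasets; curated Slices (option 2) give you full control when that matters.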
After completing these optional steps, click Create model run. You have now created a fully configured structure for your experiment. The next step is to populate it with your model’s outputs.

Step 2: Upload predictions to the model run

This is the most critical step. Here, you will upload your model’s predictions to the model run you just created. This allows Labelbox to visualize your model’s outputs against the ground truth labels and calculate performance metrics. The most powerful and flexible way to upload predictions is by using our Python SDK.

Conceptual overview

The process involves formatting your predictions into a specific structure that Labelbox can understand and then using an SDK command to upload them. Each prediction must be linked to a specific Data Row ID to ensure it is matched with the correct source media (image, text, or video). Your predictions can be simple (e.g., a bounding box and a class name) or they can include optional information like confidence scores, which unlock more powerful analysis like building precision-recall curves.
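As a rough sketch of that structure, the helper below builds one bounding-box prediction in the NDJSON-style format the Python SDK accepts for imports. The field names reflect our understanding of that schema; consult the SDK documentation for the authoritative version, and note that `"<DATA_ROW_ID>"` is a placeholder:

```python
import uuid

def bbox_prediction(data_row_id, class_name, top, left, height, width,
                    confidence=None):
    """Build one bounding-box prediction dict (hedged sketch of the
    NDJSON import format)."""
    pred = {
        "uuid": str(uuid.uuid4()),       # unique ID for this prediction
        "dataRow": {"id": data_row_id},  # links prediction to source media
        "name": class_name,              # must match a class in the ontology
        "bbox": {"top": top, "left": left, "height": height, "width": width},
    }
    if confidence is not None:
        pred["confidence"] = confidence  # optional; enables PR-curve analysis
    return pred

pred = bbox_prediction("<DATA_ROW_ID>", "defect", 10, 20, 50, 80,
                       confidence=0.87)
```

A list of such dicts is what you would pass to the SDK's upload call for the model run (see the guides below for the exact method per media type).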

Step-by-step guides

For detailed instructions and code examples for uploading predictions to your model via the Python SDK, please visit these pages:

  • Upload image predictions
  • Upload text predictions
  • Upload document predictions
  • Upload conversational text predictions
  • Upload video predictions
  • Upload geospatial predictions
  • Upload HTML predictions

Step 3: Update your model run config file

For your experiments to be scientific and reproducible, you must track what changed between model runs. Labelbox allows you to store a configuration file (as JSON) with every run. This is the perfect place to log hyperparameters, model versions, or data preprocessing steps. To edit the model run’s configuration file:
  1. Go to your Model Run page.
  2. Click the Settings icon and select Model run config.
  3. Edit the JSON file to include the hyperparameters for this model run.
Example configuration:
{
  "model_architecture": "YOLOv8-large",
  "training_epochs": 150,
  "learning_rate": 0.001,
  "optimizer": "Adam",
  "image_size": "640x640",
  "data_augmentation": {
    "horizontal_flip": true,
    "rotation_range": 15
  }
}
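The same configuration can be set from code. The sketch below assumes the SDK's `update_config` method on a model run; the model run ID is a placeholder, and the retrieval call may differ between SDK versions:

```python
import os

# Hedged sketch: store the run's hyperparameters as its config.
config = {
    "model_architecture": "YOLOv8-large",
    "training_epochs": 150,
    "learning_rate": 0.001,
    "optimizer": "Adam",
    "image_size": "640x640",
    "data_augmentation": {"horizontal_flip": True, "rotation_range": 15},
}

if os.environ.get("LABELBOX_API_KEY"):  # only runs when credentials are set
    import labelbox as lb

    client = lb.Client(api_key=os.environ["LABELBOX_API_KEY"])
    model_run = client.get_model_run("<MODEL_RUN_ID>")  # placeholder ID
    model_run.update_config(config)
```

Storing the config alongside each run means that, months later, you can still see exactly which hyperparameters produced which metrics.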
Congratulations! You have successfully created and configured your first Model Run. You are now ready to dive into the analysis tools to see what your model has learned.