Train model

An overview of the model training integration.

With the model training integration, you can now train a model from your labeled data in Labelbox with just one click. This feature also comes with a set of model and data management tools that allow you to track model and data versioning, manage data splits, and improve model reproducibility.

You can use it to:

  • Quickly kick off low-code model development.

  • Validate a hypothesis or business case with minimal ML effort.

  • Train a model to accelerate data labeling.



The model training integration is currently available for Pro and Enterprise customers.

How does the model training integration work?

The model training pipeline lets you run ETL jobs, train models, deploy models, and track model performance all from a single service.


Reference architecture of the model training integration.

You can deploy the model training service to your cloud account to receive training requests from Labelbox. Labelbox then launches a sequence of jobs in a pipeline and reports back the model results. You only need to deploy the model training service to the cloud once; after that, you can access the integration through the Labelbox model training UI.
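The request-then-pipeline flow described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Labelbox or Vertex AI API: `TrainingRequest`, `PipelineResult`, `run_pipeline`, and the four stage functions are hypothetical names, and the stage bodies are stand-ins that only pass placeholder values along.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class TrainingRequest:
    """Hypothetical payload a training service might receive."""
    model_run_id: str
    task: str


@dataclass
class PipelineResult:
    """What the service could report back once every job has run."""
    model_run_id: str
    completed_stages: List[str] = field(default_factory=list)
    outputs: Dict[str, dict] = field(default_factory=dict)


# A stage receives the request plus the outputs of earlier stages.
Stage = Callable[[TrainingRequest, Dict[str, dict]], dict]


def etl(request: TrainingRequest, outputs: Dict[str, dict]) -> dict:
    # Stand-in: a real ETL job would export and convert labeled data.
    return {"dataset": f"dataset-for-{request.model_run_id}"}


def train(request: TrainingRequest, outputs: Dict[str, dict]) -> dict:
    # Stand-in: a real job would start training on the prepared dataset.
    return {"model": f"model-trained-on-{outputs['etl']['dataset']}"}


def deploy(request: TrainingRequest, outputs: Dict[str, dict]) -> dict:
    # Stand-in: a real job would deploy the trained model.
    return {"endpoint": f"endpoint-for-{outputs['train']['model']}"}


def track(request: TrainingRequest, outputs: Dict[str, dict]) -> dict:
    # Stand-in: a real job would compute and upload model metrics.
    return {"status": "metrics-reported"}


DEFAULT_STAGES: List[Tuple[str, Stage]] = [
    ("etl", etl),
    ("train", train),
    ("deploy", deploy),
    ("track", track),
]


def run_pipeline(
    request: TrainingRequest, stages: List[Tuple[str, Stage]]
) -> PipelineResult:
    """Run the stages in order, recording each stage's output by name."""
    result = PipelineResult(model_run_id=request.model_run_id)
    for name, stage in stages:
        result.outputs[name] = stage(request, result.outputs)
        result.completed_stages.append(name)
    return result
```

Keeping the stages in an ordered list means a custom pipeline could swap or add a stage without changing `run_pipeline` itself.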

You can integrate your model training services with Labelbox by following the guide in model training service API integration.

Labelbox provides a reference implementation of the training service integration based on GCP’s Vertex AI. See the example in Train a model with model training integration. Once you set up the Google Vertex AutoML service, you can kick off model training jobs on Vertex from the Labelbox UI. To build your own pipeline on top of the reference implementation, follow the instructions in Customize a model training pipeline based on Labelbox reference implementation.

Which models are supported?

Labelbox provides you with two options for selecting a model architecture:

  1. Select a model architecture from a list of supported reference models (see the table below).
  2. Set up your own custom model training pipeline hosted on your cloud provider. Visit our GitHub repository for reference architecture code that you can modify and reuse.

Reference models

To help you jumpstart your model training, Labelbox provides you with a list of out-of-the-box machine learning models offered by Vertex AutoML.

Below are the supported data types and tasks for model training with Labelbox.

Data Types    Tasks
Images        Object detection
              Single-class classification
              Multi-class classification
Text          Named entity recognition
              Single-class classification
              Multi-class classification
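
If you start training jobs from your own scripts, it can help to check a data type and task combination against the table above before sending a request. The following is a minimal sketch, assuming the table is complete; `SUPPORTED_TASKS` and `is_supported` are hypothetical names for illustration and are not part of the Labelbox SDK.

```python
# Supported combinations, copied from the table above (lowercased).
SUPPORTED_TASKS = {
    "images": {
        "object detection",
        "single-class classification",
        "multi-class classification",
    },
    "text": {
        "named entity recognition",
        "single-class classification",
        "multi-class classification",
    },
}


def is_supported(data_type: str, task: str) -> bool:
    """Return True if the task is listed for the given data type."""
    tasks = SUPPORTED_TASKS.get(data_type.strip().lower(), set())
    return task.strip().lower() in tasks
```

For example, `is_supported("Images", "Object detection")` is true, while `is_supported("Text", "Object detection")` is false because object detection is listed only for images.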

Custom model training

To train your own custom model, visit our GitHub repository to review the reference architecture implementation, then add your custom model pipeline by following the instructions in Customize a model training pipeline based on Labelbox reference implementation.

Automatically connect to model error analysis tools

Model training is automatically connected to our model error analysis tools, allowing you to access powerful model evaluation tools without writing a single line of code. Once your model training is complete, you can instantly visualize your model’s evaluation and performance on each data row, find errors, and identify what to improve next.

Open source models

To try out the Model product's functionality, navigate to the Model tab and select one of the open-source models.

Each open-source model contains multiple model runs. You will see seeded labels, model predictions, and model metrics in these model runs.

Trying out these open-source models will not impact your usage or billing.