Train models

An overview of the Model Training integration.

With the Model Training integration, you can now train a model from your labeled data in Labelbox with one click. This feature also comes with a set of model and data management tools that let you track model and data versioning, manage data splits, and improve model reproducibility.

With the Model Training integration, you can:

  • Quickly kick off low-code model development

  • Validate a hypothesis or business case with minimal ML effort

  • Train a model to accelerate data labeling

📘

Availability

Currently available for Pro and Enterprise customers.

How does the Model Training integration work?

A reference architecture of the Model Training integration

The Model Training pipeline lets you run ETL jobs, train models, deploy models, and track model performance, all from a single service. You deploy the Model Training service to your cloud account or to a customer-managed infrastructure cluster, where it receives training requests from Labelbox. Labelbox then launches a sequence of jobs in a pipeline and reports the model results back. You only need to deploy the Model Training service to the cloud once; after that, you can access the integration through the Labelbox Model Training UI.
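The job sequence described above can be pictured as a list of stages that run in order and pass a shared context along. The following is only a minimal sketch under stated assumptions: the stage functions, the `model_run_id` field, and the shape of the result are hypothetical placeholders, not the Labelbox service API or the reference implementation.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Stage = Callable[[Context], Context]


# Hypothetical placeholder stages, named after the four activities above.
def etl(ctx: Context) -> Context:
    # A real job would export labeled data and convert it for training.
    return {**ctx, "dataset": f"dataset-for-{ctx['model_run_id']}"}


def train(ctx: Context) -> Context:
    return {**ctx, "model": f"model-from-{ctx['dataset']}"}


def deploy(ctx: Context) -> Context:
    return {**ctx, "endpoint": f"endpoint-for-{ctx['model']}"}


def evaluate(ctx: Context) -> Context:
    # A real job would compute metrics from model predictions.
    return {**ctx, "metrics": {}}


PIPELINE: List[Tuple[str, Stage]] = [
    ("etl", etl),
    ("train", train),
    ("deploy", deploy),
    ("evaluate", evaluate),
]


def run_pipeline(request: Context) -> Context:
    """Run each stage in order and return a result to report back."""
    ctx: Context = dict(request)
    ctx["completed"] = []
    for name, stage in PIPELINE:
        try:
            ctx = stage(ctx)
        except Exception as exc:
            # Stop at the first failing stage and record where it failed.
            ctx["status"] = "failed"
            ctx["failed_stage"] = name
            ctx["error"] = str(exc)
            return ctx
        ctx["completed"] = ctx["completed"] + [name]
    ctx["status"] = "succeeded"
    return ctx


if __name__ == "__main__":
    print(run_pipeline({"model_run_id": "run-1"})["status"])
```

Keeping the stages in an ordered list makes it easy to see which job failed and which ones completed, which is the kind of information a service would need in order to report results back.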

You can integrate your own model training service with Labelbox by following the guide in Model training service API integration.

Labelbox provides a reference implementation of the training service integration based on Google Cloud's Vertex AI. See the example in Train a model with model training integration. Once you set up the Google Vertex AutoML service, you can kick off model training jobs on Vertex from the Labelbox UI. To build your own pipeline, follow the instructions in Customize a model training pipeline based on Labelbox reference implementation.

Which models are supported?

Labelbox provides you with two options for selecting a model architecture.

  1. Select a model architecture from a list of supported Reference models (see the table below).
  2. Set up your own custom model training pipeline hosted on your cloud provider. Visit our GitHub repo for reference architecture code that you can modify and reuse.

Reference models

To help you jumpstart your model training, Labelbox provides you with a list of out-of-the-box machine learning models offered by Vertex AutoML.

Below are the supported data types and tasks for Model Training with Labelbox.

Data type    Tasks
Images       Object detection; single-class classification; multi-class classification
Text         Named entity recognition; single-class classification; multi-class classification
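
If you drive training from a script, it can help to check a data type and task pair against the supported combinations before submitting anything. The following is a minimal sketch that simply encodes the table above as a lookup; the names `SUPPORTED_TASKS` and `is_supported` are hypothetical and not part of any Labelbox SDK.

```python
# Supported combinations, copied from the table above.
SUPPORTED_TASKS = {
    "images": {
        "object detection",
        "single-class classification",
        "multi-class classification",
    },
    "text": {
        "named entity recognition",
        "single-class classification",
        "multi-class classification",
    },
}


def is_supported(data_type: str, task: str) -> bool:
    """Return True if the table lists this task for this data type."""
    tasks = SUPPORTED_TASKS.get(data_type.strip().lower(), set())
    return task.strip().lower() in tasks


if __name__ == "__main__":
    print(is_supported("Images", "Object detection"))
    print(is_supported("Text", "Object detection"))
```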

Custom model training

To use your own custom model for Model Training, visit our GitHub repo to check out the reference architecture implementation, then add your custom model pipeline by following the instructions in Customize a model training pipeline based on Labelbox reference implementation.

Automatically connect to Model Diagnostics

Model Training is automatically connected to our Model Diagnostics tool, so you can access model evaluation tools without writing a single line of code. Once your model training is complete, you can immediately visualize your model's evaluation and performance on each Data Row, find errors, and identify what to improve next.

