Host custom models
Before integrating your custom model, you need to deploy it on an HTTP endpoint accessible via the internet that accepts HTTP POST calls with a JSON payload. You can host it either on your own infrastructure or through a model hosting vendor, such as Vertex AI, Databricks, Hugging Face, Replicate, or OpenAI.
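As a concrete illustration, here is a minimal sketch of such an endpoint using Flask. Everything in it (the `/predict` route, the bearer-token check, and the payload handling) is an assumption for illustration; the actual request and response schemas are defined by the integration.

```python
# Minimal sketch of a custom-model endpoint, assuming Flask.
# The /predict route, auth header, and field names are illustrative
# assumptions, not the official payload schema.
from flask import Flask, jsonify, request

app = Flask(__name__)

AUTH_TOKEN = "my-secret"  # hypothetical value matching the integration's Secret field


def run_model(payload: dict) -> dict:
    """Stub inference: replace with real model logic over the data row."""
    return {"prediction": "example-label"}


@app.route("/predict", methods=["POST"])
def predict():
    # If the integration is configured with a Secret, validate it here.
    if request.headers.get("Authorization") != f"Bearer {AUTH_TOKEN}":
        return jsonify({"error": "unauthorized"}), 401
    payload = request.get_json(force=True)  # the JSON payload described above
    return jsonify(run_model(payload))


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```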
Create model integrations
Once you have a public HTTP endpoint for your custom model, you can create the integration:
- On the Models page, click Create and select Custom Model.
- Select the data type for the model.
- Add custom model information, including:
  - Name: A unique identifier for the model.
  - HTTP endpoint: The URL of the HTTP endpoint hosting your model.
  - Secret (optional): The authentication token for secret-secured endpoints only.
  - Description (optional): The descriptive context of the model.
- Click Create model.
Bounding box and mask tasks not supported
Currently, this model integration flow doesn't support tasks involving bounding box and mask annotations. To integrate a custom model for these tasks, see Create model integrations for bounding box and mask tasks.
Create model integrations for bounding box and mask tasks
For a custom model predicting bounding box and mask labels, you need to create a model manifest file and contact customer solutions to manually establish the integration. The Labelbox solutions team can help you manage job queuing, track status, and process predictions on the Labelbox platform.
Create manifest files
To integrate your model into the Foundry workflow, you need to specify and provide a `model.yaml` manifest file. This file stores metadata about the model, including its name, description, inference parameters, model output ontology, API endpoint, and other details.
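The exact `model.yaml` schema is Labelbox-specific and isn't reproduced here; the sketch below only illustrates the kinds of fields listed above, and every field name in it is assumed for illustration.

```yaml
# Illustrative sketch only; field names are assumptions, not the official schema.
name: my-custom-model
description: Classifies customer reviews by sentiment
endpoint: https://models.example.com/predict  # HTTP endpoint hosting the model
ontology:
  classifications:
    - name: sentiment
      options: [positive, neutral, negative]
inference_params:
  temperature:
    type: float
    default: 0.2
```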
Endpoint requests for model tasks
Every time you use your integrated custom model to predict labels or run other tasks, Foundry sends a JSON request to your model's endpoint. The request payload provides the data row for prediction and includes the ontology and inference parameter values you selected. The request body contains the following fields (an illustrative example follows the list):
- `prompt`: Contains the current conversation with the model. For single-turn queries, it's a single instance. For multi-turn queries, it includes the conversation history and the latest request. Each `prompt` has a message structure with two properties: `role` and `parts`.
  - `role`: A string indicating the individual producing the message content. Possible values include:
    - `system`: Instructions to the model.
    - `user`: User-generated message sent by a real person.
    - `assistant`: Model-generated message, used to insert responses from the model during multi-turn conversations.
  - `parts`: A list of ordered parts that make up multi-part message content. It can contain the following segments of data:
    - `text`: Text prompt or code snippet.
    - `image`: Base64-encoded image.
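For illustration, a request body with a multi-part `prompt` might look like the sketch below. The exact envelope (including how the ontology and inference parameter values appear) is defined by the platform and is omitted here, and the structure of each part is assumed.

```json
{
  "prompt": [
    {
      "role": "system",
      "parts": [{ "text": "You are a labeling assistant." }]
    },
    {
      "role": "user",
      "parts": [
        { "text": "Describe the defect shown in this image." },
        { "image": "<base64-encoded image data>" }
      ]
    }
  ]
}
```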