# Custom model integration

> Describes how to set up and integrate a custom model so that it can be used with Foundry.

If you are on an enterprise plan, you can integrate custom models with [Foundry](/docs/foundry) to use them to predict labels, enrich data, and generate responses for evaluation purposes. To upgrade to the enterprise plan, please [contact sales](https://labelbox.com/sales/).

## Host custom models

Before integrating your custom model, deploy it to an HTTP endpoint that is reachable over the internet and accepts HTTP POST requests with a JSON payload. You can host it on your own infrastructure or through a model hosting vendor, such as [Vertex AI](https://cloud.google.com/vertex-ai/docs/general/deployment), [Databricks](https://docs.databricks.com/en/machine-learning/model-serving/create-manage-serving-endpoints.html), [Hugging Face](https://huggingface.co/inference-endpoints), [Replicate](https://replicate.com/docs/how-does-replicate-work#private-models), or [OpenAI](https://platform.openai.com/docs/guides/fine-tuning/use-a-fine-tuned-model).
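As a rough illustration of the kind of endpoint described above, here is a minimal sketch built only on the Python standard library. The `predict` function, the `InferenceHandler` class, the port, and the returned `sentiment` key are all hypothetical placeholders, and the sketch leaves out authentication, TLS, and error handling that a real deployment would need.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload: dict) -> dict:
    """Hypothetical stand-in for real model inference."""
    return {"sentiment": "positive"}


class InferenceHandler(BaseHTTPRequestHandler):
    """Accepts HTTP POST calls with a JSON payload and replies with JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "Request body must be valid JSON")
            return
        body = json.dumps(predict(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Blocks forever, serving requests on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

Calling `serve()` starts the server locally. Exposing it publicly and securing it depends on your hosting setup.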

## Create model integrations

Once you have a public HTTP endpoint for your custom model, you can create the integration:

1. On the [Models](https://app.labelbox.com/mea) page, click **Create** and select **Custom Model**.
2. Select the data type for the model.
3. Add custom model information, including:
   * **Name**: A unique identifier for the model.
   * **HTTP endpoint**: The URL of the HTTP endpoint hosting your model.
   * **Secret** (optional): The authentication token for secret-secured endpoints only.
   * **Description** (optional): The descriptive context of the model.
4. Click **Create model**.

On the **Settings** tab, you can review and edit the model information. You can add a rate limit and a Readme. To send data to your model for label prediction, click **+ Model run**. From there, you can define and [preview your model run](/docs/foundry-define-model-run), [view prediction and details](/docs/foundry-view-predictions), and [send predictions to Annotate](/docs/foundry-annotate-predictions).

<Warning>
  ### Bounding box and mask tasks not supported

  Currently, this model integration flow doesn't support tasks involving bounding box and mask annotations. To integrate a custom model for these tasks, see [Create model integrations for bounding box and mask tasks](#create-model-integrations-for-bounding-box-and-mask-tasks).
</Warning>

## Create model integrations for bounding box and mask tasks

For a custom model predicting bounding box and mask labels, you need to create a model manifest file and [contact customer solutions](https://labelbox.atlassian.net/servicedesk/customer/portal/2/group/3/create/214) to manually establish the integration. The Labelbox solutions team can help you manage the job queuing, track status, and process predictions using the Labelbox platform.

### Create manifest files

To integrate your model into the Foundry workflow, you need to specify and provide a `model.yaml` manifest file. This file stores metadata about the model, including its name, description, inference parameters, model output ontology, API endpoint, and other details. You need to create the `model.yaml` file in the following format:

<CodeGroup>
  ```yaml Example YAML manifest file theme={null}
  name: My custom model
  inference_endpoint: my_inference_endpoint # URL of the API endpoint where your service is deployed
  secrets: my_secret # Secret or API key used to authenticate with your endpoint
  requests_per_second: 0.1 # Your estimate of requests per second
  description: My awesome custom model for object recognition
  readme: | # optional readme in markdown format
    ### Intended Use
    Object recognition model on my custom classes.
    ### Limitations
    My custom model has limitations, such as ...
    ### Citation
    ...

  allowed_asset_types: [image] # list of allowed asset types. One or more of [image, text, video, html, conversational]
  allowed_feature_kinds: [text, radio, checklist] # list of allowed feature kinds. One or more of [text, radio, checklist, rectangle, raster-segmentation, named-entity, polygon, point, edge]

  # Only needed if your model has a predefined set of classes for classification or object detection. If your model is an LLM or takes any text input, you can remove this section.
  ontology:
    media_type: IMAGE # This example ontology has two classification classes and two object detection classes.
    classifications:
      - instructions: label
        name: label
        type: radio
        options:
          - label: tench
            value: tench
            position: 0
          - label: goldfish
            value: goldfish
            position: 1
    tools:
      - name: person
        tool: rectangle
      - name: bicycle
        tool: rectangle

  inference_params_json_schema: # hyperparameters configured in the app and passed to your API endpoint.
    properties: # Examples follow, each with different types and defaults.
      prompt:
        description: "Prompt to use for text generation"
        type: string
        default: ""
      confidence:
        description: object confidence threshold for detection
        type: number
        default: 0.25
        minimum: 0.0
        maximum: 1.0
      max_new_tokens:
        description: Maximum number of tokens to generate. Each word is generally 2-3 tokens.
        type: integer
        default: 1024
        minimum: 100
        maximum: 4096
      use_image_attachments:
        description: Set to true if model should also process datarow attachments.
        type: boolean
        default: False
    required: # Use to specify hyperparameters that must have values for each model run.
      - prompt

  max_tokens: 1024 # only relevant for LLMs; controls the maximum number of tokens
  ```
</CodeGroup>
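To make the `inference_params_json_schema` section more concrete, the sketch below shows one way an endpoint could apply the declared defaults and check the `minimum`, `maximum`, and `required` constraints on incoming values. This page does not describe how Foundry itself validates parameters, so the `resolve_inference_params` helper and its behavior are assumptions for illustration only. In particular, a required parameter is treated as satisfied when it has a default.

```python
# Python mirror of the inference_params_json_schema section in the example manifest.
SCHEMA = {
    "properties": {
        "prompt": {"type": "string", "default": ""},
        "confidence": {"type": "number", "default": 0.25, "minimum": 0.0, "maximum": 1.0},
        "max_new_tokens": {"type": "integer", "default": 1024, "minimum": 100, "maximum": 4096},
        "use_image_attachments": {"type": "boolean", "default": False},
    },
    "required": ["prompt"],
}


def resolve_inference_params(schema: dict, values: dict) -> dict:
    """Hypothetical helper: fill in defaults, then check range and required constraints."""
    resolved = {}
    for name, spec in schema.get("properties", {}).items():
        if name in values:
            value = values[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            continue  # no value supplied and no default declared
        if "minimum" in spec and value < spec["minimum"]:
            raise ValueError(f"{name}={value} is below the minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            raise ValueError(f"{name}={value} is above the maximum {spec['maximum']}")
        resolved[name] = value
    missing = [name for name in schema.get("required", []) if name not in resolved]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return resolved


params = resolve_inference_params(SCHEMA, {"prompt": "Describe the image", "confidence": 0.5})
```

Here `params` keeps the supplied `prompt` and `confidence` and picks up the defaults for `max_new_tokens` and `use_image_attachments`.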

## Endpoint requests for model tasks

Every time you use your integrated custom model to predict labels or run other tasks, Foundry sends a JSON request to your model's endpoint. The request payload provides the data row for prediction and includes the ontology and inference parameter values you selected. Here's an example request body:

<CodeGroup>
  ```json Example JSON request theme={null}
  {
   "prompt": [
     {
       "role": "system",
       "parts": [
         {
           "text": "Start each sentence with three equal signs ==="
         }
       ]
     },
     {
       "role": "user",
       "parts": [
         {
           "text": "what is in this text and image?"
         },
         {
           "text": "Hello. This is a user-provided txt file content."
         },
         {
           "image": "base64_encoded_image_string"
         }
       ]
     }
   ]
  }
  ```
</CodeGroup>

Here are descriptions of fields in the request body:

* `prompt`: Contains the current conversation with the model. For single-turn queries, it holds a single exchange. For multi-turn queries, it includes the conversation history and the latest request. Each entry in `prompt` is a message with two properties: `role` and `parts`.
* `role`: A string indicating who produced the message content. Possible values include:
  * `system`: Instructions to the model.
  * `user`: User-generated message sent by a real person.
  * `assistant`: Model-generated message, used to insert responses from the model during multi-turn conversations.
* `parts`: A list of ordered parts that make up a multi-part message content. It can contain the following segments of data:
  * `text`: Text prompt or code snippet.
  * `image`: Base64 encoded image.
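The field descriptions above can be turned into a small parsing routine on the endpoint side. The following is a sketch, assuming the request body has exactly the shape shown in the example. The `split_prompt` helper is hypothetical, and merging all messages that share a role is a simplification that drops turn order in multi-turn conversations.

```python
import base64


def split_prompt(request_body: dict) -> dict:
    """Hypothetical helper: group text and decoded image parts by message role."""
    grouped = {}
    for message in request_body.get("prompt", []):
        bucket = grouped.setdefault(message["role"], {"texts": [], "images": []})
        for part in message.get("parts", []):
            if "text" in part:
                bucket["texts"].append(part["text"])
            if "image" in part:
                # Images arrive as Base64 strings; decode them to raw bytes.
                bucket["images"].append(base64.b64decode(part["image"]))
    return grouped


# Example request that mirrors the structure above, with placeholder image bytes.
example_request = {
    "prompt": [
        {"role": "system", "parts": [{"text": "Start each sentence with three equal signs ==="}]},
        {
            "role": "user",
            "parts": [
                {"text": "what is in this text and image?"},
                {"image": base64.b64encode(b"fake-image-bytes").decode("ascii")},
            ],
        },
    ]
}

grouped = split_prompt(example_request)
```

After this call, `grouped["user"]["texts"]` holds the user text parts in order and `grouped["user"]["images"]` holds the decoded image bytes.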

### Response

Responses are expected to match the format of the labels predicted by the custom model, such as a string containing the raw model response or a JSON object for NER and classifications. Here are example JSON responses, one per task type, with keys corresponding to feature names in the ontology:

<CodeGroup>
  ```json JSON theme={null}
  // Object Detection
  {
    "cat": {
      // coordinate order: left, top, width, height
      "boxes": [[0, 0, 10, 10], [40, 40, 8, 10]],
      "scores": [0.9, 0.7]
    },
    "dog": {
      "boxes": [[20, 20, 5, 5]],
      "scores": [0.8]
    }
  }
  // Classification
  {
    "summary": "Tom and Bob are happy to work at IBM", // Free Text
    "sentiment": "positive", // Radio classification
    "emotion": ["joy", "fear"] // Checklist classification
  }
  // Segmentation
  {
    "cat": {
      // Can use pycocotools.mask.encode for RLE encoding
      "masks": [
        {
          "size": [<height>, <width>],
          "counts": "<run-length-encoded-boolean-mask>"
        }
      ]
    }
  }
  // Named Entity
  {
    "person": [
      {"start": 0, "end": 3, "text": "Tom"},
      {"start": 5, "end": 8, "text": "Bob"}
    ]
  }
  ```
</CodeGroup>
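Many detection models emit corner coordinates rather than the left, top, width, height order used in the example, so a conversion step is often needed before responding. The sketch below shows one way to assemble the object detection and named entity responses in Python. The `detection_response` and `entity_response` helpers and their input formats are hypothetical, and treating `end` as exclusive is inferred from the example values rather than stated on this page.

```python
def detection_response(detections: list) -> dict:
    """Build the object detection response.

    Each detection is assumed to be (feature_name, (x_min, y_min, x_max, y_max), score).
    """
    response = {}
    for name, (x_min, y_min, x_max, y_max), score in detections:
        entry = response.setdefault(name, {"boxes": [], "scores": []})
        # Convert corner coordinates to left, top, width, height.
        entry["boxes"].append([x_min, y_min, x_max - x_min, y_max - y_min])
        entry["scores"].append(score)
    return response


def entity_response(text: str, feature_name: str, mentions: list) -> dict:
    """Build the named entity response from the first occurrence of each mention."""
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start != -1:
            # "end" is exclusive here, so text[start:end] == mention (an assumption).
            spans.append({"start": start, "end": start + len(mention), "text": mention})
    return {feature_name: spans}


boxes = detection_response([
    ("cat", (0, 0, 10, 10), 0.9),
    ("cat", (40, 40, 48, 50), 0.7),
    ("dog", (20, 20, 25, 25), 0.8),
])
entities = entity_response("Tom, Bob and Alice", "person", ["Tom", "Bob"])
```

Both results are plain dictionaries, so they can be serialized with `json.dumps` and returned as the HTTP response body.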
