# Import multimodal chat data

> How to import multimodal chat data and sample import formats.

You need to set up a multimodal chat evaluation project before importing data. The two types of multimodal chat evaluation projects have different project creation methods and data row setups:

* For **offline multimodal chat evaluation** projects, use `client.create_offline_model_evaluation_project` and import data rows of existing conversations.

* For **live multimodal chat evaluation** projects, use `client.create_model_evaluation_project` and either:

  * (Recommended) Create data rows and send them to the project, as you would for other project types.
  * Generate empty data rows when you create the project. Data rows generated this way can't include attachments or metadata.

For a full walk-through of setting up a multimodal chat evaluation project, see [Multimodal chat evaluation](/docs/multimodal-chat-evaluation-editor).

## Set up live multimodal chat evaluation projects

Use `client.create_model_evaluation_project` to create a live multimodal chat evaluation project. This method takes the same parameters as the traditional `client.create_project`, with a few additional parameters specific to multimodal chat evaluation projects.

The `client.create_model_evaluation_project` method accepts the following parameters:

* `name`: The name of your new project.

* `description`: An optional description of your project.

* `dataset_name` (optional): The name of the dataset where the generated data rows will be located. Include this parameter only if you want to create a new dataset.

* `dataset_id` (optional): The ID of an existing Labelbox dataset. Include this parameter if you want to append the generated data rows to an existing dataset.

* `data_row_count` (optional): The number of data row assets that will be generated and used with your project. Defaults to 100 if a `dataset_name` or `dataset_id` is included.
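
The parameter list above can be turned into a small helper that collects only the options you actually set. This is a minimal sketch, not part of the SDK: `build_project_kwargs` is a hypothetical name, and treating `dataset_name` and `dataset_id` as mutually exclusive is an assumption based on the descriptions above.

```python
from typing import Any, Dict, Optional


def build_project_kwargs(
    name: str,
    description: Optional[str] = None,
    dataset_name: Optional[str] = None,
    dataset_id: Optional[str] = None,
    data_row_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect only the arguments that were actually provided."""
    # Assumption: dataset_name (new dataset) and dataset_id (existing
    # dataset) are alternatives, so passing both is treated as an error.
    if dataset_name is not None and dataset_id is not None:
        raise ValueError("Pass either dataset_name or dataset_id, not both.")

    candidates = {
        "name": name,
        "description": description,
        "dataset_name": dataset_name,
        "dataset_id": dataset_id,
        "data_row_count": data_row_count,
    }
    # Leave unset options out so the SDK can apply its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}
```

You could then pass the result along, for example `client.create_model_evaluation_project(**build_project_kwargs(name="Example project"))`.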

### Option A: Create and send data rows to projects

<CodeGroup>
  ```python Python expandable theme={null}
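  import labelbox as lb

  # Assumes a valid Labelbox API key
  client = lb.Client(api_key="<your_api_key>")

  # Create the project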
  project = client.create_model_evaluation_project(
      name="Example live multimodal chat project",
      description="<project_description>",  # optional
  )

  def make_data_rows(dataset_id=None):
      # If a dataset ID is provided, fetch the dataset using that ID.
      # Otherwise, create a new dataset with the specified name.
      if dataset_id:
          dataset = client.get_dataset(dataset_id)
      else:
          dataset = client.create_dataset(name="example live mmc dataset")

      # Helper function to generate a single data row
      def generate_data(ind):
          return {
              "row_data": {  # The chat data format
                  'type': 'application/vnd.labelbox.conversational.model-chat-evaluation',
                  'draft': True,
                  'rootMessageIds': [],
                  'actors': {},
                  'version': 2,
                  'messages': {}
              },
              "global_key": f"global_key_{dataset.uid}_{ind}",
              "metadata_fields": [{"name": "tag", "value": "val_tag"}],
              "attachments": [
                  {
                      "type": "IMAGE_OVERLAY",
                      "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/rgb.jpg"
                  }
              ]
          }

      # Generate a list of 100 data rows
      data_list = [generate_data(ind) for ind in range(100)]

      # Upload the generated data rows to the dataset
      task = dataset.create_data_rows(data_list)
      print("Processing task ", task.uid)  # Print the unique ID of the task
      task.wait_till_done()

      # Ensure that the task status is 'COMPLETE' to confirm success
      assert task.status == "COMPLETE"

      # Return the dataset object
      return dataset

  # Create a new dataset. Alternatively, pass an existing dataset ID
  dataset = make_data_rows()

  # Retrieve the data row IDs from the dataset
  data_row_ids = [data_row.uid for data_row in dataset.data_rows()]

  # Send data rows to the project
  batch = project.create_batch(
      name="mmc-batch",  # each batch in a project must have a unique name
      data_rows=data_row_ids, # data row IDs to include in the batch
      priority=1  # priority between 1(highest) - 5(lowest)
  )

  print(f"Batch: {batch}")
  ```
</CodeGroup>

### Option B: Generate empty data rows

<Info>
  ### No metadata support

  Only use this option if your project doesn't require metadata, attachments, or embeddings for data rows.
</Info>

<CodeGroup>
  ```python Python theme={null}
  # Create the project and generate data rows
  project = client.create_model_evaluation_project(
      name="Example live multimodal chat project",
      description="<project_description>",  # optional
      dataset_name="Example live multimodal chat dataset",
      data_row_count=100,
  )

  # Connect the project to an ontology
  # (assumes `ontology` was created or fetched earlier)
  project.connect_ontology(ontology)
  ```
</CodeGroup>

## Set up offline multimodal chat evaluation projects

Use `client.create_offline_model_evaluation_project` to create offline multimodal chat evaluation projects. This method uses the same parameters as `client.create_project` and adds validation to ensure the project is set up correctly.

<CodeGroup>
  ```python Python theme={null}
  project = client.create_offline_model_evaluation_project(
      name="<project_name>",
      description="<project_description>",  # optional
  )
  ```
</CodeGroup>

After creating the project, you can import conversational version 2 data rows to the project. To learn how to import annotations, see [Import multimodal chat annotations](/reference/import-multimodal-chat-annotations).

## Specifications

File format: chat data JSON in [conversation v2 format](#sample-conversation-v2-json)

Import methods:

* Local upload (maximum character count: 2,621,440)
* IAM Delegated Access
* Signed URLs (`https` URLs only)
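
Before a local upload, you may want to check the conversation against the character limit listed above. The following is a minimal sketch that assumes the limit applies to the serialized JSON string; `fits_local_upload` is a hypothetical helper, not an SDK function.

```python
import json

# Limit listed for local uploads in the import methods above.
MAX_LOCAL_UPLOAD_CHARS = 2_621_440


def fits_local_upload(row_data: dict) -> bool:
    """Return True if the serialized conversation stays within the limit."""
    # Assumption: the limit is measured on the JSON text of row_data.
    serialized = json.dumps(row_data, ensure_ascii=False)
    return len(serialized) <= MAX_LOCAL_UPLOAD_CHARS
```

If the check fails, hosting the JSON file in cloud storage and passing its `https` URL as `row_data` is the alternative described on this page.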

When importing conversation or thread data to Labelbox, include the following information for each data row in your JSON file.

| Parameter         | Required | Description                                                                                                                                                                                                                         |
| ----------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `row_data`        | Yes      | `https` path to a cloud-hosted conversational text JSON file, or the conversation JSON object itself when uploading locally. See [Conversation v2 JSON](#conversation-v2-json) for details on the conversation format.             |
| `global_key`      | No       | Unique user-generated file name or ID for the file. [Global keys](/reference/data-row-global-keys) must be unique within your organization. Data rows are not imported if their global keys duplicate those of existing data rows. |
| `media_type`      | No       | `"CONVERSATIONAL"` (optional media type to provide better validation and error messaging)                                                                                                                                           |
| `metadata_fields` | No       | See [metadata](/docs/datarow-metadata)                                                                                                                                                                                              |
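
As a rough illustration of the table above, the sketch below assembles one data row dictionary and leaves out optional fields that were not provided. `build_data_row` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Dict, List, Optional, Union


def build_data_row(
    row_data: Union[str, Dict[str, Any]],
    global_key: Optional[str] = None,
    metadata_fields: Optional[List[Dict[str, Any]]] = None,
    include_media_type: bool = False,
) -> Dict[str, Any]:
    """Build a data row payload with only the fields that were set."""
    # When row_data is a URL, it must be an https path.
    if isinstance(row_data, str) and not row_data.startswith("https://"):
        raise ValueError("row_data URLs must use https.")

    data_row: Dict[str, Any] = {"row_data": row_data}
    if global_key is not None:
        data_row["global_key"] = global_key
    if include_media_type:
        # Optional; enables better validation and error messaging.
        data_row["media_type"] = "CONVERSATIONAL"
    if metadata_fields:
        data_row["metadata_fields"] = metadata_fields
    return data_row
```

A list of such dictionaries is what `dataset.create_data_rows` receives in the Python examples on this page.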

## Import format

<CodeGroup>
  ```json Local upload expandable theme={null}
  {
    "row_data": {
      "type": "application/vnd.labelbox.conversational.model-chat-evaluation",
      "version": 2,
      "actors": {
        "cm1qu8krf00063b72cutnbn5l": {
          "role": "human",
          "metadata": { "name": "User" }
        },
        "cm1vjleif00023b6y4fw4ew94": {
          "role": "model",
          "metadata": {
            "modelConfigName": "Gem Pro-Copy"
          }
        },
        "cm1vjleif00033b6yifzroser": {
          "role": "model",
          "metadata": {
            "modelConfigName": "gpt 4-Copy"
          }
        }
      },
      "messages": {
        "cm1qu8krf00073b72fyar00vh": {
          "actorId": "cm1qu8krf00063b72cutnbn5l",
          "content": [{ "type": "text", "content": "Hello " }],
          "childMessageIds": [
            "cm1vjlitg00043b6y1tgssq1r",
            "cm1vjlitg00053b6y19ve1qra"
          ]
        },
        "cm1vjlitg00043b6y1tgssq1r": {
          "actorId": "cm1vjleif00023b6y4fw4ew94",
          "content": [
            {
              "type": "text",
              "content": "Hello! 👋 How can I assist you today? 😊 \\n"
            }
          ],
          "childMessageIds": []
        },
        "cm1vjlitg00053b6y19ve1qra": {
          "actorId": "cm1vjleif00033b6yifzroser",
          "content": [
            { "type": "text", "content": "Hi! How can I assist you today?" }
          ],
          "childMessageIds": []
        }
      },
      "rootMessageIds": ["cm1qu8krf00073b72fyar00vh"]
    },
    "global_key": "global_key"
  }
  ```

  ```json Cloud storage theme={null}
  [
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_1.json",
          "global_key": "global_key_1"
      },
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_2.json",
          "global_key": "global_key_2"
      },
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_3.json",
          "global_key": "global_key_3"
      }
  ]
  ```
</CodeGroup>

## Python example

<CodeGroup>
  ```python Local upload expandable theme={null}
  # Embed the chat conversation data
  row_data = {
      "type": "application/vnd.labelbox.conversational.model-chat-evaluation",
      "version": 2,
      "actors": {
          "cm1qu8krf00063b72cutnbn5l": {
              "role": "human",
              "metadata": { "name": "User" }
          },
          "cm1vjleif00023b6y4fw4ew94": {
              "role": "model",
              "metadata": {
                  "modelConfigName": "Gem Pro-Copy"
              }
          },
          "cm1vjleif00033b6yifzroser": {
              "role": "model",
              "metadata": {
                  "modelConfigName": "gpt 4-Copy"
              }
          }
      },
      "messages": {
          "cm1qu8krf00073b72fyar00vh": {
              "actorId": "cm1qu8krf00063b72cutnbn5l",
              "content": [{ "type": "text", "content": "Hello " }],
              "childMessageIds": [
                  "cm1vjlitg00043b6y1tgssq1r",
                  "cm1vjlitg00053b6y19ve1qra"
              ]
          },
          "cm1vjlitg00043b6y1tgssq1r": {
              "actorId": "cm1vjleif00023b6y4fw4ew94",
              "content": [
                  {
                      "type": "text",
                      "content": "Hello! 👋 How can I assist you today? 😊 \\n"
                  }
              ],
              "childMessageIds": []
          },
          "cm1vjlitg00053b6y19ve1qra": {
              "actorId": "cm1vjleif00033b6yifzroser",
              "content": [
                  { "type": "text", "content": "Hi! How can I assist you today?" }
              ],
              "childMessageIds": []
          }
      },
      "rootMessageIds": ["cm1qu8krf00073b72fyar00vh"]
  }

  # Create a dataset
  dataset = client.create_dataset(
      name="mmc_dataset",
  )

  # Upload the conversation data to the dataset as a data row.
  task = dataset.create_data_rows([{"row_data": row_data}])
  task.wait_till_done()

  # Output any errors that occurred during the import.
  print("Errors:", task.errors)
  print("Failed data rows:", task.failed_data_rows)
  ```

  ```python Cloud storage theme={null}
  import uuid

  # Generate dummy global keys
  global_key_1 = str(uuid.uuid4())
  global_key_2 = str(uuid.uuid4())
  global_key_3 = str(uuid.uuid4())

  # Create a dataset
  dataset = client.create_dataset(
      name="pairwise_demo_"+str(uuid.uuid4()),
      iam_integration=None
  )
  # Upload data rows
  task = dataset.create_data_rows([
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_1.json",
          "global_key": global_key_1
      },
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_2.json",
          "global_key": global_key_2
      },
      {
          "row_data": "https://storage.googleapis.com/labelbox-datasets/conversational-sample-data/pairwise_shopping_3.json",
          "global_key": global_key_3
      }
  ])
  task.wait_till_done()
  print("Errors:", task.errors)
  print("Failed data rows:", task.failed_data_rows)
  ```
</CodeGroup>

## Conversation v2 JSON

| Parameter        | Required | Description                                                                                  |
| ---------------- | -------- | -------------------------------------------------------------------------------------------- |
| `type`           | Yes      | Populate with `application/vnd.labelbox.conversational.model-chat-evaluation`                |
| `version`        | Yes      | Populate with `2`                                                                            |
| `actors`         | Yes      | An object of [actors](#actor-object) of the chat conversation.                               |
| `messages`       | Yes      | An object of messages from each actor.                                                       |
| `rootMessageIds` | Yes      | An array of message IDs that start the conversation. Include the ID of the first message from a human actor. |
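
The required fields above can be checked before upload. The sketch below is one possible pre-flight check under the rules in this table; `find_conversation_problems` is a hypothetical helper, and the checks it performs are not necessarily the same as the platform's own validation.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("type", "version", "actors", "messages", "rootMessageIds")
CONVERSATION_TYPE = "application/vnd.labelbox.conversational.model-chat-evaluation"


def find_conversation_problems(conversation: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [
        f"missing key: {key}" for key in REQUIRED_KEYS if key not in conversation
    ]
    if problems:
        return problems

    if conversation["type"] != CONVERSATION_TYPE:
        problems.append("unexpected type")
    if conversation["version"] != 2:
        problems.append("version must be 2")

    # Every root message ID should point at an entry in "messages".
    for root_id in conversation["rootMessageIds"]:
        if root_id not in conversation["messages"]:
            problems.append(f"unknown root message: {root_id}")
    return problems
```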

### Actor object

Each actor object is keyed by a unique, user-provided ID.

Each actor object has a `role` key and a `metadata` key. The metadata contains the specifics of the actor and will vary depending on the actor's role.

| Parameter         | Required | Description                                                                                                                    |
| ----------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `role`            | Yes      | The role the actor receives. Either `human` or `model`.                                                                        |
| `name`            | No       | The name of the actor. This is applicable and required for actors with the human role. Placed inside the `metadata` actor key. |
| `modelConfigName` | Yes      | The model config name of the actor. This is required for actors with the model role. Placed inside the `metadata` actor key.   |
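
To make the two actor shapes concrete, the sketch below builds one human actor and two model actors. The function names and actor IDs are illustrative only; the keys and values follow the table above.

```python
from typing import Any, Dict


def human_actor(name: str) -> Dict[str, Any]:
    """Actor with the human role; the name lives inside metadata."""
    return {"role": "human", "metadata": {"name": name}}


def model_actor(model_config_name: str) -> Dict[str, Any]:
    """Actor with the model role; the config name lives inside metadata."""
    return {"role": "model", "metadata": {"modelConfigName": model_config_name}}


# The "actors" object maps each unique actor ID to its actor object.
actors = {
    "actor1": human_actor("User"),
    "actor2": model_actor("Model 1"),
    "actor3": model_actor("Model 2"),
}
```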

### Message object

Each message object is keyed by a unique, user-provided ID.

| Parameter         | Required | Description                                                                                                                                                                                                     |
| ----------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `actorId`         | Yes      | The id of the actor who produced the message.                                                                                                                                                                   |
| `content`         | Yes      | An array of content for the message. See [message content](#message-content).                                                                                                                                   |
| `childMessageIds` | No       | An array of IDs of messages that are children of this message, typically the next messages in the conversation. To compare more than one model response, include multiple message IDs. |
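
Because messages reference each other through `childMessageIds`, a conversation forms a graph that starts at `rootMessageIds`. The sketch below walks that graph breadth-first and visits each message once, which can help confirm that every message is reachable. `walk_messages` is a hypothetical helper, and the message IDs in the usage note are made up.

```python
from collections import deque
from typing import Any, Dict, List, Tuple


def walk_messages(
    messages: Dict[str, Dict[str, Any]], root_ids: List[str]
) -> List[Tuple[str, int]]:
    """Return (message_id, depth) pairs in breadth-first order."""
    order: List[Tuple[str, int]] = []
    seen = set()
    queue = deque((root_id, 0) for root_id in root_ids)
    while queue:
        message_id, depth = queue.popleft()
        if message_id in seen:
            # Several responses can share the same follow-up message.
            continue
        seen.add(message_id)
        order.append((message_id, depth))
        # childMessageIds is optional, so default to an empty list.
        for child_id in messages[message_id].get("childMessageIds", []):
            queue.append((child_id, depth + 1))
    return order
```

For a prompt `m1` with two model responses `m2` and `m3` that both lead to a follow-up prompt `m4`, the walk visits `m1`, then `m2` and `m3`, then `m4` a single time.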

#### Message content

| Parameter        | Required | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| ---------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`           | Yes      | The type of message. This will be `fileData` for attachments, `text` for raw text, and `dataRowAttachment` for attachments on data rows.                                                                                                                                                                                                                                                                                                                                                                         |
| `content`        | No       | The raw text content of your message. This field supports markdown. This field is used for `text` type messages.                                                                                                                                                                                                                                                                                                                                                                                                 |
| `fileUri`        | No       | `https` path to a public cloud-hosted attachment file. This field is used for `fileData` type messages. If you want to use [IAM delegated access](/docs/iam-delegated-access) to store conversation files, you should first add them as data row attachments. See [attachments](/reference/attachments) on how to add an attachment to a data row. After you add your attachments to your data row, you can use the `type` and `attachmentName` keys to include your attachment inside your conversational data. |
| `attachmentName` | No       | The name of the attachment on the data row.                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `mimeType`       | No       | The MIME type of the file at `fileUri`. Supported types: `video/mp4`, `image/png`, and `application/pdf`. |
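
The three content types can be produced by small builder functions. This is a sketch under the rules in the table above: the function names are hypothetical, and the checks for `https` and for supported MIME types mirror the table and are not SDK behavior.

```python
from typing import Dict

SUPPORTED_MIME_TYPES = {"video/mp4", "image/png", "application/pdf"}


def text_content(text: str) -> Dict[str, str]:
    """Raw text content; markdown is supported."""
    return {"type": "text", "content": text}


def file_content(file_uri: str, mime_type: str) -> Dict[str, str]:
    """Attachment referenced by a public https URL."""
    if not file_uri.startswith("https://"):
        raise ValueError("fileUri must be an https path.")
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported mimeType: {mime_type}")
    return {"type": "fileData", "fileUri": file_uri, "mimeType": mime_type}


def attachment_content(attachment_name: str) -> Dict[str, str]:
    """Reference to an attachment that already exists on the data row."""
    return {"type": "dataRowAttachment", "attachmentName": attachment_name}
```

A message's `content` array is then a list of these items, for example `[text_content("What do you see?"), file_content("https://link-to-my-image", "image/png")]`.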

#### Embed images

You can either embed images directly in the message `content` or add them as attachments.\
For `dataRowAttachment`, the value of `attachmentName` must exist in the `attachments` section.

<CodeGroup>
  ```json Embedded image content theme={null}
  // Message
  {
    "actorId": "",
    "childMessageIds": [],
    "content": [
       {
          "type": "text",
          "content": "<img title='<model_name>' alt='<model_name>' src='<image_url>'>"
       }
    ]
  }
  ```

  ```json Image attachment theme={null}
  // Message
  {
    "actorId": "",
    "childMessageIds": [],
    "content": [
       {
          "type": "text",
          "content": "What do you see in this image?"
       },
       {
          "type": "fileData",
          "fileUri": "https://link-to-my-image",
          "mimeType": "image/png"
       },
       {
          "type": "dataRowAttachment",
          "attachmentName": "attachment_name_1"
       }
    ]
  }
  ```
</CodeGroup>
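
Since every `dataRowAttachment` must point at an existing attachment, it can be useful to cross-check names before import. The sketch below assumes each entry in the data row's `attachments` list carries a `name` field, which is an assumption not shown in the examples on this page; `missing_attachment_names` is a hypothetical helper.

```python
from typing import Any, Dict, List


def missing_attachment_names(data_row: Dict[str, Any]) -> List[str]:
    """List attachmentName values that have no matching data row attachment."""
    # Assumption: attachments look like {"type": ..., "value": ..., "name": ...}.
    available = {
        attachment.get("name") for attachment in data_row.get("attachments", [])
    }
    missing = []
    for message in data_row["row_data"]["messages"].values():
        for item in message["content"]:
            if (
                item["type"] == "dataRowAttachment"
                and item["attachmentName"] not in available
            ):
                missing.append(item["attachmentName"])
    return missing
```

An empty result means every referenced attachment name was found on the data row.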

## Sample conversation v2 JSON

<Info>
  ### SDK import only

  You can't upload the following file from the web interface directly. You must use an import file as described in [Import format](#import-format).
</Info>

<CodeGroup>
  ```json Sample conversation v2 json expandable theme={null}
  {
    "type": "application/vnd.labelbox.conversational.model-chat-evaluation",
    "version": 2,
    "actors": {
        "actor1": {
            "role": "human",
            "metadata": {
                "name": "User"
            }
        },
        "actor2": {
            "role": "model",
            "metadata": {
                "modelConfigName": "Model 1"
            }
        },
        "actor3": {
            "role": "model",
            "metadata": {
                "modelConfigName": "Model 2"
            }
        }
    },
    "messages": {
        "clxcboi1e00053p6n0ya733nn": {
            "actorId": "actor1",
            "content": [
                {
                    "type": "text",
                    "content": "What's in the images?"
                },
                {
                    "type": "fileData",
                    "fileUri": "https://storage.googleapis.com/labelbox-developer-testing-assets/image/giraffe-combunetes.png",
                    "mimeType": "image/png"
                },
                {
                    "type": "fileData",
                    "fileUri": "https://storage.googleapis.com/labelbox-developer-testing-assets/image/label-blocks.png",
                    "mimeType": "image/png"
                },
                {
                    "type": "fileData",
                    "fileUri": "https://storage.googleapis.com/labelbox-developer-testing-assets/image/tim-test.png",
                    "mimeType": "image/png"
                }
            ],
            "childMessageIds": [
                "clxcboue900083p6no6emql83",
                "clxcboue900093p6nrepe8jjd"
            ]
        },
        "clxcboue900083p6no6emql83": {
            "actorId": "actor2",
            "content": [
                {
                    "type": "text",
                    "content": "The images show:\n 1. **A giraffe in an office:**  The giraffe is standing in front of a desk with a computer showing code on the screen. There's also a whiteboard with diagrams about Kubernetes."
                }
            ],
            "childMessageIds": [
                "clxmrt0hh00023p6qykkdaqtk"
            ]
        },
        "clxcboue900093p6nrepe8jjd": {
            "actorId": "actor3",
            "content": [
                {
                    "type": "text",
                    "content": "1. The first image shows a giraffe standing in an office environment, interacting with a computer. The giraffe is looking at a large monitor displaying code, and there is a whiteboard in the background with various diagrams and notes.\n2."
                }
            ],
            "childMessageIds": [
                "clxmrt0hh00023p6qykkdaqtk"
            ]
        },
        "clxmrt0hh00023p6qykkdaqtk": {
            "actorId": "actor1",
            "content": [
                {
                    "type": "text",
                    "content": "What's in this PDF file?"
                },
                {
                    "type": "fileData",
                    "fileUri": "https://storage.googleapis.com/labelbox-developer-testing-assets/pdf/data_files/south-platte-tree-map.pdf",
                    "mimeType": "application/pdf"
                }
            ],
            "childMessageIds": [
                "clxmrtgxg00033p6qqzl2596o",
                "clxmrtgxg00043p6qiehsvww4"
            ]
        },
        "clxmrtgxg00033p6qqzl2596o": {
            "actorId": "actor2",
            "content": [
                {
                    "type": "text",
                    "content": "This PDF file contains directions and maps for Christmas tree cutting areas managed by the South Platte Ranger District.  Here's a breakdown:\n**Page 1: Directions**\n* Provides written directions to three Christmas tree cutting areas:\n    **Buffalo Creek:** Located southeast of Pine Junction.\n    **Camp Fickes:** Located west of Buffalo Creek.\n    **Sugar Creek:** Located along County Road\n"
                }
            ],
            "childMessageIds": [
                "clxmru9j600053p6q0qh89zm4"
            ]
        },
        "clxmrtgxg00043p6qiehsvww4": {
            "actorId": "actor3",
            "content": [
                {
                    "type": "text",
                    "content": "Via fugit referre [duasque longumque](http://luminavale.com/) fateri sumite\ncalidumque arma spatiis fuerit genialiter errore iacent; cuncta hausit memori.\nAestus a omnia nomenque inlimis captantur ipsumque fuga. Aeneadae dona tenero\nclipei tamen, sed de amor flagellari quas; corpore, grande.\n[Pectore inclinatcadunt](http://tardoset.com/uni-et.html), Hectoreis defensatque virga altera\nsecum caliturasque militia pennas."
                }
            ],
            "childMessageIds": [
                "clxmru9j600053p6q0qh89zm4"
            ]
        },
        "clxmru9j600053p6q0qh89zm4": {
            "actorId": "actor1",
            "content": [
                {
                    "type": "text",
                    "content": "What have astronauts brought back from the moon?"
                }
            ],
            "childMessageIds": [
                "clxmrupyh00063p6q4wxj97sz",
                "clxmrupyh00073p6qeszn06l7"
            ]
        },
        "clxmrupyh00063p6q4wxj97sz": {
            "actorId": "actor2",
            "content": [
                {
                    "type": "text",
                    "content": "## Petebat semine\nDiurnis parsque, tanti nuper novi, extremae caede *Psophidaque spiro* dum visa.\nUsu dicebat obstet meritos."
                }
            ],
            "childMessageIds": []
        },
        "clxmrupyh00073p6qeszn06l7": {
            "actorId": "actor3",
            "content": [
                {
                    "type": "text",
                    "content": "## Ossa custos captabat insanis humus Cipe temptatum\nLorem markdownum adflatuque est Tydides medios. Notatas te Pandrose **solent**\npartes saucius animal certamen, plures opem corpora. Est magni duce, illiarcus: Iuno atque aderat amplexo genusque."
                }
            ],
            "childMessageIds": []
        }
    },
    "rootMessageIds": [
        "clxcboi1e00053p6n0ya733nn"
    ]
  }
</CodeGroup>

<Info>
  ### LaTeX support

  To add LaTeX formatting, wrap your math expressions using backticks and dollar signs. The editor supports both inline and block LaTeX formatting. For example, to add LaTeX formatting for `x=2`, put `$$x = 2$$`.
</Info>
