LLM Response Evaluation
Import conversation & model response data
How to import conversation text data & model response data.
Specifications
File format: JSON in our conversation format
Encoding: UTF-8 (Note: The Editor does not process any special character sequences such as HTML entities, Unicode escape sequences, or colon emoji aliases.)
Import methods:
- IAM Delegated Access
- Signed URLs (https URLs only)
When importing conversation or thread data to Labelbox, your JSON file should include the following information for each data row.
Parameter | Required | Description |
---|---|---|
row_data | Yes | HTTPS URL of a cloud-hosted conversational text JSON file. See the section below for details on our conversation format. For IAM Delegated Access, this URL must be in virtual-hosted-style format. For older regions, your S3 bucket may be in https://<bucket-name>.s3.<region>.amazonaws.com/<key> format. If your object URLs are formatted this way, make sure they are in the virtual-hosted-style format before importing. |
global_key | No | Unique user-generated file name or ID for the file. Global keys must be unique within your org. A data row will not be imported if its global key duplicates that of an existing data row. |
media_type | No | "CONVERSATIONAL" (optional media type to provide better validation and error messaging) |
metadata_fields | No | See Metadata. |
Import format
Python example
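The parameters in the table above can be assembled into an import payload as a list of dictionaries, one per data row. The following is a minimal sketch using only the standard library. The helper name `make_data_row`, the bucket URL, and the global key are hypothetical placeholders. Handing the resulting list to the Labelbox Python SDK (for example through `Dataset.create_data_rows`) is assumed rather than shown here, so check the SDK reference for the exact call.

```python
import json


def make_data_row(row_data, global_key=None, media_type="CONVERSATIONAL"):
    """Build one data row dict (hypothetical helper, not part of any SDK)."""
    # row_data is the only required field and must be an https URL.
    if not row_data.startswith("https://"):
        raise ValueError("row_data must be an https URL")
    row = {"row_data": row_data}
    # global_key and media_type are optional, so add them only when given.
    if global_key is not None:
        row["global_key"] = global_key
    if media_type is not None:
        row["media_type"] = media_type
    return row


# Hypothetical bucket and object key, for illustration only.
data_rows = [
    make_data_row(
        "https://example-bucket.s3.us-east-1.amazonaws.com/conversations/conv-001.json",
        global_key="conv-001",
    )
]

print(json.dumps(data_rows, indent=2))
```

Each dictionary points at one cloud-hosted conversation file; the conversation content itself lives in that file, not in the import payload.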
Model Response Conversation JSON format
Each data row should have its own cloud-hosted JSON file. Each JSON file should include the following.
Parameter | Required | Description |
---|---|---|
type | Yes | This should always be populated with application/vnd.labelbox.conversational |
version | Yes | This should be populated with 1 |
messages | Yes | This is the array of message objects that make up the conversation. See below for the fields that must be included for each message. The current maximum is 250 messages per conversation. |
modelOutputs | Yes | This is the array of model output objects that make up the model responses to the conversation. See the section below for the fields that must be included. |
Message object
Parameter | Required | Description |
---|---|---|
messageId | Yes | This is the external ID for the message. It must be unique within the conversation file and is returned in exports to help match annotations to messages. |
timestampUsec | No | The Unix epoch timestamp of the message. |
content | Yes | This field contains the text string of the message. It must be under 10,000 characters. |
user | Yes | This field is used to describe the person that is speaking. The user is defined by two fields: userId and name. The name field is used to show the user name in the editor. { "userId": "ID", "name": "Name" } |
align | No | This field controls the alignment and indentation of the text string in the editor. If left blank, it defaults to `left`. Options: `left`, `0-left-indent`, `1-left-indent`, `2-left-indent`, `3-left-indent`, `4-left-indent`, `5-left-indent`, `right`, `0-right-indent`, `1-right-indent`, `2-right-indent`, `3-right-indent`, `4-right-indent`, `5-right-indent` |
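The constraints in the message table can be checked locally before a file is uploaded. The sketch below is one possible pre-flight check, not Labelbox's own validation; the function name `check_messages` and the wording of the problem strings are assumptions made for illustration.

```python
# The 14 align values listed in the table above.
ALIGN_OPTIONS = {"left", "right"} | {
    f"{n}-{side}-indent" for n in range(6) for side in ("left", "right")
}
MAX_CONTENT_CHARS = 10000  # content must be under this many characters
MAX_MESSAGES = 250  # current maximum number of messages per conversation


def check_messages(messages):
    """Return a list of human-readable problems (hypothetical helper)."""
    problems = []
    if len(messages) > MAX_MESSAGES:
        problems.append(f"too many messages: {len(messages)}")
    seen_ids = set()
    for i, msg in enumerate(messages):
        # messageId, content, and user are required.
        for key in ("messageId", "content", "user"):
            if key not in msg:
                problems.append(f"message {i}: missing '{key}'")
        # messageId must be unique within the conversation file.
        msg_id = msg.get("messageId")
        if msg_id is not None:
            if msg_id in seen_ids:
                problems.append(f"message {i}: duplicate messageId '{msg_id}'")
            seen_ids.add(msg_id)
        if len(msg.get("content", "")) >= MAX_CONTENT_CHARS:
            problems.append(f"message {i}: content too long")
        # user is defined by two fields: userId and name.
        user = msg.get("user")
        if user is not None:
            if not isinstance(user, dict) or not {"userId", "name"} <= set(user):
                problems.append(f"message {i}: user needs userId and name")
        # align is optional and defaults to left.
        if msg.get("align", "left") not in ALIGN_OPTIONS:
            problems.append(f"message {i}: invalid align value")
    return problems
```

An empty return value means the messages satisfy the documented constraints; it does not guarantee that the import itself will succeed.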
Model outputs object
Parameter | Required | Description |
---|---|---|
title | Yes | Use this field to name the model response |
content | Yes | This field displays the content of the model output |
modelConfigName | Yes | Use this field to specify the corresponding model config name |
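Because all three fields of a model output are required, a small check can flag incomplete entries before upload. This is a sketch under the assumption that an empty string counts as missing; the helper names `make_model_output` and `missing_output_fields` are hypothetical.

```python
REQUIRED_OUTPUT_FIELDS = ("title", "content", "modelConfigName")


def make_model_output(title, content, model_config_name):
    """Build one model output dict with the three required fields."""
    return {
        "title": title,
        "content": content,
        "modelConfigName": model_config_name,
    }


def missing_output_fields(model_outputs):
    """Map each output's index to the required fields it lacks."""
    missing = {}
    for i, output in enumerate(model_outputs):
        # Treat absent and empty values alike (an assumption of this sketch).
        lacking = [f for f in REQUIRED_OUTPUT_FIELDS if not output.get(f)]
        if lacking:
            missing[i] = lacking
    return missing
```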
Sample Conversation JSON
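As an illustration of the format described above, the following sketch builds a conversation file in Python and serializes it to JSON. The `type` and `version` values come from the tables above; every message, user, title, and model config name is a made-up placeholder.

```python
import json

conversation = {
    "type": "application/vnd.labelbox.conversational",
    "version": 1,
    "messages": [
        {
            "messageId": "message-0",  # must be unique within the file
            "content": "What is the capital of France?",
            "user": {"userId": "user-1", "name": "User"},
            "align": "left",
        }
    ],
    "modelOutputs": [
        {
            "title": "Response A",
            "content": "The capital of France is Paris.",
            "modelConfigName": "model-config-a",
        },
        {
            "title": "Response B",
            "content": "Paris.",
            "modelConfigName": "model-config-b",
        },
    ],
}

# ensure_ascii=False keeps any non-ASCII text as literal UTF-8 characters
# instead of Unicode escape sequences, which the Editor does not process.
sample_json = json.dumps(conversation, indent=2, ensure_ascii=False)
print(sample_json)
```

The printed JSON is what would be saved, UTF-8 encoded, as the cloud-hosted file that a data row's `row_data` URL points to.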
Sample note
This sample contains the modelOutputs field, which is unique to importing data rows for LLM human preference.