Prompt and response generation
Learn how to create fine-tuning datasets of prompts and responses for LLMs.
You can prepare a dataset of prompts and responses to fine-tune large language models (LLMs). Labelbox supports dataset creation for a variety of fine-tuning tasks including summarization, classification, question-answering, and generation.
Set up prompt and response generation
When you set up a prompt and response generation project, you choose one of the following three options for how your team will use the editor:
- Humans generate prompts and responses: In the editor, the prompt and response fields will be required. This will indicate to your team they should create a prompt and a response from scratch.
- Humans generate prompts: In the editor, only the prompt field will be required. This will indicate to your team that they should create a prompt from scratch.
- Humans generate responses to uploaded prompts: In the editor, a previously uploaded prompt will appear. Your team will need to create responses for that prompt.
Benchmark and consensus support
Benchmarks and Consensus are only available for the Humans generate responses to uploaded prompts option.
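The three workflow options and their requirements can be summarized as a small lookup table. The following is a minimal sketch for planning purposes only, not part of the Labelbox SDK; the names `Workflow`, `REQUIRED_FIELDS`, and `supports_benchmark_and_consensus` are hypothetical.

```python
from enum import Enum


class Workflow(Enum):
    """Hypothetical labels for the three setup options described above."""
    PROMPTS_AND_RESPONSES = "Humans generate prompts and responses"
    PROMPTS_ONLY = "Humans generate prompts"
    RESPONSES_TO_UPLOADED_PROMPTS = "Humans generate responses to uploaded prompts"


# Content labelers create in the editor for each option.
REQUIRED_FIELDS = {
    Workflow.PROMPTS_AND_RESPONSES: {"prompt", "response"},
    Workflow.PROMPTS_ONLY: {"prompt"},
    # The prompt is uploaded beforehand, so labelers only write responses.
    Workflow.RESPONSES_TO_UPLOADED_PROMPTS: {"response"},
}


def supports_benchmark_and_consensus(workflow: Workflow) -> bool:
    """Benchmarks and Consensus apply only to the uploaded-prompts option."""
    return workflow is Workflow.RESPONSES_TO_UPLOADED_PROMPTS
```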
Specify prompts and/or responses
When setting up the project, specify reference content based on the workflow you selected:
- If you select Humans generate prompts, you will need to specify a prompt for your labelers to reference so they can generate more prompts. Prompts are restricted to free-form text format. You can optionally set a character minimum and maximum for prompt data.
- If you select Humans generate prompts and responses, you will need to specify a prompt and a response for your labelers to reference so they can generate more prompts and responses. See Supported prompt formats and Supported response formats for supported format types.
Markdown editor size limit
When using the Markdown editor to specify a prompt or response, keep the content under 6,000 characters.
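If you draft prompts or responses outside the editor, a quick length check before pasting can catch oversized content. This is a minimal sketch; `MARKDOWN_CHAR_LIMIT` and `fits_markdown_editor` are hypothetical names, and it assumes the limit counts characters the way Python's `len` does, which may differ from how the editor counts.

```python
# Limit taken from the note above: content must be fewer than 6,000 characters.
MARKDOWN_CHAR_LIMIT = 6000


def fits_markdown_editor(text: str) -> bool:
    """Return True when the text is strictly under the character limit."""
    return len(text) < MARKDOWN_CHAR_LIMIT
```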
Import prompts
During project setup, if you select Humans generate responses to uploaded prompts as your LLM data generation workflow, you will need to create an import file containing links to a set of prompts in text format. Then, upload your import file and send the prompts to the project.
Follow these steps to upload prompt data to Labelbox:
- Create the import file containing links to the prompts. See Import text data to learn how to structure your import file.
- Upload the import file to Labelbox.
- Save the prompts in a batch and send the batch to your project. See Batches for instructions.
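The first step above, creating the import file, can be sketched in Python. This example assumes the import file is a JSON array with one object per prompt, using `row_data` for the link and `global_key` for a unique identifier; confirm the exact field names against Import text data. The URLs, the `prompt-` key prefix, and the helper names are placeholders invented for illustration.

```python
import json


def build_import_rows(prompt_urls):
    """Build one import entry per prompt link.

    Assumption: each entry carries the link in `row_data` and a unique
    identifier in `global_key`. Verify against the import documentation.
    """
    return [
        {"row_data": url, "global_key": f"prompt-{index}"}
        for index, url in enumerate(prompt_urls)
    ]


def to_import_json(rows) -> str:
    """Serialize the entries as a JSON array ready to save to a file."""
    return json.dumps(rows, indent=2)
```

To produce the file, write the result of `to_import_json(build_import_rows(urls))` to disk, then upload it and batch the resulting data rows as described in the remaining steps.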
Data row size limit
To view the maximum size allowed for a data row, visit our limits page.
Supported prompt formats
If you selected an LLM data generation workflow that involves generating a prompt in the editor, you will need to specify a prompt to use as the ontology. Each LLM data generation ontology is limited to one prompt.
Feature | Import format | Export format |
---|---|---|
Prompt - Free-form text | N/A | See payload |
Supported response formats
If you selected an LLM data generation workflow that involves generating a response in the editor, you will need to specify a set of responses to use as the ontology. Below are the supported formats you may include when you are specifying responses in your ontology. Responses can be applied at the global level and/or nested within other responses. LLM data generation ontologies support multiple responses.
Feature | Import format | Export format |
---|---|---|
Response - Text | N/A | See payload |
Response - Radio | N/A | See payload |
Response - Checklist | N/A | See payload |
Response - Text
Create a text response by selecting a response type of Text during ontology creation. You can optionally set a character minimum and maximum for text-type responses.
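To illustrate how an optional character minimum and maximum might constrain a text response, here is a minimal sketch. The function `within_character_limits` is hypothetical, and treating both bounds as inclusive is an assumption; the editor's actual behavior at the boundaries is not specified here.

```python
from typing import Optional


def within_character_limits(
    text: str,
    minimum: Optional[int] = None,
    maximum: Optional[int] = None,
) -> bool:
    """Check text length against optional bounds (assumed inclusive)."""
    length = len(text)
    if minimum is not None and length < minimum:
        return False
    if maximum is not None and length > maximum:
        return False
    return True
```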
Response - Radio
Create a radio response by selecting a response type of Radio during ontology creation. Radio responses support nested sub-classifications.
Response - Checklist
Create a checklist response by selecting a response type of Checklist during ontology creation. Checklist responses support nested sub-classifications.
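The ontology rules described in this section (at most one prompt, multiple responses of type Text, Radio, or Checklist, and responses nested within other responses) can be modeled as a small data structure. This is a hypothetical sketch for reasoning about ontology shape, not the Labelbox SDK; the class and field names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# The three response types listed in the table above.
RESPONSE_KINDS = {"text", "radio", "checklist"}


@dataclass
class Response:
    name: str
    kind: str
    # Responses nested within this response.
    nested: List["Response"] = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in RESPONSE_KINDS:
            raise ValueError(f"unsupported response kind: {self.kind}")


@dataclass
class Ontology:
    prompts: List[str] = field(default_factory=list)
    responses: List[Response] = field(default_factory=list)

    def __post_init__(self):
        # Each LLM data generation ontology is limited to one prompt.
        if len(self.prompts) > 1:
            raise ValueError("an ontology is limited to one prompt")
```

For example, `Ontology(prompts=["Write a question"], responses=[Response("quality", "radio", nested=[Response("reason", "text")])])` describes one prompt and a radio response with a nested text response.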