Prompt and response generation

Learn how to create fine-tuning datasets of prompts and responses for LLMs.

You can prepare a dataset of prompts and responses to fine-tune large language models (LLMs). Labelbox supports dataset creation for a variety of fine-tuning tasks including summarization, classification, question-answering, and generation.

Fine-tuning is useful when an LLM needs to learn something specific outside of the data it was trained on. In this case, the model is being fine-tuned to extract product SKUs from reviews.

Set up prompt and response generation

When you set up a prompt and response generation project, you are asked to choose one of the following three ways to use the editor:

  • Humans generate prompts and responses: In the editor, the prompt and response fields will be required. This will indicate to your team that they should create a prompt and a response from scratch.
  • Humans generate prompts: In the editor, only the prompt field will be required. This will indicate to your team that they should create a prompt from scratch.
  • Humans generate responses to uploaded prompts: In the editor, a previously uploaded prompt will appear. Your team will need to create responses for that prompt.

📘

Benchmark and consensus support

Benchmarks and Consensus are available only for the Humans generate responses to uploaded prompts option.

Specify prompts and/or responses

When setting up the project, specify the reference content that matches the workflow you selected:

  • If you select Humans generate prompts, you will need to specify a prompt for your labelers to reference so they can generate more prompts. Prompts are restricted to free-form text format. You can optionally set a character minimum and maximum for prompt data.

  • If you select Humans generate prompts and responses, you will need to specify a prompt and a response for your labelers to reference so they can generate more prompts and responses. See Supported prompt formats and Supported response formats for supported format types.

🚧

Markdown editor size limit

When using the Markdown editor to specify a prompt or response, keep the text under 6,000 characters.
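
The character limits above can be checked before you paste text into the editor. The following is a minimal sketch, not part of Labelbox: the function names are hypothetical, it assumes one Python character counts as one character in the editor, and it assumes the optional minimum and maximum are inclusive.

```python
from typing import Optional

# Documented limit: Markdown editor text must be fewer than 6,000 characters.
MARKDOWN_CHAR_LIMIT = 6000


def fits_markdown_editor(text: str) -> bool:
    """Return True if the text is strictly under the Markdown editor limit."""
    # Assumption: one Python character (code point) counts as one character.
    return len(text) < MARKDOWN_CHAR_LIMIT


def within_char_bounds(text: str,
                       min_chars: Optional[int] = None,
                       max_chars: Optional[int] = None) -> bool:
    """Check text against an optional character minimum and maximum.

    Assumption: both bounds are inclusive; the editor may treat them differently.
    """
    length = len(text)
    if min_chars is not None and length < min_chars:
        return False
    if max_chars is not None and length > max_chars:
        return False
    return True
```

For example, fits_markdown_editor(reference_prompt) returns False for any text of 6,000 characters or more, which signals that the reference prompt should be shortened first.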

Import prompts

During project setup, if you select Humans generate responses to uploaded prompts as your LLM data generation workflow, you will need to create an import file containing links to a set of prompts in text format. Then, upload your import file and send the prompts to the project.

Follow these steps to upload prompt data to Labelbox:

  1. Create the import file containing links to the prompts. See Import text data to learn how to structure your import file.
  2. Upload the import file to Labelbox.
  3. Save the prompts in a batch and send the batch to your project. See Batches for instructions.
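
Step 1 can be scripted. The sketch below builds an import file from a list of prompt links using only the Python standard library. It assumes the import file is a JSON array in which each entry has a row_data key (the link to the prompt text) and a global_key key (a unique ID you choose); confirm the exact structure in Import text data. The function names, the example.com links, and the prompt-0001 key pattern are hypothetical.

```python
import json
from typing import Dict, List


def build_import_rows(prompt_urls: List[str]) -> List[Dict[str, str]]:
    """Create one import entry per prompt link.

    Assumption: each entry needs a "row_data" link and a unique "global_key".
    """
    rows = []
    for index, url in enumerate(prompt_urls, start=1):
        rows.append({
            "row_data": url,                      # link to the prompt text file
            "global_key": f"prompt-{index:04d}",  # hypothetical naming scheme
        })
    return rows


def to_import_json(rows: List[Dict[str, str]]) -> str:
    """Serialize the entries as the JSON text of an import file."""
    return json.dumps(rows, indent=2)


# Example with placeholder links; replace them with links to your own prompts.
sample_rows = build_import_rows([
    "https://example.com/prompts/review-001.txt",
    "https://example.com/prompts/review-002.txt",
])
sample_import_file = to_import_json(sample_rows)
```

Save the returned text to a .json file, then continue with steps 2 and 3 to upload it and send the batch to your project.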

📘

Data row size limit

To view the maximum size allowed for a data row, visit our limits page.

Supported prompt formats

If you selected an LLM data generation workflow that involves generating a prompt in the editor, you will need to specify a prompt to use as the ontology. Each LLM data generation ontology is limited to one prompt.

Feature                 | Import format | Export format
Prompt - Free-form text | N/A           | See payload

Supported response formats

If you selected an LLM data generation workflow that involves generating a response in the editor, you will need to specify a set of responses to use as the ontology. Below are the supported formats you may include when you are specifying responses in your ontology. Responses can be applied at the global level and/or nested within other responses. LLM data generation ontologies support multiple responses.

Feature              | Import format | Export format
Response - Text      | N/A           | See payload
Response - Radio     | N/A           | See payload
Response - Checklist | N/A           | See payload

Response - Text

Create a text response by selecting a response type of Text during ontology creation. You can optionally set a character minimum and maximum for text-type responses.

Response - Radio

Create a radio response by selecting a response type of Radio during ontology creation. Radio responses support nested sub-classifications.

Response - Checklist

Create a checklist response by selecting a response type of Checklist during ontology creation. Checklist responses support nested sub-classifications.
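
To plan an ontology before building it in the app, it can help to model the rules from this page as data. The sketch below is purely illustrative and is not the Labelbox SDK: every class and field name is hypothetical. It captures two points stated above (one prompt per ontology, and responses that can be nested within other responses) and adds a sanity check that a character minimum does not exceed its maximum.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ResponseType(Enum):
    TEXT = "text"
    RADIO = "radio"
    CHECKLIST = "checklist"


@dataclass
class Response:
    name: str
    type: ResponseType
    options: List[str] = field(default_factory=list)        # choices for radio/checklist
    min_chars: Optional[int] = None                          # optional, text responses
    max_chars: Optional[int] = None                          # optional, text responses
    nested: List["Response"] = field(default_factory=list)  # nested sub-classifications


@dataclass
class Ontology:
    prompts: List[str]
    responses: List[Response]

    def problems(self) -> List[str]:
        """Return a list of rule violations; an empty list means none were found."""
        issues: List[str] = []
        if len(self.prompts) > 1:
            issues.append("an ontology is limited to one prompt")

        def walk(response: Response) -> None:
            lo, hi = response.min_chars, response.max_chars
            if lo is not None and hi is not None and lo > hi:
                issues.append(f"{response.name}: character minimum exceeds maximum")
            for child in response.nested:  # check nested responses recursively
                walk(child)

        for response in self.responses:
            walk(response)
        return issues
```

For the SKU extraction example, a plan might hold one prompt, a radio response for sentiment, and a nested text response for the SKU; calling problems() on it then returns an empty list if the plan respects these rules.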