Import prompts and responses as pre-labels via Model-assisted labeling (MAL) or as complete labels via Ground truth import.
Open this Colab for an interactive tutorial on importing prompt and response data for the LLM data generation editor.
Supported annotations
Prompt
Classification: Free-form text
prompt_annotation_ndjson = {
    "name": "Follow the prompt and select answers",
    "answer": "This is an example of a prompt"
}
Python annotations not yet supported
Responses
Classification: Radio
response_radio_annotation_ndjson = {
    "name": "response_radio",
    "answer": {
        "name": "response_a"
    }
}
Python annotations not yet supported
Classification: Free-form text
response_text_annotation_ndjson = {
    "name": "Provide a reason for your choice",
    "answer": "This is an example of a response text"
}
Python annotations not yet supported
Classification: Checklist
response_checklist_annotation_ndjson = {
    "name": "response_checklist",
    "answer": [
        {
            "name": "response_a"
        },
        {
            "name": "response_c"
        }
    ]
}
Python annotations not yet supported
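All four payloads above share the same basic shape: a `name` that identifies the ontology feature and an `answer` that carries the value. Before uploading, you can run a quick local sanity check; `has_required_keys` below is a hypothetical helper for illustration, not part of the Labelbox SDK:

```python
# Hypothetical sanity check -- not part of the Labelbox SDK.
# Each NDJSON classification payload needs a feature "name" and an "answer".

prompt_annotation_ndjson = {
    "name": "Follow the prompt and select answers",
    "answer": "This is an example of a prompt"
}
response_radio_annotation_ndjson = {
    "name": "response_radio",
    "answer": {"name": "response_a"}
}
response_checklist_annotation_ndjson = {
    "name": "response_checklist",
    "answer": [{"name": "response_a"}, {"name": "response_c"}]
}

def has_required_keys(payload: dict) -> bool:
    """Return True if the payload carries both required keys."""
    return "name" in payload and "answer" in payload

# All example payloads in this guide pass the check
assert all(map(has_required_keys, [
    prompt_annotation_ndjson,
    response_radio_annotation_ndjson,
    response_checklist_annotation_ndjson,
]))
```

Catching a missing key locally is cheaper than diagnosing a rejected row in the upload job's error report.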
End-to-end example: Import prompt & responses
No SDK support
Creating a project and an ontology for LLM data generation is not yet supported through the SDK. Follow the steps below to create both via the UI.
Before you start
- Go to Annotate and select New project.
- Select LLM data generation and then select Humans generate prompts and responses.
- Name your project, select Create a new dataset, and name your dataset (data rows will be generated automatically in this step).
Step 1: Specify project ID and global keys
import uuid

import labelbox as lb

# Initialize the Labelbox client with your API key
client = lb.Client(api_key="")

# Enter the project ID
project_id = ""

# Select one of the global keys from the data rows generated
global_key = ""
Step 2: Create/select an ontology in the UI
- In your project, navigate to Settings > Label editor.
- Select Edit.
- Create a new ontology and add the features used in this demo.
# For this demo, the following ontology was generated in the UI.
ontology_json = """
{
    "tools": [],
    "relationships": [],
    "classifications": [
        {
            "schemaNodeId": "clpvq9d0002yt07zy0khq42rp",
            "featureSchemaId": "clpvq9d0002ys07zyf2eo9p14",
            "type": "prompt",
            "name": "Follow the prompt and select answers",
            "archived": false,
            "required": true,
            "options": [],
            "instructions": "Follow the prompt and select answers",
            "minCharacters": 5,
            "maxCharacters": 100
        },
        {
            "schemaNodeId": "clpvq9d0002yz07zy0fjg28z7",
            "featureSchemaId": "clpvq9d0002yu07zy28ik5w3i",
            "type": "response-radio",
            "name": "response_radio",
            "instructions": "response_radio",
            "scope": "global",
            "required": true,
            "archived": false,
            "options": [
                {
                    "schemaNodeId": "clpvq9d0002yw07zyci2q5adq",
                    "featureSchemaId": "clpvq9d0002yv07zyevmz1yoj",
                    "value": "response_a",
                    "label": "response_a",
                    "position": 0,
                    "options": []
                },
                {
                    "schemaNodeId": "clpvq9d0002yy07zy8pe48zdj",
                    "featureSchemaId": "clpvq9d0002yx07zy0jvmdxk8",
                    "value": "response_b",
                    "label": "response_b",
                    "position": 1,
                    "options": []
                }
            ]
        },
        {
            "schemaNodeId": "clpvq9d0002z107zygf8l62ys",
            "featureSchemaId": "clpvq9d0002z007zyg26115f9",
            "type": "response-text",
            "name": "provide_a_reason_for_your_choice",
            "instructions": "Provide a reason for your choice",
            "scope": "global",
            "required": true,
            "archived": false,
            "options": [],
            "minCharacters": 5,
            "maxCharacters": 100
        },
        {
            "schemaNodeId": "clpvq9d0102z907zy8b10hjcj",
            "featureSchemaId": "clpvq9d0002z207zy6xla7f82",
            "type": "response-checklist",
            "name": "response_checklist",
            "instructions": "response_checklist",
            "scope": "global",
            "required": true,
            "archived": false,
            "options": [
                {
                    "schemaNodeId": "clpvq9d0102z407zy0adq0rfr",
                    "featureSchemaId": "clpvq9d0002z307zy6dqb8xsw",
                    "value": "response_a",
                    "label": "response_a",
                    "position": 0,
                    "options": []
                },
                {
                    "schemaNodeId": "clpvq9d0102z607zych8b2z5d",
                    "featureSchemaId": "clpvq9d0102z507zyfwfgacrn",
                    "value": "response_c",
                    "label": "response_c",
                    "position": 1,
                    "options": []
                },
                {
                    "schemaNodeId": "clpvq9d0102z807zy03y7gysp",
                    "featureSchemaId": "clpvq9d0102z707zyh61y5o3u",
                    "value": "response_d",
                    "label": "response_d",
                    "position": 2,
                    "options": []
                }
            ]
        }
    ],
    "realTime": false
}
"""
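Because the ontology is an ordinary JSON string, you can parse it with the standard library to inspect which classifications it defines. The trimmed example below mirrors the structure of the ontology above, with IDs and option lists omitted for brevity:

```python
import json

# Trimmed ontology mirroring the structure above (IDs omitted for brevity)
ontology_json = """
{
    "tools": [],
    "classifications": [
        {"type": "prompt", "name": "Follow the prompt and select answers"},
        {"type": "response-radio", "name": "response_radio"},
        {"type": "response-text", "name": "provide_a_reason_for_your_choice"},
        {"type": "response-checklist", "name": "response_checklist"}
    ]
}
"""

ontology = json.loads(ontology_json)

# Map each classification name to its type for a quick overview
feature_types = {c["name"]: c["type"] for c in ontology["classifications"]}
```

A quick map like this is handy for double-checking that the `name` fields in your annotation payloads line up with the features the ontology actually defines.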
Step 3: Create the annotations payload
label_ndjson = []
for annotation in [
    prompt_annotation_ndjson,
    response_radio_annotation_ndjson,
    response_text_annotation_ndjson,
    response_checklist_annotation_ndjson
]:
    annotation.update({
        "dataRow": {
            "globalKey": global_key
        }
    })
    label_ndjson.append(annotation)
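NDJSON is newline-delimited JSON: on upload, each annotation becomes one JSON object per line, and the `dataRow.globalKey` field tells Labelbox which data row it belongs to. The self-contained sketch below illustrates what the loop above produces for two of the payloads (the global key is a placeholder value):

```python
import json

global_key = "my-global-key"  # placeholder for one of your generated global keys

# Two sample payloads after the loop above has attached the dataRow reference
label_ndjson = [
    {
        "name": "response_radio",
        "answer": {"name": "response_a"},
        "dataRow": {"globalKey": global_key}
    },
    {
        "name": "Provide a reason for your choice",
        "answer": "This is an example of a response text",
        "dataRow": {"globalKey": global_key}
    }
]

# Serialized as NDJSON: one JSON object per line
ndjson_text = "\n".join(json.dumps(entry) for entry in label_ndjson)
```

In practice the SDK handles this serialization for you; the point is only that every entry in `label_ndjson` must be a flat, JSON-serializable dictionary carrying its own `dataRow` reference.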
Python annotations not yet supported
Step 4: Upload prompt and responses as pre-labels or complete labels
project = client.get_project(project_id=project_id)
Import as pre-labels via Model-assisted labeling
upload_job = lb.MALPredictionImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name=f"mal_job-{str(uuid.uuid4())}",
    predictions=label_ndjson
)
upload_job.wait_until_done()
print("Errors:", upload_job.errors)
print("Status of uploads:", upload_job.statuses)
Import as Ground truth
upload_job = lb.LabelImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name="label_import_job" + str(uuid.uuid4()),
    labels=label_ndjson
)
upload_job.wait_until_done()
print("Errors:", upload_job.errors)
print("Status of uploads:", upload_job.statuses)