Text entity can be applied at the global level and can be nested within an object-type annotation.
Import
Python SDK
TextEntity is a type of ObjectAnnotation.
Definition
TextEntity(start=start_location,end=end_location)
Parameter | Value |
---|---|
start | Start location of the character in the text. The minimum value for start is 0. |
end | End location of the character in the text. The maximum value for end is the number of characters in the text minus 1. |
Boundaries
In the text "Label data quickly and accurately with a configurable workflow and automated labeling.", the index of the first character ("L") is 0 and the index of the final period is 85. The text itself is 86 characters long.
If "end" exceeds the number of characters of the text, then the annotation will not be displayed in the editor.
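The boundary rules above can be checked before an annotation is created. The following is a minimal sketch, not part of the SDK: `is_valid_span` is a hypothetical helper, and the requirement that `start` not exceed `end` is an assumption rather than something this page states.

```python
text = (
    "Label data quickly and accurately with a configurable workflow "
    "and automated labeling."
)

def is_valid_span(text: str, start: int, end: int) -> bool:
    """Return True when start and end are valid inclusive character indices.

    Hypothetical pre-check: start must be at least 0 and end must be at most
    len(text) - 1. Requiring start <= end is an assumption.
    """
    return 0 <= start <= end <= len(text) - 1

first_index = 0              # index of "L"
last_index = len(text) - 1   # index of the final period

print(len(text), text[first_index], text[last_index])
print(is_valid_span(text, 10, 20))   # inside the text
print(is_valid_span(text, 10, 86))   # end is past the last character
```

A span that fails this check is one whose "end" exceeds the text, which is the case the editor would not display.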
Supported data types
Data type | Supported |
---|---|
Text, Document | Yes |
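from labelbox.data.annotation_types import ObjectAnnotation, TextEntity
# Entity covering character indices 10 to 20
named_entity = TextEntity(start=10, end=20)
named_entity_annotation = ObjectAnnotation(value=named_entity, name="named_entity")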
NDJSON
NDJSON format is recommended if an annotation type is not yet supported in Python SDK or if you are unable to use the Python environment.
Definition
Parameter | Asset type | Required | Description |
---|---|---|---|
uuid | Text, PDF | Yes | A user-generated UUID for each annotation. If you import an annotation to a Data Row and there is already an imported annotation with the same uuid on that Data Row, the latest import will override the previous one. The uuid must be 128 bits (32 hexadecimal characters). The following formats are supported: - A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11 - {a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11} - a0eebc999c0b4ef8bb6d6bb9bd380a11 - a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11 - {a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11} |
schemaId | Text, PDF | Yes | The ID of the schema that contains all of the information needed for rendering your annotation. |
dataRow.id | Text, PDF | Yes | The ID of the Data Row where you want to attach the imported annotations. |
location.start | Text | Yes | The index of the first character in the Entity annotation. Assumes start-index inclusion. |
location.end | Text | Yes | The index of the last character in the Entity annotation. Assumes end-index exclusion (character 128 in the example below would be excluded from the Entity annotation). |
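The uuid formats listed in the table all encode the same 128-bit value. As an illustration only, Python's standard `uuid` module accepts each of these spellings and normalizes them to one canonical string. This shows the standard library's parsing, not Labelbox's own validation, and `normalize_uuid` is a hypothetical helper name.

```python
import uuid

# The five example spellings from the parameter table.
SUPPORTED_EXAMPLES = [
    "A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11",
    "{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}",
    "a0eebc999c0b4ef8bb6d6bb9bd380a11",
    "a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11",
    "{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}",
]

def normalize_uuid(value: str) -> str:
    """Parse a UUID string and return its canonical lowercase form.

    uuid.UUID ignores braces and hyphens and raises ValueError when the
    remaining text is not exactly 32 hexadecimal digits.
    """
    return str(uuid.UUID(value))

normalized = {normalize_uuid(value) for value in SUPPORTED_EXAMPLES}
print(normalized)
```

Normalizing before import makes it easier to spot two annotations that would override each other because they share a uuid on the same Data Row.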
Supported data types
Data type | Supported |
---|---|
Text, Document | Yes |
Format
{
  "uuid": "9fd9a92e-2560-4e77-81d4-b2e955800092",
  "schemaId": "ck8kukafkqx1a0880iczbrqym",
  "dataRow": {
    "id": "ck1s02fqxm8fi0757f0e6qtdc"
  },
  "location": {
    "start": 67,
    "end": 128
  }
}
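A payload in this format can be assembled with the standard library alone. The sketch below is an assumption-laden illustration, not an official client: `entity_to_ndjson_line` is a hypothetical helper, the schema and Data Row IDs are the placeholder values from the example on this page, and the end index follows the end-exclusive convention described in the NDJSON parameter table.

```python
import json
import uuid

def entity_to_ndjson_line(text: str, phrase: str, schema_id: str, data_row_id: str) -> str:
    """Build one NDJSON line annotating the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} does not occur in the text")
    end = start + len(phrase)  # end-exclusive, per the NDJSON parameter table
    annotation = {
        "uuid": str(uuid.uuid4()),  # user-generated UUID for this annotation
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "location": {"start": start, "end": end},
    }
    return json.dumps(annotation)

text = (
    "Label data quickly and accurately with a configurable workflow "
    "and automated labeling."
)
line = entity_to_ndjson_line(
    text,
    "configurable workflow",
    schema_id="ck8kukafkqx1a0880iczbrqym",    # placeholder from the example
    data_row_id="ck1s02fqxm8fi0757f0e6qtdc",  # placeholder from the example
)
print(line)
```

Writing one such line per annotation, separated by newlines, yields an NDJSON file.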
Export
When you export your Entity annotations from Labelbox, the export file will contain the following information for each Entity.
Python SDK
The export format of the Entity annotation is similar to the import format.
Learn more about exporting annotations using the SDK here
JSON
You will receive a JSON file when you generate an export from the app.
Parameter | Asset type | Description |
---|---|---|
schemaId | Text, Document | The ID of the schema containing all of the information needed for rendering your annotation. |
featureId | Text, Document | ID of the annotation. |
title | Text, Document | Name of the annotation in the ontology. |
color | Text, Document | Color of the annotation in the ontology. |
version | Text | Export format version. |
format | Text | Export format specification. |
data.location.start | Text | The index of the first character in the Entity annotation. Assumes start-index inclusion. |
data.location.end | Text | The index of the last character in the Entity annotation. Assumes end-index exclusion. |
bbox.top | Document | y-coordinate of Bounding box top-left corner. |
bbox.left | Document | x-coordinate of Bounding box top-left corner. |
bbox.height | Document | Height of Bounding box in pixels. |
bbox.width | Document | Width of Bounding box in pixels. |
page | Document | Page of the document containing the annotation. |
instanceURI | Document | Annotation information hosted on Labelbox servers. |
Format
{
  "featureId": "ck8kulppv000x0yf8pqpqqin4",
  "schemaId": "ck8kukafkqx1a0880iczbrqym",
  "title": "Entity type A",
  "value": "entity_type_a",
  "color": "#8000FF",
  "version": 1,
  "format": "text.location",
  "data": {
    "location": {
      "start": 67,
      "end": 128
    }
  }
}
"objects": [
  {
    "featureId": "cl899p4jn0amu3b6ucjoamoyi",
    "schemaId": "cl899obmq004qt2mfay3673pu",
    "color": "#1CE6FF",
    "title": "Text",
    "value": "text",
    "data": {
      "customLayerMetadata": {
        "text": "evaluated by difference",
        "textLayerUrl": "https://storage.googleapis.com/labelbox-developer-testing-assets/pdf/pdf-custom-text-layer/E2G5F105-lb-custom-text-layer.json",
        "textSelections": [
          {
            "groupId": "9cec3650-ed4f-4261-adf0-283a5f4ea736",
            "page": 0,
            "text": "evaluated by difference",
            "tokenIds": [
              "16e7328e-bc52-41a6-9530-6700d6d6809a",
              "0c01d928-0c89-4208-92e6-e306c5d85756",
              "71c89c58-f60a-4f0e-8696-87e6e350bade",
              "554b16cc-ee6d-4e2a-b830-59829e0c892f",
              "908b5d7f-71d7-464d-8a93-b1c7164c657c",
              "107e8b38-5ae7-4d8e-b75f-7d7213282e8b",
              "a68db17f-81d3-48f6-b6c8-acba25871445",
              "ad98bf61-4b77-405e-9014-ced4e0fd346d",
              "7382f859-ed2d-4cc7-b17e-ffac0a4ac05f",
              "92e1fe01-1970-42bb-8ed5-4e48e2395ca0",
              "42503f3a-4107-4a4a-b522-cfd88de0963f",
              "fe71c1b6-9402-4737-b0cb-3c8853d1364b",
              "84ead276-f302-4039-900b-7ca81fc0c5c4",
              "da009cd4-47e4-417f-94d1-b51e816dc72d",
              "825316bd-6443-4dff-9002-3fdf92d6a8e6",
              "fa81eab7-9f06-48c6-b6db-4a432d633b37",
              "be7b0345-0d96-47b1-bb43-a7fa1365c31d"
            ]
          }
        ]
      }
    }
  }
]
{
  "featureId": "cl2zji82m00043g6cunt8telp",
  "schemaId": "cl2zjh37t0jrd08965lr80xva",
  "color": "#1CE6FF",
  "title": "Person",
  "value": "person",
  "version": 1,
  "format": "text.location",
  "data": {
    "location": {
      "messageId": "2",
      "start": 0,
      "end": 22
    }
  }
}
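To recover the annotated text from an exported Entity, the start and end indices can be applied to the original text. The sketch below assumes the end-exclusive convention stated in the export parameter table, which maps directly onto a Python slice. The message text is invented for illustration, and `entity_text` is a hypothetical helper.

```python
import json

# An exported entity shaped like the last example above.
export_entry = json.loads("""
{
  "featureId": "cl2zji82m00043g6cunt8telp",
  "schemaId": "cl2zjh37t0jrd08965lr80xva",
  "title": "Person",
  "value": "person",
  "format": "text.location",
  "data": {"location": {"messageId": "2", "start": 0, "end": 22}}
}
""")

def entity_text(source_text: str, entry: dict) -> str:
    """Return the characters covered by an exported text.location entity.

    Assumes start is inclusive and end is exclusive, so the pair can be
    used as slice bounds unchanged.
    """
    location = entry["data"]["location"]
    return source_text[location["start"]:location["end"]]

# Hypothetical source text for message "2"; not taken from a real export.
message_text = "Ada Lovelace of London wrote the first published program."
print(export_entry["title"], "->", entity_text(message_text, export_entry))
```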