Text editor

When you attach a text dataset to a project, the Labelbox Editor automatically adjusts its interface for text labeling.

Supported annotation types

Below are all of the annotation types you may include in your ontology when you are labeling text data. Classification-type annotations can be applied at the global level and/or nested within an Object-type annotation.

Annotation type: Import / Export

  • Entity (NER): See sample (import), See sample (export)

  • Annotation relationships: See sample

  • Radio classification: See sample (import), See sample (export)

  • Checklist classification: See sample (import), See sample (export)

  • Free-form text classification: See sample (import), See sample (export)

  • Dropdown classification: See sample

How it works

Natural Language Processing (NLP) is an area of research and application that explores how to use computers to “understand” and manipulate natural language, such as text or speech. Most NLP techniques rely on machine learning to derive meaning from human languages. One of NLP’s methodologies for processing natural language is text classification, a method that leverages deep learning to categorize sequences of unstructured text.

Named Entity Recognition (NER) is a subtask of information extraction whereby entities in the unstructured text are classified into pre-determined categories. You can use the Labelbox Editor to create NER training data for your ML model by labeling sequences of characters in your text file with the Entity annotation.

When you load a text file into the Editor, you can use the Entity annotation to label sequences of characters in the unstructured text. The characters in your text file are not restricted to a single Entity annotation, meaning Entity annotations can overlap.
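Because Entity annotations can overlap, code that consumes them should not assume the spans are disjoint. The following is a minimal sketch of an overlap check. It assumes each annotation is represented as a hypothetical (label, start, end) tuple with an exclusive end offset; this tuple layout and the overlapping_pairs helper are illustrative only and are not part of the Labelbox export format.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical in-memory representation: (label, start, end), end exclusive.
Span = Tuple[str, int, int]


def overlapping_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return every pair of spans that share at least one character."""
    return [
        (a, b)
        for a, b in combinations(spans, 2)
        # Two half-open ranges overlap when each starts before the other ends.
        if a[1] < b[2] and b[1] < a[2]
    ]


# Example text: "New York University"
spans = [
    ("location", 0, 8),        # "New York"
    ("organization", 0, 19),   # "New York University"
    ("institution_type", 9, 19),  # "University"
]

for first, second in overlapping_pairs(spans):
    print(first[0], "overlaps", second[0])
```

In this example the "location" and "institution_type" spans do not overlap each other, but both overlap the longer "organization" span.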

When you export your NER annotations from Labelbox, each annotation in the export contains location.start and location.end values that indicate which characters in the unstructured text are included in that Entity annotation. See the Data model reference for an example.

  • The value for location.start is the index of the first character in the Entity annotation and assumes start-index inclusion.

  • The value for location.end marks where the Entity annotation ends and assumes end-index exclusion, so the character at location.end itself is not part of the annotation.
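As a minimal sketch of how these offsets can be read, the snippet below slices the annotated characters out of the source text, following the convention stated above (start inclusive, end exclusive). The annotation dictionary and the entity_text helper are hypothetical and only illustrate the location.start and location.end fields; the Data model reference describes the actual export structure.

```python
def entity_text(text: str, location: dict) -> str:
    """Return the characters covered by an Entity annotation.

    Assumes location["start"] is inclusive and location["end"] is exclusive,
    which lines up with Python's slice semantics.
    """
    start, end = location["start"], location["end"]
    if not 0 <= start <= end <= len(text):
        raise ValueError(f"offsets {start}..{end} fall outside the text")
    return text[start:end]


# Hypothetical annotation record; only the location fields come from the docs.
text = "Labelbox is based in San Francisco."
annotation = {"title": "location", "location": {"start": 21, "end": 34}}

print(entity_text(text, annotation["location"]))
```

With an exclusive end offset, the length of an annotation is simply end minus start, and two adjacent annotations can share a boundary value without sharing a character.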

You can also nest classification-type annotations within an Entity annotation. Nested classifications for text are supported for all classification types listed under Supported annotation types above.


What’s Next

See the JSON format for importing text data to Labelbox.
