📘

Supported text file formats and import methods

File format: TXT
Text encoding: UTF-8 (Note: The Editor does not interpret special character sequences such as HTML entities, Unicode escape sequences, or colon emoji aliases; they are shown as literal text.)

Import methods:

  • Direct upload (256 MB max file size)
  • IAM Delegated Access
  • Signed URLs (https URLs only)
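
The format, encoding, and size limits above can be checked locally before an upload. The following is a minimal sketch, not part of any Labelbox tooling: the function name check_text_file is hypothetical, and it assumes "256 MB" means 256 * 1024 * 1024 bytes, which the limits above do not specify.

```python
from pathlib import Path

# Assumption: "256 MB" is read as 256 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 256 * 1024 * 1024


def check_text_file(path):
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    p = Path(path)

    # Only .txt files are listed as a supported format.
    if p.suffix.lower() != ".txt":
        problems.append("file extension is not .txt")

    # Direct upload has a maximum file size.
    if p.stat().st_size > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 256 MB")

    # The file must decode cleanly as UTF-8.
    try:
        p.read_bytes().decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"file is not valid UTF-8: {exc.reason}")

    return problems
```

The function reads the whole file into memory, which is acceptable for a one-off check but could be replaced with chunked decoding for files near the size limit.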

❗️

Raw text uploads

Labelbox has deprecated raw text uploads. Instead, provide links to .txt files, either through delegated access or as an authenticated URL to each file.

When importing text data to Labelbox, your JSON file must include the following information for each text file.

| Parameter | Required | Description |
| --- | --- | --- |
| externalId | Yes | User-generated file name or ID for the file. For the best experience, this ID should be unique. |
| data | Yes | Accepts an https path to an external text file (emojis are supported for cloud-hosted .txt files). The .txt file must be encoded as UTF-8. For IAM Delegated Access, this URL must be in virtual-hosted-style format: https://<bucket-name>.s3.<region>.amazonaws.com/<key>. In older regions, S3 object URLs may instead use the legacy https://<bucket-name>.s3-<region>.amazonaws.com/<key> format. If your object URLs are formatted this way, convert them to virtual-hosted-style format before importing. |
| attachments | No | Attachments |
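
The URL conversion described for the data parameter can be automated. The following is a minimal sketch under stated assumptions: the helper name to_virtual_hosted_style is hypothetical, and it only rewrites the legacy dash form (s3-<region>) into the dot form (s3.<region>). Any other URL, including path-style S3 URLs, is returned unchanged.

```python
import re

# Matches the legacy form: https://<bucket-name>.s3-<region>.amazonaws.com/<key>
_LEGACY_S3_URL = re.compile(
    r"^https://(?P<bucket>[^/]+?)\.s3-(?P<region>[a-z0-9-]+)"
    r"\.amazonaws\.com/(?P<key>.*)$"
)


def to_virtual_hosted_style(url):
    """Rewrite a legacy s3-<region> URL to the s3.<region> form.

    URLs that do not match the legacy pattern are returned as-is.
    """
    match = _LEGACY_S3_URL.match(url)
    if match is None:
        return url
    return "https://{bucket}.s3.{region}.amazonaws.com/{key}".format(
        **match.groupdict()
    )
```

For example, a hypothetical URL such as https://my-bucket.s3-us-west-1.amazonaws.com/docs/a.txt would be rewritten to https://my-bucket.s3.us-west-1.amazonaws.com/docs/a.txt.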

Example (Google Cloud Storage):

[
    {
        "externalId": "lorem-ipsum.txt",
        "data": "https://storage.googleapis.com/labelbox-sample-datasets/nlp/lorem-ipsum.txt"
    }
]

Example (Amazon S3):

[
    {
        "externalId": "plaintext+test.txt",
        "data": "https://lb-test-data.s3.us-west-1.amazonaws.com/plaintext+test.txt"
    }
]
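
An import file in this shape can be generated from a list of URLs. The following is a minimal sketch using only the Python standard library, not the Labelbox SDK. The function name build_import_rows is hypothetical, and taking the externalId from the last path segment of the URL is an assumption that mirrors the samples. Rejecting duplicate IDs is also a choice made here; the requirement is only that IDs should ideally be unique.

```python
import json
from urllib.parse import urlparse


def build_import_rows(urls):
    """Build a list of {"externalId", "data"} rows from https URLs."""
    rows = []
    seen_ids = set()
    for url in urls:
        parsed = urlparse(url)
        # Only https paths are accepted for the data field.
        if parsed.scheme != "https":
            raise ValueError(f"only https URLs are accepted: {url}")

        # Assumption: use the file name at the end of the path as the ID.
        external_id = parsed.path.rsplit("/", 1)[-1]
        if not external_id:
            raise ValueError(f"URL has no file name: {url}")
        if external_id in seen_ids:
            raise ValueError(f"duplicate externalId: {external_id}")

        seen_ids.add(external_id)
        rows.append({"externalId": external_id, "data": url})
    return rows


if __name__ == "__main__":
    sample = [
        "https://storage.googleapis.com/labelbox-sample-datasets/nlp/lorem-ipsum.txt",
    ]
    # Write the rows out in the same layout as the samples.
    print(json.dumps(build_import_rows(sample), indent=4))
```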