📘

Supported text file formats and size

Format: https path to a cloud-hosted .txt file
Currently, Labelbox supports only text files encoded as UTF-8. Special character sequences such as HTML entities, Unicode escape sequences, and colon emoji aliases (for example, &amp;, \u00e9, or :smile:) are not processed.

Labelbox accepts files up to 256 MB. This limit applies to all import methods (direct upload, IAM Delegated Access, and signed URLs).
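
As a rough local pre-check before importing, the two constraints above (UTF-8 encoding and the 256 MB limit) can be sketched in Python. The function names are hypothetical and not part of any Labelbox SDK. Reading "256 MB" as 256 × 1024 × 1024 bytes is also an assumption, since the exact byte count is not stated here.

```python
from pathlib import Path
from typing import List

# Assumption: "256 MB" is interpreted as 256 * 1024 * 1024 bytes.
MAX_BYTES = 256 * 1024 * 1024


def check_text_payload(data: bytes, max_bytes: int = MAX_BYTES) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if len(data) > max_bytes:
        problems.append(f"file is {len(data)} bytes, over the {max_bytes}-byte limit")
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append(f"not valid UTF-8 at byte offset {err.start}")
    return problems


def check_text_file(path: str) -> List[str]:
    """Hypothetical convenience wrapper: run the checks on a local file."""
    return check_text_payload(Path(path).read_bytes())
```

This sketch loads the whole file into memory, which is acceptable for a one-off check but may be worth replacing with a streamed read for files near the limit.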

❗️

Raw text uploads

Labelbox has deprecated raw text uploads. You must provide links to .txt files, either through delegated access or as an authenticated URL to the file.

When importing text data to Labelbox, your JSON file must include the following information for each text file.

Parameter | Required | Description
externalId | Yes | User-generated file name or ID for the file. For the best experience, this ID should be unique.
data | Yes | Accepts an HTTPS path to an external text file (emojis are supported in cloud-hosted .txt files). The .txt file must be encoded as UTF-8. For IAM Delegated Access, this URL must be in virtual-hosted-style format. S3 buckets in older regions may produce URLs in the https://<bucket-name>.s3-<region>.amazonaws.com/<key> format. If your object URLs are formatted this way, convert them to virtual-hosted-style format (https://<bucket-name>.s3.<region>.amazonaws.com/<key>) before importing.
attachments | No | Attachments
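
The S3 URL requirement for the data parameter can be sketched as a small rewrite step. This is a minimal illustration, not a Labelbox utility: the function name is hypothetical, and the sketch assumes the only difference to fix is the legacy "s3-<region>" host segment, which it rewrites to "s3.<region>". URLs that do not match the legacy pattern are returned unchanged.

```python
import re

# Legacy dash-style regional host: <bucket>.s3-<region>.amazonaws.com
_LEGACY_S3_URL = re.compile(
    r"^https://([^/]+)\.s3-([a-z0-9-]+)\.amazonaws\.com/(.*)$"
)


def to_virtual_hosted_style(url: str) -> str:
    """Rewrite a legacy dash-style S3 URL to the dot-style regional form."""
    match = _LEGACY_S3_URL.match(url)
    if match is None:
        # Not a legacy-style URL; leave it untouched.
        return url
    bucket, region, key = match.groups()
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
```

For example, a URL on the hypothetical host my-bucket.s3-us-west-1.amazonaws.com would be rewritten to use my-bucket.s3.us-west-1.amazonaws.com, with the object key left as-is.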

Example (file hosted on Google Cloud Storage):
[
    {
        "externalId": "lorem-ipsum.txt",
        "data": "https://storage.googleapis.com/labelbox-sample-datasets/nlp/lorem-ipsum.txt"
    }
]

Example (file hosted on Amazon S3):
[
    {
        "externalId": "plaintext+test.txt",
        "data": "https://lb-test-data.s3.us-west-1.amazonaws.com/plaintext+test.txt"
    }
]
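
An import file in the shape shown above can also be generated from a list of URLs. The following is a minimal sketch under a few assumptions: the function name is hypothetical, the externalId is derived from the last path segment of each URL (any unique ID would do), and duplicate IDs are rejected outright, which is stricter than the "should be unique" recommendation.

```python
import json
from urllib.parse import urlparse


def build_text_rows(urls):
    """Build one {"externalId", "data"} entry per HTTPS URL."""
    rows = []
    seen_ids = set()
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme != "https":
            raise ValueError(f"data must be an https URL: {url}")
        # Assumption: use the file name at the end of the path as the ID.
        external_id = parsed.path.rsplit("/", 1)[-1]
        if external_id in seen_ids:
            raise ValueError(f"duplicate externalId: {external_id}")
        seen_ids.add(external_id)
        rows.append({"externalId": external_id, "data": url})
    return rows


if __name__ == "__main__":
    sample_urls = [
        "https://storage.googleapis.com/labelbox-sample-datasets/nlp/lorem-ipsum.txt",
        "https://lb-test-data.s3.us-west-1.amazonaws.com/plaintext+test.txt",
    ]
    # Write the result to a .json file to use it as the import file.
    print(json.dumps(build_text_rows(sample_urls), indent=4))
```

The optional attachments parameter is left out of this sketch because its structure is not described in this section.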