# Import document data

> How to import document data and sample import formats.

## Specifications

Format: PDF

Recommended size: 100 pages or fewer

Import methods:

* IAM Delegated Access
* Signed URLs (`https` URLs only)

When you import document data to Labelbox, you no longer need to provide an OCR extract as a JSON file. If a data row doesn't include a text layer, Labelbox generates one automatically during PDF import using Google Document AI. The generated JSON file becomes your text layer, which is rendered on top of your PDF in the Document Editor.

<Info>
  ### Text layer limit

  Previously generated PDF documents without text layers can't be retroactively filled with the text layer generated by Labelbox.
</Info>

Google Document AI has the following limitations:

* The document must have no more than 15 pages.
* The file size must not exceed 20 MB.
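
The limits above can be checked before import with a small helper. This is a minimal sketch, not part of the Labelbox SDK: the function name `document_ai_limit_problems` is hypothetical, and it assumes "20 MB" means 20 × 1024 × 1024 bytes. The caller supplies the page count and file size.

```python
# Limits quoted in this section for Google Document AI text layer generation.
MAX_PAGES = 15
# Assumption: "20 MB" is read as 20 * 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024


def document_ai_limit_problems(page_count, size_bytes):
    """Return the reasons a PDF exceeds the stated limits (empty list if none)."""
    problems = []
    if page_count > MAX_PAGES:
        problems.append(f"{page_count} pages exceeds the {MAX_PAGES}-page limit")
    if size_bytes > MAX_BYTES:
        problems.append(f"{size_bytes} bytes exceeds the {MAX_BYTES}-byte limit")
    return problems
```

The file size can come from `os.path.getsize(path)`. The page count has to come from whichever PDF library you already use.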

Additionally, Google Document AI optimizes documents before OCR processing. This optimization might include rotating images or pages to ensure text appears horizontally. Consequently, token coordinates are calculated based on the rotated/optimized images, resulting in potential discrepancies with the original PDF document.

For example, a landscape-oriented PDF is rotated 90 degrees before processing. As a result, all tokens in the text layer are also rotated by 90 degrees.
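
If you need to map such rotated coordinates back onto the original page, the arithmetic can be sketched as follows. This is an illustration under explicit assumptions, not documented Labelbox or Document AI behavior: it assumes the page was rotated 90 degrees clockwise, that geometry values are fractions between 0 and 1 measured from the top-left corner, and the helper name `unrotate_clockwise_90` is hypothetical. Check the actual rotation direction of your documents before relying on it.

```python
def unrotate_clockwise_90(geometry):
    """Map a normalized box on a page rotated 90 degrees clockwise
    back to the coordinate frame of the original, unrotated page."""
    left, top = geometry["left"], geometry["top"]
    width, height = geometry["width"], geometry["height"]
    # Assumption: a clockwise rotation sends an original point (x, y) to
    # (1 - y, x), so the inverse is x = y' and y = 1 - x'.
    return {
        "left": top,
        "top": 1.0 - left - width,
        "width": height,   # the box's extents swap axes
        "height": width,
    }
```

For a counterclockwise rotation the mapping differs, so treat this only as a template for the coordinate transformation.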

<Warning>
  ### Image encoding

  If your PDF files contain images, use JPEG encoding and RGB colorspace for color images.
</Warning>

## Text Layer Validation Schema

If you want to upload your own text layer, the textLayer JSON file must adhere to the following JSON schema.

<CodeGroup>
  ```json Text Layer Validation Schema expandable theme={null}
  {
    "type": "array",
    "items": {
      "$ref": "#/$defs/page"
    },
    "$defs": {
      "page": {
        "type": "object",
        "properties": {
          "width": {
            "type": "number"
          },
          "height": {
            "type": "number"
          },
          "number": {
            "type": "number"
          },
          "units": {
            "enum": ["POINTS", "PERCENT"]
          },
          "groups": {
            "type": "array",
            "items": {
              "$ref": "#/$defs/group"
            }
          }
        },
        "required": ["number", "units", "groups"]
      },
      "group": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string"
          },
          "content": {
            "type": "string"
          },
          "geometry": {
            "$ref": "#/$defs/geometry"
          },
          "tokens": {
            "type": "array",
            "items": {
              "$ref": "#/$defs/token"
            }
          }
        },
        "required": ["id", "content", "geometry", "tokens"]
      },
      "geometry": {
        "type": "object",
        "properties": {
          "left": {
            "type": "number"
          },
          "top": {
            "type": "number"
          },
          "width": {
            "type": "number"
          },
          "height": {
            "type": "number"
          }
        },
        "required": ["left", "top", "width", "height"]
      },
      "token": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string"
          },
          "content": {
            "type": "string"
          },
          "geometry": {
            "$ref": "#/$defs/geometry"
          }
        },
        "required": ["id", "geometry", "content"]
      }
    }
  }
  ```

  ```json Document AI example expandable theme={null}
  [
      {
          "width": 1601,
          "height": 2498,
          "number": 1,
          "units": "PERCENT",
          "groups": [
              {
                  "id": "b4f4e1da-4088-44b3-a578-22f88ce9e166",
                  "content": "ΑΝ",
                  "geometry": {
                      "left": 0.4846970736980438,
                      "top": 0.17333866655826569,
                      "width": 0.028107434511184692,
                      "height": 0.007205769419670105
                  },
                  "tokens": [
                      {
                          "id": "0791adf8-80e9-4d3d-9d37-b3ad42dd061e",
                          "content": "ΑΝ",
                          "geometry": {
                              "left": 0.4846970736980438,
                              "top": 0.17333866655826569,
                              "width": 0.028107434511184692,
                              "height": 0.007205769419670105
                          }
                      }
                  ]
              },
              {
                  "id": "6f7c7e45-b9e8-4845-8c4e-63915e4a2e3d",
                  "content": "ESSAY",
                  "geometry": {
                      "left": 0.43410369753837585,
                      "top": 0.22377902269363403,
                      "width": 0.1274203360080719,
                      "height": 0.014811843633651733
                  },
                  "tokens": [
                      {
                          "id": "5adfb30a-235a-41c0-9902-a930cad660fc",
                          "content": "ESSAY",
                          "geometry": {
                              "left": 0.43410369753837585,
                              "top": 0.22377902269363403,
                              "width": 0.1274203360080719,
                              "height": 0.014811843633651733
                          }
                      }
                  ]
              },
              {
                  "id": "beec4d6e-cc9f-45e8-858c-c6c3f7c3ae49",
                  "content": "ON THE",
                  "geometry": {
                      "left": 0.46283572912216187,
                      "top": 0.2810248136520386,
                      "width": 0.07307934761047363,
                      "height": 0.006805449724197388
                  },
                  "tokens": [
                      {
                          "id": "a9456ab4-bfd0-49f2-a10b-c0ad2fa2fbb4",
                          "content": "ON",
                          "geometry": {
                              "left": 0.46283572912216187,
                              "top": 0.2810248136520386,
                              "width": 0.0237351655960083,
                              "height": 0.006405144929885864
                          }
                      },
                      {
                          "id": "fbdb497f-8241-406c-a06e-87027fb9b0b2",
                          "content": "THE",
                          "geometry": {
                              "left": 0.4971892535686493,
                              "top": 0.2810248136520386,
                              "width": 0.038725823163986206,
                              "height": 0.006405144929885864
                          }
                      }
                  ]
              },
              {
                  "id": "4caeda09-c92e-4c47-bb7a-3aea91a7dc29",
                  "content": "PRINCIPLE OF POPULATION,",
                  "geometry": {
                      "left": 0.2841973900794983,
                      "top": 0.31545236706733704,
                      "width": 0.425983726978302,
                      "height": 0.013210564851760864
                  },
                  "tokens": [
                      {
                          "id": "493ef02d-8c10-45cb-a889-10abc907a30c",
                          "content": "PRINCIPLE",
                          "geometry": {
                              "left": 0.2841973900794983,
                              "top": 0.31545236706733704,
                              "width": 0.15990003943443298,
                              "height": 0.013210564851760864
                          }
                      },
                      {
                          "id": "23ae78f7-9c04-43cb-b7a0-180806ec4472",
                          "content": "OF",
                          "geometry": {
                              "left": 0.4609619081020355,
                              "top": 0.31545236706733704,
                              "width": 0.036851972341537476,
                              "height": 0.013210564851760864
                          }
                      },
                      {
                          "id": "33462d2b-af22-456e-bd3b-5ed03ca091f3",
                          "content": "POPULATION",
                          "geometry": {
                              "left": 0.5115552544593811,
                              "top": 0.31545236706733704,
                              "width": 0.19300436973571777,
                              "height": 0.013210564851760864
                          }
                      },
                      {
                          "id": "f989c150-075b-4ea8-ac30-3cb8bd5697a7",
                          "content": ",",
                          "geometry": {
                              "left": 0.7033104300498962,
                              "top": 0.31545236706733704,
                              "width": 0.006870687007904053,
                              "height": 0.013210564851760864
                          }
                      }
                  ]
              },
              {
                  "id": "be9c8936-10b8-4013-8259-b34d309b1ce9",
                  "content": "AS IT AFFECTS",
                  "geometry": {
                      "left": 0.42910680174827576,
                      "top": 0.3570856750011444,
                      "width": 0.13991257548332214,
                      "height": 0.006805449724197388
                  },
                  "tokens": [
                      {
                          "id": "3af460fb-30cf-4988-82a3-4bdd6814d37e",
                          "content": "AS",
                          "geometry": {
                              "left": 0.42910680174827576,
                              "top": 0.3570856750011444,
                              "width": 0.02123674750328064,
                              "height": 0.006805449724197388
                          }
                      },
                      {
                          "id": "56b8b804-d802-48ca-af12-9fca0f16fe22",
                          "content": "IT",
                          "geometry": {
                              "left": 0.4603372812271118,
                              "top": 0.3570856750011444,
                              "width": 0.018113672733306885,
                              "height": 0.006805449724197388
                          }
                      },
                      {
                          "id": "29c69737-a6ff-48c6-b09d-fa6b04739c91",
                          "content": "AFFECTS",
                          "geometry": {
                              "left": 0.4890693426132202,
                              "top": 0.3570856750011444,
                              "width": 0.07995003461837769,
                              "height": 0.006805449724197388
                          }
                      }
                  ]
              },
              {
                  "id": "e0334ded-e13b-4b56-a53e-e8671eae6871",
                  "content": "THE FUTURE IMPROVEMENT OF SOCIETY.",
                  "geometry": {
                      "left": 0.21549031138420105,
                      "top": 0.3995196223258972,
                      "width": 0.5671455562114716,
                      "height": 0.010008007287979126
                  },
                  "tokens": [
                      {
                          "id": "fbf57954-5e06-4a78-8d62-f012d18683d7",
                          "content": "THE",
                          "geometry": {
                              "left": 0.21549031138420105,
                              "top": 0.3995196223258972,
                              "width": 0.05496564507484436,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "43be97d3-1179-4ea9-b07d-aa0e369f770f",
                          "content": "FUTURE",
                          "geometry": {
                              "left": 0.2841973900794983,
                              "top": 0.3995196223258972,
                              "width": 0.1043097972869873,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "57d3e552-8fae-466a-a090-5c3f1245c803",
                          "content": "IMPROVEMENT",
                          "geometry": {
                              "left": 0.4022485911846161,
                              "top": 0.3995196223258972,
                              "width": 0.20112428069114685,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "b2f2323b-6e5b-41ab-80c8-4c4dc3d7d1d5",
                          "content": "OF",
                          "geometry": {
                              "left": 0.6171143054962158,
                              "top": 0.3995196223258972,
                              "width": 0.031230449676513672,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "1284a561-2359-4658-9a23-1957ea9c34e0",
                          "content": "SOCIETY",
                          "geometry": {
                              "left": 0.6627107858657837,
                              "top": 0.3995196223258972,
                              "width": 0.11242973804473877,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "1e91b8d6-668f-4d25-bf30-75c4478ae33e",
                          "content": ".",
                          "geometry": {
                              "left": 0.7763897776603699,
                              "top": 0.3995196223258972,
                              "width": 0.006246089935302734,
                              "height": 0.010008007287979126
                          }
                      }
                  ]
              },
              {
                  "id": "027c0a96-c979-47de-8fca-44d0b07c3ac3",
                  "content": "WITH REMARKS",
                  "geometry": {
                      "left": 0.42410993576049805,
                      "top": 0.44515612721443176,
                      "width": 0.14990627765655518,
                      "height": 0.0064051151275634766
                  },
                  "tokens": [
                      {
                          "id": "60143ecf-ae4f-4635-a1e6-01fca29d286c",
                          "content": "WITH",
                          "geometry": {
                              "left": 0.42410993576049805,
                              "top": 0.44515612721443176,
                              "width": 0.05059337615966797,
                              "height": 0.0064051151275634766
                          }
                      },
                      {
                          "id": "daa24689-add8-4de3-9b58-36d6ea8b871f",
                          "content": "REMARKS",
                          "geometry": {
                              "left": 0.48657089471817017,
                              "top": 0.44515612721443176,
                              "width": 0.08744531869888306,
                              "height": 0.0064051151275634766
                          }
                      }
                  ]
              },
              {
                  "id": "6f9dac4a-4f7e-4409-bbaa-205817a45880",
                  "content": "ON THE SPECULATIONS OF MR. GODWIN,",
                  "geometry": {
                      "left": 0.2804497182369232,
                      "top": 0.47718173265457153,
                      "width": 0.43847593665122986,
                      "height": 0.011208981275558472
                  },
                  "tokens": [
                      {
                          "id": "b4828d24-46a6-4530-ae0f-dc6290f5d34d",
                          "content": "ON",
                          "geometry": {
                              "left": 0.2804497182369232,
                              "top": 0.47758206725120544,
                              "width": 0.027482837438583374,
                              "height": 0.009207367897033691
                          }
                      },
                      {
                          "id": "472a19ce-d3b9-4fd2-8d2f-645a1b1ce6da",
                          "content": "THE",
                          "geometry": {
                              "left": 0.31980013847351074,
                              "top": 0.47758206725120544,
                              "width": 0.042473435401916504,
                              "height": 0.009207367897033691
                          }
                      },
                      {
                          "id": "ce03974a-0d54-43b0-8ded-78f67732e070",
                          "content": "SPECULATIONS",
                          "geometry": {
                              "left": 0.37289193272590637,
                              "top": 0.47758206725120544,
                              "width": 0.15427860617637634,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "0d9584f4-c1cd-4acf-b8b1-2bb2a812f344",
                          "content": "OF",
                          "geometry": {
                              "left": 0.5377888679504395,
                              "top": 0.4783827066421509,
                              "width": 0.02560901641845703,
                              "height": 0.009207367897033691
                          }
                      },
                      {
                          "id": "6d44ba8a-f037-4762-83e7-864d3d91293b",
                          "content": "MR",
                          "geometry": {
                              "left": 0.5740162134170532,
                              "top": 0.4783827066421509,
                              "width": 0.031855106353759766,
                              "height": 0.009207367897033691
                          }
                      },
                      {
                          "id": "6aa31fc2-ca8f-49b8-b640-75d1ba9c14b5",
                          "content": ".",
                          "geometry": {
                              "left": 0.6071205735206604,
                              "top": 0.4787830412387848,
                              "width": 0.004372239112854004,
                              "height": 0.009207338094711304
                          }
                      },
                      {
                          "id": "90d91952-9233-47f2-aefd-ec4a446a51f0",
                          "content": "GODWIN",
                          "geometry": {
                              "left": 0.6233603954315186,
                              "top": 0.4783827066421509,
                              "width": 0.0855715274810791,
                              "height": 0.010008007287979126
                          }
                      },
                      {
                          "id": "a5719dea-8694-4ed6-97ad-eea86a92c0ea",
                          "content": ",",
                          "geometry": {
                              "left": 0.712054967880249,
                              "top": 0.4791833460330963,
                              "width": 0.006870687007904053,
                              "height": 0.009207367897033691
                          }
                      }
                  ]
              },
              {
                  "id": "8b5a88c6-c39e-4f8d-88f2-30eaa7c3f935",
                  "content": "M. CONDORCET,",
                  "geometry": {
                      "left": 0.41474080085754395,
                      "top": 0.5148118734359741,
                      "width": 0.1698938012123108,
                      "height": 0.01080864667892456
                  },
                  "tokens": [
                      {
                          "id": "63b99375-0adc-461b-9539-0d907eebac3b",
                          "content": "M.",
                          "geometry": {
                              "left": 0.41474080085754395,
                              "top": 0.5152121782302856,
                              "width": 0.02435976266860962,
                              "height": 0.009207367897033691
                          }
                      },
                      {
                          "id": "2a2f1ef8-6018-41f7-9b06-9f957313251b",
                          "content": "CONDORCET",
                          "geometry": {
                              "left": 0.4509681463241577,
                              "top": 0.5152121782302856,
                              "width": 0.12679576873779297,
                              "height": 0.010408341884613037
                          }
                      },
                      {
                          "id": "0623b165-4bb6-4ac7-b115-5989dab0d188",
                          "content": ",",
                          "geometry": {
                              "left": 0.5771392583847046,
                              "top": 0.516413152217865,
                              "width": 0.0074953436851501465,
                              "height": 0.008807003498077393
                          }
                      }
                  ]
              },
              {
                  "id": "4f5337fe-082a-4937-a08f-8c3304b84c7a",
                  "content": "AND OTHER WRITERS.",
                  "geometry": {
                      "left": 0.3791380524635315,
                      "top": 0.5536429286003113,
                      "width": 0.23860085010528564,
                      "height": 0.0072057247161865234
                  },
                  "tokens": [
                      {
                          "id": "ba603d0c-b146-4d01-9e1f-ad47ce600c29",
                          "content": "AND",
                          "geometry": {
                              "left": 0.3791380524635315,
                              "top": 0.5536429286003113,
                              "width": 0.04434725642204285,
                              "height": 0.0072057247161865234
                          }
                      },
                      {
                          "id": "3bdbfa39-0418-4758-a855-1104a8d8fb02",
                          "content": "OTHER",
                          "geometry": {
                              "left": 0.4347282946109772,
                              "top": 0.5536429286003113,
                              "width": 0.07245472073554993,
                              "height": 0.0072057247161865234
                          }
                      },
                      {
                          "id": "2a729df2-5199-45dd-bdf6-4787b95efe65",
                          "content": "WRITERS",
                          "geometry": {
                              "left": 0.5165521502494812,
                              "top": 0.5536429286003113,
                              "width": 0.09369146823883057,
                              "height": 0.0072057247161865234
                          }
                      },
                      {
                          "id": "40924cb4-fc4f-45a2-b4f7-1c31368f243b",
                          "content": ".",
                          "geometry": {
                              "left": 0.612742006778717,
                              "top": 0.5536429286003113,
                              "width": 0.004996895790100098,
                              "height": 0.0072057247161865234
                          }
                      }
                  ]
              },
              {
                  "id": "75a050e5-46d5-4773-9909-0bc2309892da",
                  "content": "LONDON:",
                  "geometry": {
                      "left": 0.45284196734428406,
                      "top": 0.6016813516616821,
                      "width": 0.09306684136390686,
                      "height": 0.00840669870376587
                  },
                  "tokens": [
                      {
                          "id": "cea98dbb-45d8-4562-8dee-a965cee86db7",
                          "content": "LONDON",
                          "geometry": {
                              "left": 0.45284196734428406,
                              "top": 0.6016813516616821,
                              "width": 0.08432230353355408,
                              "height": 0.00840669870376587
                          }
                      },
                      {
                          "id": "92b73d06-bae1-4a32-9970-e6cbc99ee74f",
                          "content": ":",
                          "geometry": {
                              "left": 0.5402873158454895,
                              "top": 0.6024819612503052,
                              "width": 0.005621492862701416,
                              "height": 0.007606089115142822
                          }
                      }
                  ]
              },
              {
                  "id": "c31fd2a2-f11e-4b8e-8b84-8592bd65b235",
                  "content": "PRINTED FOR J. JOHNSON, IN ST. PAUL'S",
                  "geometry": {
                      "left": 0.30605870485305786,
                      "top": 0.6381104588508606,
                      "width": 0.38538414239883423,
                      "height": 0.008406758308410645
                  },
                  "tokens": [
                      {
                          "id": "920b4d6d-150a-4e53-b380-315190fc3c96",
                          "content": "PRINTED",
                          "geometry": {
                              "left": 0.30605870485305786,
                              "top": 0.6381104588508606,
                              "width": 0.08119925856590271,
                              "height": 0.00960773229598999
                          }
                      },
                      {
                          "id": "356a78e0-ebd7-44a0-8f04-015d3fe3527b",
                          "content": "FOR",
                          "geometry": {
                              "left": 0.3978763222694397,
                              "top": 0.6377101540565491,
                              "width": 0.03560274839401245,
                              "height": 0.010008037090301514
                          }
                      },
                      {
                          "id": "b2ddf115-27c1-4078-b96c-a80464d54c2a",
                          "content": "J.",
                          "geometry": {
                              "left": 0.4409743845462799,
                              "top": 0.6377101540565491,
                              "width": 0.012492209672927856,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "d25272ac-5ae1-4510-9778-1f8c72613960",
                          "content": "JOHNSON",
                          "geometry": {
                              "left": 0.46283572912216187,
                              "top": 0.6377101540565491,
                              "width": 0.08432227373123169,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "79866cbf-139e-4b14-805b-6e8f8ea1c8cb",
                          "content": ",",
                          "geometry": {
                              "left": 0.5496564507484436,
                              "top": 0.6373098492622375,
                              "width": 0.004372298717498779,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "dcddbbd2-b954-48ff-91b2-a323310724a2",
                          "content": "IN",
                          "geometry": {
                              "left": 0.5652716755867004,
                              "top": 0.6373098492622375,
                              "width": 0.01873832941055298,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "46dbb8a6-dc00-4a7b-8703-aec10245d262",
                          "content": "ST",
                          "geometry": {
                              "left": 0.5946283340454102,
                              "top": 0.6373098492622375,
                              "width": 0.019362926483154297,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "e6ef7add-c8c1-4f18-8bfc-55cc3a810449",
                          "content": ".",
                          "geometry": {
                              "left": 0.6164897084236145,
                              "top": 0.6373098492622375,
                              "width": 0.004372239112854004,
                              "height": 0.009607672691345215
                          }
                      },
                      {
                          "id": "4a743990-3683-4017-9c62-4e6cae7513a2",
                          "content": "PAUL'S",
                          "geometry": {
                              "left": 0.6308557391166687,
                              "top": 0.636909544467926,
                              "width": 0.06058710813522339,
                              "height": 0.009607672691345215
                          }
                      }
                  ]
              },
              {
                  "id": "47f38d95-c06c-47e3-8c4e-e7db108ad5c8",
                  "content": "CHURCH-YARD",
                  "geometry": {
                      "left": 0.4303560256958008,
                      "top": 0.6717373728752136,
                      "width": 0.13616490364074707,
                      "height": 0.006004810333251953
                  },
                  "tokens": [
                      {
                          "id": "9805d04e-d122-48cc-8ed3-ea49ec5fd231",
                          "content": "CHURCH",
                          "geometry": {
                              "left": 0.4303560256958008,
                              "top": 0.6717373728752136,
                              "width": 0.076202392578125,
                              "height": 0.006004810333251953
                          }
                      },
                      {
                          "id": "6949f56c-30db-484c-99ad-8c3f15d4c7f2",
                          "content": "-",
                          "geometry": {
                              "left": 0.5103060603141785,
                              "top": 0.6717373728752136,
                              "width": 0.008119940757751465,
                              "height": 0.006004810333251953
                          }
                      },
                      {
                          "id": "b17eee41-1697-4065-ae24-757f35a79178",
                          "content": "YARD",
                          "geometry": {
                              "left": 0.5190505981445312,
                              "top": 0.6717373728752136,
                              "width": 0.0474703311920166,
                              "height": 0.006004810333251953
                          }
                      }
                  ]
              },
              {
                  "id": "87f29527-2a25-4ec9-b946-e2b3005758c2",
                  "content": "1798.",
                  "geometry": {
                      "left": 0.46221113204956055,
                      "top": 0.7233787178993225,
                      "width": 0.07432854175567627,
                      "height": 0.014011204242706299
                  },
                  "tokens": [
                      {
                          "id": "e6ee28b1-65a5-4926-938d-6548045bb9b2",
                          "content": "1798",
                          "geometry": {
                              "left": 0.46221113204956055,
                              "top": 0.723779022693634,
                              "width": 0.06433475017547607,
                              "height": 0.013610899448394775
                          }
                      },
                      {
                          "id": "e575c04a-33d5-4332-bfb0-9f208632108b",
                          "content": ".",
                          "geometry": {
                              "left": 0.5302935838699341,
                              "top": 0.723779022693634,
                              "width": 0.006246089935302734,
                              "height": 0.013210594654083252
                          }
                      }
                  ]
              }
          ]
      }
  ]
  ```
</CodeGroup>
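
Before uploading a custom text layer, it can help to sanity-check its geometry. The sketch below is not part of the Labelbox SDK, and the schema above does not state the rule it checks. It assumes that each token's bounding box should fall inside its parent group's box, as it does in the sample. The helper name `tokens_outside_group` and the tolerance value are hypothetical.

```python
from typing import Dict, List

# Assumed slack for floating-point rounding in OCR coordinates.
TOLERANCE = 1e-6


def tokens_outside_group(group: Dict, tol: float = TOLERANCE) -> List[str]:
    """Return the IDs of tokens whose box extends past the group's box."""
    g = group["geometry"]
    g_right = g["left"] + g["width"]
    g_bottom = g["top"] + g["height"]

    outside = []
    for token in group.get("tokens", []):
        t = token["geometry"]
        inside = (
            t["left"] >= g["left"] - tol
            and t["top"] >= g["top"] - tol
            and t["left"] + t["width"] <= g_right + tol
            and t["top"] + t["height"] <= g_bottom + tol
        )
        if not inside:
            outside.append(token["id"])
    return outside


# The "1798." group copied from the sample text layer above.
group = {
    "id": "87f29527-2a25-4ec9-b946-e2b3005758c2",
    "content": "1798.",
    "geometry": {
        "left": 0.46221113204956055,
        "top": 0.7233787178993225,
        "width": 0.07432854175567627,
        "height": 0.014011204242706299,
    },
    "tokens": [
        {
            "id": "e6ee28b1-65a5-4926-938d-6548045bb9b2",
            "content": "1798",
            "geometry": {
                "left": 0.46221113204956055,
                "top": 0.723779022693634,
                "width": 0.06433475017547607,
                "height": 0.013610899448394775,
            },
        },
        {
            "id": "e575c04a-33d5-4332-bfb0-9f208632108b",
            "content": ".",
            "geometry": {
                "left": 0.5302935838699341,
                "top": 0.723779022693634,
                "width": 0.006246089935302734,
                "height": 0.013210594654083252,
            },
        },
    ],
}

print(tokens_outside_group(group))
```

A non-empty result only flags tokens worth inspecting. It does not mean Labelbox would reject the file.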

## Parameters

| Parameter                    | Required | Description                                                                                                                                                                                                                                    |
| ---------------------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `row_data`                   | Yes      | A dictionary of the form `{ "pdf_url": str, "text_layer_url": str }`. For IAM Delegated Access, the URLs must be in [virtual-hosted-style format](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access). |
| `row_data['pdf_url']`        | Yes      | `https` path to a cloud-hosted PDF. It must be specified within the `row_data` dictionary.                                                                                                                                                     |
| `row_data['text_layer_url']` | No       | `https` path to a cloud-hosted JSON extract of the PDF.                                                                                                                                                                                        |
| `global_key`                 | No       | Unique user-generated file name or ID for the file. [Global keys](/reference/data-row-global-keys) must be unique within your organization. A data row is not imported if its global key duplicates that of an existing data row.              |
| `media_type`                 | No       | `"PDF"` (optional media type to provide better validation and error messaging)                                                                                                                                                                 |
| `metadata_fields`            | No       | See [Metadata](/docs/datarow-metadata).                                                                                                                                                                                                        |
| `attachments`                | No       | See [Attachments](/docs/label-data) and [Asset overlays](/docs/label-data).                                                                                                                                                                    |
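
The parameters above can be assembled into a data row dictionary with a small helper. The following is a minimal sketch, not part of the Labelbox SDK. The function name `build_document_row` and the `example.com` URLs are hypothetical. The only rules it enforces are ones stated in the table: `pdf_url` is required, URLs must use `https`, and `text_layer_url` and `global_key` are optional.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def build_document_row(
    pdf_url: str,
    text_layer_url: Optional[str] = None,
    global_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one document data row, rejecting non-https URLs."""
    for url in (pdf_url, text_layer_url):
        if url is not None and urlparse(url).scheme != "https":
            raise ValueError(f"URL must use https: {url}")

    # pdf_url is the only required key inside row_data.
    row_data = {"pdf_url": pdf_url}
    if text_layer_url is not None:
        row_data["text_layer_url"] = text_layer_url

    row: Dict[str, Any] = {"row_data": row_data, "media_type": "PDF"}
    if global_key is not None:
        row["global_key"] = global_key
    return row


# Placeholder URL for illustration only.
row = build_document_row(
    "https://example.com/docs/sample.pdf",
    global_key="sample-pdf-001",
)
print(row)
```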

## Import format

<CodeGroup>
  ```json Delegated Access URL theme={null}
  [
    {
      "row_data": {
        "pdf_url": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0801.3483.pdf",
        // You don't need to provide a text_layer_url. Labelbox automatically generates a text layer when importing an asset without one.
        "text_layer_url": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0801.3483-lb-textlayer.json"
      },
      "global_key": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0801.3483.pdf",
      "media_type": "PDF",
      "metadata_fields": [{"schema_id": "cko8s9r5v0001h2dk9elqdidh", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://www.wikipedia.org/" }]
    },
    {
      "row_data": {
        "pdf_url": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0803.1972.pdf",
        // You don't need to provide a text_layer_url. Labelbox automatically generates a text layer when importing an asset without one.
        "text_layer_url": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0803.1972-lb-textlayer.json"
      },
      "global_key": "https://lb-test-data.s3.us-west-1.amazonaws.com/document-samples/0803.1972.pdf",
      "media_type": "PDF",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "TEXT_URL", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/text_attachment.txt"}]
    }
  ]
  ```

  ```json Standard URL theme={null}
  [
    {
      "row_data": {
        "pdf_url": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483.pdf",
        // You don't need to provide a text_layer_url. Labelbox automatically generates a text layer when importing an asset without one.
        "text_layer_url": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483-lb-textlayer.json"
      },
      "global_key": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483.pdf",
      "media_type": "PDF",
      "metadata_fields": [{"schema_id": "cko8s9r5v0001h2dk9elqdidh", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://www.wikipedia.org/" }]
    }
  ]
  ```
</CodeGroup>
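
Because data rows with duplicate global keys are not imported, it may be worth checking a batch for repeated keys before submitting it. The sketch below is a hypothetical pre-flight check, not an SDK feature. The helper name `duplicate_global_keys` and the sample keys are illustrative. It only detects duplicates within the batch itself, not collisions with data rows that already exist in your organization.

```python
from collections import Counter
from typing import Dict, List


def duplicate_global_keys(rows: List[Dict]) -> List[str]:
    """Return global keys that appear more than once in the batch."""
    counts = Counter(row["global_key"] for row in rows if "global_key" in row)
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative batch: the first and third rows share a global key.
rows = [
    {"row_data": {"pdf_url": "https://example.com/a.pdf"}, "global_key": "doc-a"},
    {"row_data": {"pdf_url": "https://example.com/b.pdf"}, "global_key": "doc-b"},
    {"row_data": {"pdf_url": "https://example.com/a-copy.pdf"}, "global_key": "doc-a"},
    {"row_data": {"pdf_url": "https://example.com/c.pdf"}},  # no global key
]

print(duplicate_global_keys(rows))
```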

## Python example

<CodeGroup>
  ```python import example with custom text layer theme={null}
  from labelbox import Client

  client = Client(api_key="<YOUR_API_KEY>")

  dataset = client.create_dataset(name="Bulk import example - Documents")

  assets = [
    {
      "row_data": {
        "pdf_url": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483.pdf",
        # You don't need to provide a text_layer_url. Labelbox automatically generates a text layer when importing an asset without one.
        "text_layer_url": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483-lb-textlayer.json"
      },
      "global_key": "https://storage.googleapis.com/labelbox-datasets/arxiv-pdf/data/99-word-token-pdfs/0801.3483.pdf",
      "media_type": "PDF",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://www.wikipedia.org/" }]
    }
  ]

  task = dataset.create_data_rows(assets)
  task.wait_till_done()
  print(task.errors)
  ```

  ```python local files theme={null}
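  from labelbox import Client

  client = Client(api_key="<YOUR_API_KEY>")

  # Paths to the local PDF files to upload (limit: 15k files)
  local_file_paths = ['path/to/local/file1', 'path/to/local/file2']

  new_dataset = client.create_dataset(name="Local files upload")

  try:
      # Upload the files and wait for the import task to finish
      task = new_dataset.create_data_rows(local_file_paths)
      task.wait_till_done()
  except Exception as err:
      print(f'Error while creating Labelbox dataset: {err}')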
  ```
</CodeGroup>

### Verify files are processed

<Warning>
  ### File processing can take up to 20 minutes

  Because PDFs and OCR'ed files can be very large, processing an uploaded file can take up to 20 minutes.
</Warning>

To verify that file conversion with a custom or Labelbox-generated text layer is complete, check the Media Attributes section of the data row.

* If `Is text layer valid = true`, the file was successfully processed.

<Frame>
  <img src="https://mintcdn.com/labelbox-1db23ff4/kk7T7t8LC7-TvqwP/images/reference/90d4f73-Screen_Shot_2024-03-14_at_11.49.31_AM.png?fit=max&auto=format&n=kk7T7t8LC7-TvqwP&q=85&s=5fc1270dd7e881633e6c7142bcd8c12a" alt="" width="704" height="468" data-path="images/reference/90d4f73-Screen_Shot_2024-03-14_at_11.49.31_AM.png" />
</Frame>
