# Import HTML data

> How to import HTML data and sample import formats.

<Info>
  ### Supported file formats and import methods

  Format: HTML

  Import methods:

  * IAM Delegated Access
  * Signed URLs (`https` URLs only)
</Info>

## Parameters

| Parameter         | Required | Description                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| ----------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `row_data`        | Yes      | `https` path to an HTML file. For IAM Delegated Access, this URL must be in [virtual-hosted-style format](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access), that is, `https://<bucket-name>.s3.<region>.amazonaws.com/<key>`. Buckets in older regions may expose object URLs in a different format. If your object URLs are not in the virtual-hosted-style format, convert them before importing. |
| `global_key`      | No       | Unique user-generated file name or ID for the file. [Global keys](/reference/data-row-global-keys) must be unique within your organization. A data row is not imported if its global key duplicates the global key of an existing data row. |
| `media_type`      | No       | `"HTML"`. Optional. Specifying the media type provides better validation and error messaging. |
| `metadata_fields` | No       | See [Metadata](/docs/datarow-metadata).                                                                                                                                                                                                                                                                                                                                                                                                              |
| `attachments`     | No       | See [Attachments](/docs/label-data) and [Asset overlays](/docs/label-data). |

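The `row_data` requirement above can be checked before import. The following is a minimal sketch, not part of the Labelbox SDK: it assumes your existing URLs are in the S3 path-style form `https://s3.<region>.amazonaws.com/<bucket-name>/<key>`, and the helper name `to_virtual_hosted_style` is hypothetical. Query strings are not preserved, so this sketch is not meant for signed URLs.

```python
from urllib.parse import urlparse


def to_virtual_hosted_style(url: str) -> str:
    """Convert a path-style S3 URL to virtual-hosted style.

    Hypothetical helper for illustration. URLs whose host is not a
    path-style S3 endpoint are returned unchanged.
    """
    parsed = urlparse(url)
    host = parsed.netloc
    # Path-style hosts look like "s3.<region>.amazonaws.com"; the bucket
    # name is the first segment of the path.
    if not (host.startswith("s3.") and host.endswith(".amazonaws.com")):
        return url
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    if not bucket or not key:
        raise ValueError(f"Cannot find bucket and key in: {url}")
    # Move the bucket name from the path into the host name.
    return f"https://{bucket}.{host}/{key}"


if __name__ == "__main__":
    print(to_virtual_hosted_style(
        "https://s3.us-west-1.amazonaws.com/lb-test-data/sample_html_1.html"
    ))
```

URLs that are already in virtual-hosted style pass through untouched, so the helper can be applied to every `row_data` value in a payload.
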
## Import format

<CodeGroup>
  ```json Delegated Access URL theme={null}
  [
    {
      "row_data": "https://lb-test-data.s3.us-west-1.amazonaws.com/sample_html_1.html",
      "global_key": "https://lb-test-data.s3.us-west-1.amazonaws.com/sample_html_1.html",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/windy.html" }]
    },
    {
      "row_data": "https://lb-test-data.s3.us-west-1.amazonaws.com/sample_html_2.html",
      "global_key": "https://lb-test-data.s3.us-west-1.amazonaws.com/sample_html_2.html",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "TEXT_URL", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/text_attachment.txt"}]
    }
  ]
  ```

  ```json Standard URL theme={null}
  [
    {
      "row_data": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_1.html",
      "global_key": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_1.html",
      "metadata_fields": [{"schema_id": "cko8s9r5v0001h2dk9elqdidh", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/windy.html" }]
    },
    {
      "row_data": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_2.html",
      "global_key": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_2.html",
      "metadata_fields": [{"schema_id": "cko8s9r5v0001h2dk9elqdidh", "value": "tag_string"}],
      "attachments": [{"type": "TEXT_URL", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/text_attachment.txt"}]
    }
  ]
  ```

  ```html HTML example theme={null}
  <html>
    <head>
      <title>HTML File Example</title>
    </head>
  <body bgcolor="ffffff">

  <center><img src="https://labelbox.com/static/images/logo-v4.svg" align="bottom">

  <hr>

  <h1>Get to production AI faster</h1>

  <p>Save time by creating and managing your training data, people, and processes in a single place — so you can focus on building the next big thing.</p>

  <p><a href="https://labelbox.com/sales">Get a demo</a> or <a href="https://app.labelbox.com/">start for free.</a></p>
  </center>
  <hr>

  </body>

  </html>
  ```
</CodeGroup>
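
A payload in the format above can be sanity-checked locally before it is sent. This is a rough sketch under stated assumptions: `validate_rows` is a hypothetical helper, not a Labelbox API, and it only checks what this page documents (an `https` value for `row_data`, no repeated `global_key` within the payload, and `"HTML"` as the only `media_type`). It cannot detect global keys that already exist in your organization.

```python
def validate_rows(rows):
    """Return a list of human-readable problems found in an import payload.

    Hypothetical helper for illustration; an empty list means no problems
    were found by these local checks.
    """
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # row_data is required and must be an https URL.
        row_data = row.get("row_data")
        if not isinstance(row_data, str) or not row_data.startswith("https://"):
            problems.append(f"row {i}: row_data must be an https URL")

        # global_key is optional but must not repeat within the payload.
        key = row.get("global_key")
        if key is not None:
            if key in seen_keys:
                problems.append(f"row {i}: duplicate global_key {key!r}")
            seen_keys.add(key)

        # media_type is optional; for this page the only valid value is "HTML".
        media_type = row.get("media_type")
        if media_type is not None and media_type != "HTML":
            problems.append(f"row {i}: media_type should be 'HTML'")
    return problems


if __name__ == "__main__":
    sample = [
        {"row_data": "https://example.com/a.html", "global_key": "a"},
        {"row_data": "http://example.com/b.html", "global_key": "a"},
    ]
    for problem in validate_rows(sample):
        print(problem)
```

Collecting every problem in one pass, rather than raising on the first one, makes it easier to fix a large payload in a single round.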

## Python example

<CodeGroup>
  ```python bulk import example theme={null}
  from labelbox import Client

  client = Client(api_key="<YOUR_API_KEY>")

  dataset = client.create_dataset(name="Bulk import example - HTML")

  assets = [
    {
      "row_data": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_1.html",
      "global_key": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_1.html",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "HTML", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/windy.html" }]
    },
    {
      "row_data": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_2.html",
      "global_key": "https://storage.googleapis.com/labelbox-datasets/html_sample_data/sample_html_2.html",
      "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
      "attachments": [{"type": "TEXT_URL", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/text_attachment.txt"}]
    }
  ]

  task = dataset.create_data_rows(assets)
  task.wait_till_done()
  print(task.errors)
  ```

  ```python local files theme={null}
  from labelbox import Client

  client = Client(api_key="<YOUR_API_KEY>")

  # Paths to local HTML files (limit: 15k files)
  local_file_paths = ["path/to/local/file1", "path/to/local/file2"]

  new_dataset = client.create_dataset(name="Local files upload")

  try:
      task = new_dataset.create_data_rows(local_file_paths)
      task.wait_till_done()
      print(task.errors)
  except Exception as err:
      print(f"Error while creating Labelbox data rows: {err}")
  ```
</CodeGroup>
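
If you have more local files than the 15k limit noted in the example above, one option is to split the path list into batches and upload each batch separately. The sketch below assumes the limit applies per `create_data_rows` call, which this page does not state explicitly; the `batched` helper and the `MAX_FILES_PER_UPLOAD` constant are hypothetical names.

```python
from typing import Iterator, List, Sequence

# Assumption: taken from the "limit: 15k files" comment in the local files example.
MAX_FILES_PER_UPLOAD = 15_000


def batched(paths: Sequence[str], size: int = MAX_FILES_PER_UPLOAD) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


if __name__ == "__main__":
    paths = [f"path/to/local/file{i}" for i in range(1, 8)]
    for batch in batched(paths, size=3):
        # In a real upload, each batch would be passed to
        # dataset.create_data_rows(batch), followed by task.wait_till_done().
        print(len(batch), batch[0])
```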
