HTML

Guide for labeling HTML data.

📘

For HTML, use a project data type of text

When configuring a project for HTML assets, select a data type of text. HTML assets loaded into Catalog are categorized as text with a MIME type of "text/html".
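As a quick local sanity check before uploading, the sketch below uses Python's standard `mimetypes` module to confirm that a file's extension maps to "text/html". The helper name `is_html_file` is hypothetical, and the check only inspects the file extension; it makes no claim about how Catalog itself determines the MIME type.

```python
import mimetypes
from pathlib import Path


def is_html_file(path: str) -> bool:
    """Return True if the file extension maps to the text/html MIME type."""
    # guess_type looks only at the file name, not at the file contents.
    mime_type, _encoding = mimetypes.guess_type(Path(path).name)
    return mime_type == "text/html"
```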

Overview

The HTML editor is a powerful way for you to annotate data that must be visualized in a specific manner for your annotation workforce. When you are annotating an HTML file, Labelbox renders the HTML page in the pane where the editor would normally render.

The HTML editor currently supports only classification-type tasks (radio, checklist, and free text).

Some common use cases for the HTML editor are:

  • Comparing two objects
  • Doing ranking tasks on multiple assets
  • Classifying text that must be formatted in a specific manner
  • Annotating public webpages that have been saved as HTML files

For information on importing HTML files to Labelbox, see our docs on HTML.

Supported annotation types

Below are the annotation types that you may include in your ontology for labeling HTML data. Because the HTML editor currently supports only classification-type tasks, these classifications are applied at the global level.

| Annotation type | Import | Export |
| --- | --- | --- |
| Radio classification | See reference | See reference |
| Checklist classification | See reference | See reference |
| Free-form text classification | See reference | See reference |

Sample Use Cases

Comparison of different products

One common use case among our customers is to do a comparison task between two objects or products. This is especially important for any team building algorithms to rank or compare similar objects.

To visualize the two products and all of their characteristics, we created a custom HTML page that renders all the information the annotation team needs to add the necessary classification annotations.

We created a custom HTML page that can pull in information to compare products. For more details about this, please reach out to [email protected]
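A comparison page of this kind could be generated with a short script. The following is a minimal sketch, not the page used in the example above: the function name `build_comparison_page`, the product dictionary shape (`name`, `attributes`), and the sample products are all assumptions made for illustration. It renders two products side by side and escapes every value so that product text cannot break the markup.

```python
from html import escape


def build_comparison_page(left: dict, right: dict) -> str:
    """Render two products as side-by-side tables in one HTML document."""

    def product_column(product: dict) -> str:
        # One table row per attribute; escape() guards against stray markup.
        rows = "".join(
            f"<tr><th>{escape(str(key))}</th><td>{escape(str(value))}</td></tr>"
            for key, value in product["attributes"].items()
        )
        return (
            '<div style="flex: 1; padding: 1em;">'
            f"<h2>{escape(product['name'])}</h2>"
            f"<table>{rows}</table>"
            "</div>"
        )

    return (
        "<!DOCTYPE html>"
        '<html><head><meta charset="utf-8"><title>Product comparison</title></head>'
        '<body><div style="display: flex;">'
        f"{product_column(left)}{product_column(right)}"
        "</div></body></html>"
    )


# Hypothetical sample data, for illustration only.
page = build_comparison_page(
    {"name": "Kettle A", "attributes": {"Price": "$25", "Capacity": "1.7 L"}},
    {"name": "Kettle B", "attributes": {"Price": "$32", "Capacity": "1.5 L"}},
)
```

The resulting string can be written to a `.html` file and imported like any other HTML asset.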

Annotating public websites

❗️

We do not support direct linking to webpages via their public URL.

As a security best practice, most websites do not allow other webpages or applications to open their pages in iframes. As a result, if you try to link public websites (such as https://google.com or https://yahoo.com), they will NOT render in the HTML panel even though they are HTML pages.
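Sites typically express this restriction through the `X-Frame-Options` response header or the `frame-ancestors` directive of `Content-Security-Policy`. The sketch below is a rough heuristic for inspecting a set of response headers you have already collected; the function name `blocks_third_party_framing` is hypothetical, and the check is deliberately conservative (it treats any `frame-ancestors` list without a `*` wildcard as blocking).

```python
def blocks_third_party_framing(headers: dict) -> bool:
    """Heuristically report whether response headers forbid framing by other sites."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # X-Frame-Options: DENY or SAMEORIGIN both block a third-party embedder.
    xfo = normalized.get("x-frame-options", "").strip().upper()
    if xfo in {"DENY", "SAMEORIGIN"}:
        return True

    # Content-Security-Policy: look for a frame-ancestors directive.
    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0].lower() == "frame-ancestors":
            # Assumption: only a bare "*" source is treated as allowing everyone.
            return "*" not in parts[1:]

    return False
```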

In order to annotate websites, you will need to first download the webpages of interest as HTML files. This can be done programmatically or manually as seen below:

Navigate to a page, then right-click to save it as an HTML page.
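For the programmatic route, a minimal sketch using only the Python standard library might look like the following. The function name `save_page_as_html` and the example URL and file name are assumptions for illustration. Note that this saves only the HTML document itself; images, stylesheets, and scripts referenced by relative URLs are not downloaded, so some pages may render differently than they do in a browser.

```python
from pathlib import Path
from urllib.request import urlopen


def save_page_as_html(url: str, out_path: str, timeout: float = 30.0) -> Path:
    """Download the document at `url` and write its raw bytes to `out_path`."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read()
    destination = Path(out_path)
    destination.write_bytes(body)
    return destination


# Example call (hypothetical URL and file name):
# save_page_as_html("https://example.com/", "example_page.html")
```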

Once you have the website saved as an HTML file, you can either upload it directly to Labelbox or store it in your cloud storage and send Labelbox a URL pointing to the HTML file. After the HTML page has been uploaded to a dataset in Labelbox, you can queue it for labeling and see that the website is rendered in the labeling flow:

The website is loaded into Labelbox as an HTML file.
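The cloud-storage route described above could be scripted roughly as follows. This is a sketch that assumes the Labelbox Python SDK (`pip install labelbox`) and its `Client`, `create_dataset`, and `create_data_rows` calls; the helper names `build_data_row` and `upload_html_rows`, the dataset name, and the URLs are hypothetical. Check the current SDK documentation before relying on it.

```python
def build_data_row(html_url: str, global_key: str) -> dict:
    """Describe one HTML asset as a data row payload (URL plus a unique key)."""
    return {"row_data": html_url, "global_key": global_key}


def upload_html_rows(api_key: str, dataset_name: str, rows: list) -> None:
    """Create a dataset and import the given data rows (assumes the Labelbox SDK)."""
    import labelbox as lb  # third-party dependency, imported lazily

    client = lb.Client(api_key=api_key)
    dataset = client.create_dataset(name=dataset_name)
    task = dataset.create_data_rows(rows)
    task.wait_till_done()
    if task.errors:
        raise RuntimeError(f"Data row import failed: {task.errors}")


# Example (hypothetical values):
# rows = [build_data_row("https://storage.example.com/pages/site.html", "site.html")]
# upload_html_rows("<YOUR_API_KEY>", "saved-webpages", rows)
```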

