Key definitions
A glossary of key terms used throughout the Labelbox platform.
Below are the key definitions that you will see in the product, API, and the docs.
Term | Definition |
---|---|
Annotation | A human-made or computer-generated label on an asset. Annotations can be imported (as ground truth or pre-labels) or created manually in the Labelbox editor. Annotations are categorized as Objects (e.g., bounding box, polygon, etc.) or Classifications (e.g., radio, checklist, etc.). |
Attachments | Supplementary information you can attach to an asset in order to provide contextual information for your labeling team. Attachments appear on a separate side panel. |
Asset | A single cloud-hosted file to be labeled (e.g., an image, a video, or a text file). |
Batches/batch mode | A method for selecting data rows from Catalog and sending them to a labeling project. Sending batches of data rows to a labeling project is an alternative to attaching an entire dataset to a project. |
Benchmarks | The Benchmarks tool enables you to designate a labeled asset as a “gold standard” and automatically compare all other labels on that asset to the benchmark label. |
Boost | Boost is a service that helps enterprise customers scale up their AI/ML operations. Boost includes a variety of professional services and software assistance, including a labeling workforce. |
Catalog | An organization-wide platform for curating and exploring your unstructured data. Catalog enables you to easily browse, curate, and develop insights across all labeled and unlabeled data rows in your organization. |
Consensus | The Consensus tool allows you to automatically compare labelers against each other by comparing their annotations on a given asset. Consensus works in real time, so you can take immediate corrective action to improve team and model performance. |
Data row | The container that houses all of the following information for a single Asset: the URL to your cloud-hosted file, metadata, media attributes (e.g., data type, size, etc.), and attachments (files that provide context for your labelers). |
Data split | You can split the selected data rows into train, validation, and test splits to prepare for model training and evaluation. |
Data type | The type of a Data Row, such as image (JPG/PNG), video (MP4), or text (.txt files). |
Dataset | A set of Data Rows that you add to Labelbox for labeling. |
Editor | The labeling interface you can use to create, review, and edit annotations. When you create a project, you will be prompted to configure your Editor (i.e., select an ontology, add labeling instructions, etc.). |
Feature | A feature is the master definition of what you want the model to predict. It is also the blueprint for your ground truth. An ontology is made up of a set of features. There are two kinds of features: objects (e.g., Bounding box) and classifications (e.g., Radio). A feature can have multiple deeply nested sub-classifications. |
Ground truth | Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence such as labels made by humans) as opposed to information provided by inference. |
Label | A collection of all annotations on a Data Row. For example, all Bounding boxes, Polylines, and Radio classifications on an image would be considered the "Label". |
Media attributes | Upon upload, Labelbox automatically computes media attributes for each Data Row. These include useful information such as mimeType, width, height, codec, etc. |
Metadata | Metadata is non-annotation information about the asset to be labeled. There are two types of metadata: reserved keys (user cannot change) and custom (user-defined). Metadata is useful for searching and filtering across your Data Rows in Labelbox. |
Model | A Model is a directory where you can create, manage, and compare a set of Model Runs related to the same machine learning task. Each Model is specified by an ontology, which defines the machine learning task of the Model Runs inside the directory. |
Model run | A Model Run is a model training experiment within a Model directory. Each Model Run has its data snapshot (data rows, annotations, and data splits) versioned. You can upload predictions to a Model Run, and compare its performance against other Model Runs in the Model directory. |
Nested classification | A classification-type annotation that is nested within an object-type annotation (as opposed to a global classification). |
Ontology | A collection of Features and their relationships (also known as a taxonomy). Ontologies can be reused across different projects. An ontology is essential for data labeling, model training, and evaluation. When you are in the Editor, the ontology is what appears in the "Tools" panel. |
Prediction | Output from your ML model that you can add to a Data Row to serve as a template for faster labeling. |
Project | The labeling environment in Labelbox, like a factory assembly line for producing labels. The initial state of the project can start with raw data, pre-existing ground truth, or pre-labeled data. |
Queue | Labelbox has three queues for moving data rows through the labeling and QA pipeline: the Batches queue, the Labeling queue, and the review tasks queue. |
Schema | Nearly everything in Labelbox is strongly typed. The schema is the master blueprint for your training data. It contains Ontologies, Features, and Metadata. |
Template | If you have a Data Row that needs to be relabeled, you can delete the annotations and choose to keep the existing annotations as a template the next time the Data Row appears in the Editor. This allows you to correct a set of annotations rather than start from scratch. |
Workflow | A workflow refers to the queue system for labeling and reviewing assets within a project. Workflows give you more granular control over how your data rows get reviewed by providing a highly customizable, step-by-step review pipeline to drive efficiency and automation into your review process. |
Workspace | Enables admins at large organizations to manage multiple instances of Labelbox with the same login. |
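
To make the relationships between several of these terms concrete, here is a minimal Python sketch of how an Ontology, its Features, a Data Row, and a Label (a collection of Annotations) fit together. This is an illustration only, not the Labelbox SDK or its schema: every class name, field name, and the `unknown_features` helper are hypothetical, and the URL is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    """One ontology entry: an object (e.g. bounding box) or a classification (e.g. radio)."""
    name: str
    kind: str  # "object" or "classification" (hypothetical encoding)
    sub_classifications: List["Feature"] = field(default_factory=list)


@dataclass
class Ontology:
    """A reusable collection of features."""
    name: str
    features: List[Feature]


@dataclass
class Annotation:
    """A single annotation on an asset, tied to one feature by name."""
    feature_name: str
    value: object
    nested: List["Annotation"] = field(default_factory=list)  # nested classifications


@dataclass
class DataRow:
    """Container for one asset: file URL, metadata, media attributes, attachments."""
    url: str
    metadata: Dict[str, str] = field(default_factory=dict)
    media_attributes: Dict[str, object] = field(default_factory=dict)
    attachments: List[str] = field(default_factory=list)


@dataclass
class Label:
    """All annotations on a single data row."""
    data_row: DataRow
    annotations: List[Annotation]


def unknown_features(label: Label, ontology: Ontology) -> List[str]:
    """Return top-level annotation feature names that the ontology does not define."""
    known = {f.name for f in ontology.features}
    return [a.feature_name for a in label.annotations if a.feature_name not in known]


# Example: an image with one bounding box (carrying a nested classification)
# and one global classification.
ontology = Ontology("vehicles", [
    Feature("car", "object", [Feature("color", "classification")]),
    Feature("weather", "classification"),
])
row = DataRow("https://example.com/image-001.jpg",
              media_attributes={"mimeType": "image/jpeg"})
label = Label(row, [
    Annotation("car", {"top": 10, "left": 20, "height": 50, "width": 80},
               nested=[Annotation("color", "red")]),
    Annotation("weather", "sunny"),
])
```

In this sketch, `label.annotations` holds two top-level annotations, and the `color` classification lives inside the `car` object rather than at the top level, which is the distinction between a nested and a global classification.
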