Key definitions
Below are definitions of the key terms you will see in the product, the API, and the docs.
Term | Definition |
---|---|
Annotation | An instance of a Feature. Annotations can be imported as ground truth or model predictions, or created in the Labelbox Editor. Annotations are categorized as Objects (e.g., bounding box, polygon) or Classifications (e.g., radio, checklist). |
Attachments | Supplementary information you can attach to an asset to give labelers context during labeling. See Add attachments. |
Asset | A single cloud-hosted file to be labeled (e.g., an image, a video, or a text file). |
Batches/batch mode | A method for sending individual Data Rows to a Project for labeling. This is an alternative to attaching a Dataset to a project. |
Boost | A service that Labelbox offers to help enterprises scale up their AI/ML operations. It includes a variety of professional services and software assistance. |
Catalog | A Labelbox feature that functions as a training data warehouse containing all of the Data Rows within an organization. Customers can use filters to browse, curate, and develop insights across all labeled and unlabeled data in Labelbox. |
Data row | The container that houses the following information for a single Asset: the URL to your cloud-hosted file; metadata; media attributes (e.g., data type, size); and attachments (files that provide context for your labelers). |
Data type | The type of a Data Row, such as image (JPG/PNG), video (MP4), or text (.txt files). |
Dataset | A set of Data Rows that you add to Labelbox for labeling. |
Editor | The labeling interface you can use to create, review, and edit annotations. When you create a project, you will be prompted to configure your Editor (e.g., select an ontology and add labeling instructions). |
Feature | A feature is the master definition of what you want the model to predict. It is also the blueprint for your ground truth. An ontology is made up of a set of features. There are two kinds of features: objects (e.g., Bounding box) and classifications (e.g., Radio). A feature can have multiple deeply nested sub-classifications. |
Ground truth | Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence such as labels made by humans) as opposed to information provided by inference. |
Label | A collection of all annotations on a Data Row. For example, all Bounding boxes, Polylines, and Radio classifications on an image would be considered the "Label". |
Mask | An image representation (in PNG format) of a segmentation annotation, without the underlying asset. |
Media attributes | Upon upload, Labelbox automatically computes media attributes for each Data Row. These include useful information such as mimeType, width, height, and codec. |
Metadata | Metadata is non-annotation information about the asset to be labeled. There are two types of metadata: reserved keys (user cannot change) and custom (user-defined). Metadata is useful for searching and filtering across your Data Rows in Labelbox. |
Model | A Model is a directory where you can create, manage, and compare a set of Model Runs related to the same machine learning task. Each Model is specified by an ontology, which defines the machine learning task of the Model Runs inside the directory. |
Model diagnostics | A Labelbox product area that enables you to run experiments (Model Runs) on your machine learning models to analyze model performance across each Model Run. |
Model run | A Model Run is a model training experiment within a Model directory. Each Model Run has a versioned snapshot of its data (data rows, annotations, and data splits). You can upload predictions to a Model Run and compare its performance against other Model Runs in the Model directory. |
Nested classification | A classification-type annotation that is nested within an object-type annotation (as opposed to a global classification). |
Ontology | A collection of Features and their relationships (also known as a taxonomy). Ontologies can be reused across different projects. An ontology is essential for data labeling, model training, and evaluation. When you are in the Editor, the ontology is what appears in the "Tools" panel. |
Prediction | Output from your ML model that you can add to a Data Row to serve as a template for faster labeling. |
Project | The labeling environment in Labelbox, like a factory assembly line for producing labels. The initial state of the project can start with raw data, pre-existing ground truth, or pre-labeled data. |
Queue | Labelbox has four queues for moving Data Rows through the labeling & QA pipeline: the Batches queue, the Labeling queue, the Review queue, and the Tasks queue. |
Schema | Nearly everything in Labelbox is strongly typed. The schema is the master blueprint for your training data. It contains Ontologies, Features, and Metadata. |
Data split | You can split the selected data rows into train, validation, and test splits to prepare for model training and evaluation. |
Template | If a Data Row needs to be relabeled, you can delete its annotations and choose to reuse the existing annotations as a template the next time the Data Row appears in the Editor. This lets you correct a set of annotations rather than start from scratch. |
Workspace | Enables admins at large organizations to manage multiple instances of Labelbox with the same login. |
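
To make the relationships between several of the terms above more concrete, here is a minimal Python sketch of how a Feature, Ontology, Annotation, Label, Data Row, and Dataset fit together. It is an illustration only and does not use the Labelbox SDK: every class name, field name (for example `row_data` and `sub_classifications`), and the example URL is hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    """Blueprint for something the model should predict (hypothetical model)."""
    name: str
    kind: str  # "object" (e.g., bounding box) or "classification" (e.g., radio)
    # Nested classifications that hang off this feature.
    sub_classifications: List["Feature"] = field(default_factory=list)


@dataclass
class Ontology:
    """A reusable collection of Features."""
    name: str
    features: List[Feature]


@dataclass
class DataRow:
    """Container for one Asset: its URL plus metadata and attachments."""
    row_data: str  # URL of the cloud-hosted file (hypothetical field name)
    metadata: Dict[str, str] = field(default_factory=dict)
    attachments: List[str] = field(default_factory=list)


@dataclass
class Annotation:
    """An instance of a Feature on a specific asset."""
    feature: Feature
    value: dict


@dataclass
class Label:
    """All annotations on a single Data Row."""
    data_row: DataRow
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Dataset:
    """A set of Data Rows added for labeling."""
    name: str
    data_rows: List[DataRow] = field(default_factory=list)


# An ontology with one object feature (with a nested classification)
# and one global classification feature.
car = Feature("car", "object", [Feature("color", "classification")])
weather = Feature("weather", "classification")
ontology = Ontology("driving", [car, weather])

# One Data Row inside a Dataset; the URL is a placeholder.
row = DataRow("https://example.com/image-001.jpg", metadata={"source": "dashcam"})
dataset = Dataset("street-scenes", [row])

# The Label gathers every annotation made on that Data Row.
label = Label(row, [
    Annotation(car, {"bbox": {"top": 10, "left": 20, "height": 50, "width": 80}}),
    Annotation(weather, {"answer": "sunny"}),
])
print(len(label.annotations))
```

In this sketch the ontology is defined once and could be shared across projects, while each Label belongs to exactly one Data Row, which mirrors the definitions in the table.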