# Release Notes

Source: https://docs.labelbox.com/changelog

To subscribe to the Labelbox release notes, please fill out the [subscription form](https://learn.labelbox.com/releasenotessubscription/).

## App

*Added*

* **Custom roles compatible filters:** Previously, there were no guardrails to prevent admins from adding incompatible permissions when creating custom roles. Now, when you create a new custom role, you can select Project, Workspace, or All to quickly filter for the compatible permissions at each of these scope levels.
* **Audio transcription in audio & video editors:** You can now transcribe audio directly within the video and audio editors for both global and temporal text classifications. This feature allows you to record audio and have it automatically converted into text, streamlining your labeling workflow. To learn more, see our [Audio transcription docs](/docs/video-editor#audio-transcription).

*Removed*

* The Public Demo workspace has been sunset and is no longer available for use.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* **AI critic in the Multi-modal chat editor**: You can now use natural language to automatically critique any element in the MMC editor before label submission. This makes it easier to catch issues in real time and standardize QA across projects while reducing manual review overhead.
* **AI-generated rubrics in the Multi-modal chat editor**: You can now automatically generate rubrics from your task instructions and examples inside the Multi-modal editor, enabling you to spin up consistent, detailed rubrics for new projects and iterate on grading criteria without starting from scratch.
* **Improved auto-segmentation in the Video editor (Meta’s SAM3)**: The Video editor now uses Meta’s SAM3 (Segment Anything Model) for auto-segmentation, producing higher-quality masks and faster object selection on video frames.
* **Assign data rows to individuals in Consensus mode**: You can now assign specific data rows to individual contributors even when using Consensus projects. This enables more control over who labels what (e.g., for training, audits, or specialization) without giving up the quality benefits of consensus.
* **New “Media Duration” Throughput chart**: In the Throughput section of the Performance Dashboard, there is a new chart called “Media Duration” that shows the total media duration recorded over a time series across all labelers.

*Changed*

* **Default roles when creating groups**: When you create a Group, you’ll now be prompted to select a default role for members. This change is designed to reduce the chances of misconfiguring access for group members and makes it easier to configure baseline permissions consistently.
* **Project overview tiles**: In the Project overview under project metrics, the number of hours is now recorded in the “Done” tile.

*Removed*

* The following models in Foundry are no longer supported: Claude Sonnet 3.7, Claude Haiku 3.5.
* The "Annotation usage" chart under Workspace settings > Usage has been removed. This legacy chart was originally built to support older annotation-based billing and is no longer used for billing calculations.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* In the Monitor tab and the project’s Performance > TEQ > Throughput charts, the “Done”, “Labels”, “Annotations”, and “Reviews” tiles now have a chip in the top right corner that displays Average daily throughput when you hover over them.
* In the project’s Performance > TEQ > Throughput section, you can toggle ON the “Forecast” setting to display the forecast for the next time period. See the [Performance Dashboard docs](/docs/performance-dashboard) for more details.
* For Video projects, the following controls now exist in Project settings > Advanced:
  * “Video Timeline V2”: When this is toggled ON, the video editor timeline is updated to the new UI.
  * “Timeline minimum visible duration”: Use this setting to define the maximum zoom level for your timeline (i.e., how many seconds fit between the left and right borders). For example, if this setting is set to 1, the entire span of the timeline view covers 1 second.
  * “Disable automatic split at playhead for classifications/objects”: When this is toggled ON, the segment under the playhead in the timeline view is selected and any change is applied to the entire segment. To make a partial change to a segment at a specific point in the timeline, click “Split” to create a new segment and then apply the change.
* For Audio projects, the following controls now exist in Project settings > Advanced:
  * “Audio Timeline V2”: When this is toggled ON, the audio editor timeline is updated to the new UI.
  * “Timeline minimum visible duration”: Use this setting to define the maximum zoom level for your timeline (i.e., how many seconds fit between the left and right borders). For example, if this setting is set to 1, the entire span of the timeline view covers 1 second.
  * “Disable automatic split at playhead for classifications”: When this is toggled ON, the segment under the playhead in the timeline view is selected and any change is applied to the entire segment. To make a partial change to a segment at a specific point in the timeline, click “Split” to create a new segment and then apply the change.
* The Audio & Video editors received the following improvements:
  * Segments shorter than 300 ms now show a warning icon to help users identify and optionally remove invalid micro-segments, reducing annotation noise.
  * You can now jump to a specific time in the audio timeline by typing the desired timestamp.
  * Start and end handles now appear on timeline segments, making it easier to resize segments by dragging.
  * You can now use the option+arrow shortcut to navigate keyframes.
* With the new AI critic in the Multi-modal chat editor, you can use natural language to target any element in the editor and auto-generate a critique for each element. You can enable this by going to Project settings > Advanced and toggling “Enable AI Critic” ON. To use this feature, Foundry billing must be enabled for your organization.

*Changed*

* In the Audio editor, you can now scroll beyond the end of the audio to access annotation groups that were previously cut off from view.
* In the Audio and Video editors, when you select an issue, the timeline automatically scrolls to the nearest relevant keyframe.

*Future*

* Starting February 9th, the new Audio/Video timeline V2 will be enabled for new projects by default. Projects created before February 9th will continue to default to timeline V1, and you can enable or disable it anytime in the Advanced project settings.
* Per Anthropic’s deprecation schedule, claude-3-7-sonnet will be removed: it was deprecated in December and will be retired on February 19th. We recommend moving to claude-4-5-sonnet.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The new timer details in the editor provide comprehensive time tracking for labeling, review, and rework modes. The timer shows progressive warnings with visual alerts as the labeler approaches the target maximum time, exceeds the target maximum time, and reaches the cutoff threshold.
* If network connection issues occur while labeling, the editor displays a warning modal to notify the user of the issue and another modal when the connection is restored. The timer stops when a network connection issue is detected and resumes when the issue is resolved.
* In the project’s Workflow tab, you can configure the maximum task limit for specific users or the cumulative task limit for specific groups. To configure this setting, go to the Workflow tab, select the Initial labeling task, select Edit workflow, and configure the values in the Per-user/group task limits section.
* You can now route rework tasks back to the original labeler. To enable this, go to the Workflow tab, select the Rework step, and update the Individual assignment field to Label creator.
* You can now configure a maximum number of rework attempts per contributor. This allows you to set a threshold, and once that threshold is reached, the task is routed to a different contributor. To configure this setting, go to your project’s Workflow tab, open the Rework step, make sure the Individual assignment setting is set to Label creator, and set the value for Max rework attempts per contributor.
* Reviewers can now assign a score on a per-label basis, and the Labelbox platform will automatically calculate an average score across all attempts for each contributor. When a labeler’s average score falls below the configured threshold, the system can automatically remove them from the project or queue. To configure this setting, go to your project’s Advanced settings and set a value for Max rework attempts. Make sure the Individual assignment in your Rework step in Workflows is set to Label creator.

*Changed*

* In the Audio editor, when you change the timeline resolution/zoom, the timeline now scrolls to keep the playhead in view instead of jumping back to the start.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The Multi-modal chat editor has a new “Data capture” project configuration. When projects are set to this mode, you can upload videos, images, PDFs, or other files as attachments for general-purpose data collection without a model. You may enable this configuration at project setup.
* In the Advanced project settings, you’ll find a new option called “Authentic input protection”. When enabled, this feature blocks paste actions, detects suspiciously rapid entry, and automatically creates an issue on the data row when either of these suspicious behaviors is detected. This is available for all project types.
* In the Multi-modal chat editor, you can now define minimum and maximum character counts for the prompt. Admins can enable this feature in the Advanced project settings (see “Min prompt message character length” and “Max prompt message character length”).
* In the Multi-modal chat editor, there is an option to display model responses in a vertical message layout. Admins can enable this feature by toggling ON the “Vertical response messages” toggle in the Advanced project settings.
* The following are now available in the Model tab:
  * Gemini 3 Pro Preview
  * GPT-5.1
  * Claude Opus 4.5

*Changed*

* The Multi-modal chat editor has received improvements that significantly boost speed and performance when working with data rows containing many classifications.

*Removed*

* The following have been removed from the Model tab:
  * Llama 3.1

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The Labelbox app now displays charts for average handling time (AHT). To learn more about these new AHT charts, visit our [docs](/docs/performance-dashboard). You will find new charts in the following places:
  * Annotate > Performance > TEQ > Efficiency
  * Monitor
* All editors now display a timer, giving labelers and reviewers visibility into how long they've spent on each task. This is now enabled for all customers.
* In the Audio editor, you can now select a range of frames on the audio timeline and modify all classification values that belong to that portion of the timeline. This multi-selection support encompasses all operations available in the audio editor.
* The Multi-modal chat editor now supports a multiple-conversation interface that allows labelers to work on multiple conversations at once. To enable this, click the "Conversation list" icon in the top navbar in the editor.
* In the Multi-modal chat editor, project admins can use the new "Response count" field to set the number of model response generations for every turn simultaneously. "Response count" is supported whether a single model or multiple models are connected.
* Project admins can now enable the "Thresholding" toggle for Multi-modal chat projects. When enabled, labelers can evaluate each response as passing or failing, and the editor will display pass@k scores automatically after all evaluations are completed.
* In the Multi-modal chat editor, you can now clip your recorded audio segments.
* In the Multi-modal chat editor, admins can configure the editor to display specific classifications for each turn of a conversation. To enable this setting, go to the project setup page, create a new classification, set "Scope" to "Narrow", and under "Narrow scopes" select "Turn".
* The following are now hosted in the Model tab:
  * Gemini 2.5 Flash Image Preview
  * Claude Sonnet 4.5
  * Claude Sonnet 4.5 Vertex
  * GPT-4o Diarize
  * Veo 3.1
  * Veo 3.1 Fast

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The z-order visualization capability is now available in the Catalog view.
* In the audio editor, it is now possible to select a range of the timeline and modify all classification values in bulk.
* In the Model tab, Claude Sonnet 4.5 is now supported.

*Changed*

* In the video editor, you can now move a sub-classification at the global level to the frame level.
* You can now conditionally allow sub-classifications based on more than one parent classification.
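Labelbox doesn't publish the exact formula behind the pass@k scores mentioned above, but the standard unbiased pass@k estimator gives a sense of how such a score can be computed from pass/fail evaluations. The following is a sketch under that assumption; the function name is mine, and the actual in-product calculation may differ:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    responses sampled (without replacement) from n evaluated responses
    passes, given that c of the n responses were marked as passing."""
    if n - c < k:
        return 1.0  # too few failing responses to draw an all-fail sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 4 responses evaluated, 2 marked passing, k = 2
score = pass_at_k(n=4, c=2, k=2)  # 1 - C(2,2)/C(4,2) = 1 - 1/6 ≈ 0.833
```

With all responses failing the score is 0; with at least k more passes than needed it approaches 1.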
## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The Performance Dashboard now contains a bar chart that shows the number of labeling tasks per labeler.
* The SAM2 auto-segmentation model is now integrated into the video editor. A labeler can click a few positive and negative points on the objects they want to track in a single frame and select a frame range. SAM2 automatically adds segmentation masks to the object over the selected frame range, reducing manual work.
* The Model tab now supports the following models:
  * OpenAI GPT-5
  * Claude Opus 4.1
  * Veo 3
  * Google Imagen 4
* The Multi-modal chat editor has been updated with the following:
  * You can now add turn-level instructions as metadata (see `turnInstructions` in the [docs](/docs/datarow-metadata)).
  * The editor now supports video generation, starting with Veo 3.
* You can now edit the behavior of the rework task in workflows to send rejected labels back to the original label creator only.
* If "Skip Data Row" in the advanced project settings is toggled ON, a data row that is skipped five times moves to the next task queue.

*Changed*

* When a data row gets skipped in the editor, the data row stays in the “To label” queue instead of moving to the “Review” queue.
* The default real-time resolution setting in the video editor is now 1080p.
* We made the following improvements to the Multi-modal chat editor:
  * When you upload an image during a turn, the image isn’t saved as an attachment until you submit the turn.
  * When you reset a turn, you have the option to leave the image attachment on the turn or manually remove it.
* The hotkey for "Submit / Save" in the editor has been updated to "E E".

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.
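As a rough sketch of how the `turnInstructions` metadata mentioned above might be attached when importing a data row: the payload below follows the general name/value `metadata_fields` convention from the metadata docs, but the URL, global key, and instruction text are placeholders, and the exact schema should be confirmed against the linked documentation:

```python
# Hypothetical data row import payload with turn-level instructions
# attached via the "turnInstructions" metadata field (shape assumed;
# verify against /docs/datarow-metadata before using).
data_row = {
    "row_data": "https://example.com/conversations/chat-001.json",  # placeholder
    "global_key": "chat-001",  # placeholder
    "metadata_fields": [
        {
            "name": "turnInstructions",
            "value": "Turn 2: ask a follow-up question about pricing.",
        }
    ],
}

# Convenience lookup of a metadata field by name
instructions = next(
    f["value"]
    for f in data_row["metadata_fields"]
    if f["name"] == "turnInstructions"
)
```
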
## App

*Added*

* The following models are now hosted in the Model tab:
  * Deepseek R1
  * OpenAI o3 deep research
  * OpenAI o4-mini deep research
  * Grok-4
  * Kimi K2 Instruct
* Timer details (elapsed time) are now available in the editor. Click the timer icon to view elapsed labeling time on the current data row.
* In the editor, polygons connected by a relationship are now highlighted for better visibility.
* Bounding box objects can now be toggled to force square shapes.

*Changed*

* Our docs have been migrated to a new platform. You should see an improvement in readability and searchability.

*Removed*

* The following models were removed from the Model tab:
  * o1-preview
  * o1-mini
* The Census integration has been sunset and is no longer supported.

*Future*

By the end of 2025, we’ll be expanding our list of whitelisted IPs. We will send out more details about this upcoming change soon.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* Previously, the Multi-modal chat editor only supported XLS, PDF, and image files. Now, all file types can be uploaded. If a model cannot read a particular file type, a warning is displayed to the user.
* The Multi-modal chat editor now supports rubrics that can be created and customized at the per-prompt level. Currently, the scoring mechanisms supported for rubrics are binary responses and numerical scores.
* Instead of having to select “Start reviewing” to approve or reject a data row, you can now approve/reject a data row by selecting it from the Data rows tab in a project. This opens the Task browser, where you can approve/reject the data row. From this view, you can also choose the winning label for Consensus projects.
* In the Workflow tab, when you edit the initial labeling task, you can specify the maximum number of labels for that task. You can use this mechanism to ensure that data rows are evenly distributed among the labelers assigned to the project.
* The following models are now available in the Model tab:
  * Qwen3 235B
  * Perplexity Sonar Pro

*Changed*

* You can now edit model config parameters in a Multi-modal chat project without having to create an entirely new project.
* Previously, if a model failed in the Multi-modal chat editor, the user needed to regenerate all responses. Now, if one model fails and others succeed, the successful model responses are shown, and the user can trigger a regeneration of just the model response that failed.
* The Multi-modal chat editor now displays more descriptive error codes when a model fails to generate a response.
* The Multi-modal chat editor now supports models responding with multiple messages and a variety of asset types, including images and videos.
* The Multi-modal chat editor now supports human rewrites for two or more models.
* The Catalog filter “Label Actions” has a new option to filter for skipped data rows (“Is skipped”).
* When you create relationships in the editor with the relationship tool, the relationship is now validated upon creation.

*Removed*

* The following models have been removed from the Model tab:
  * Gemini 1.5 Flash

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* When you create a project and select the Multi-modal chat editor, you’ll be prompted to select from a dropdown of preconfigured settings for your project. If needed, you can update these settings in the advanced project settings section.
* New models added to Foundry:
  * Amazon Nova Sonic
  * Amazon Nova Premier
  * OpenAI GPT-4o Image Generation
  * Meta Llama 3.3
  * Anthropic Claude 4

*Changed*

* The “start transcribing” button in the audio editor is disabled unless your organization has access to Foundry.
* The issue creation flow in the Multi-modal chat editor has been slightly updated to display a more discreet issues icon (caution icon). Clicking this icon allows you to create, resolve, and reopen issues in the task.

*Removed*

* The following models have been removed from Foundry:
  * Google Gemini 2.0 Flash Experimental
  * xAI Grok Beta

*Future*

* On June 30, 2025, we will sunset the [Census integration](/docs/census-integration) due to a lack of usage. If you have any questions or concerns, please contact [support@labelbox.com](mailto:support@labelbox.com).

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The new form-based UI in the Multi-modal chat editor makes the labeling experience more intuitive, displays instructions at each labeling task, and makes the tools menu more accessible.
* In the Multi-modal chat editor, you’ll see a minimap on the right that indicates the location of the errors to address before the task can be submitted. Selecting the red markers in the minimap brings you to each error in the conversation. To learn more, read this [blog post](https://labelbox.com/blog/ai-evaluation-labelboxs-redesigned-multimodal-chat-editor-04-2025/).
* We added turn-based audio and video support to the Multi-modal chat editor to enable labelers to generate video training data on real-world examples. You can enable this in the project settings.
* We added real-time audio and video support to the Multi-modal chat editor to enable labelers to submit video as a response to the model in real time. You can enable this in the project settings.
* The new “Model selection” task option in the Multi-modal chat editor makes it more intuitive for labelers to select which model response is better, or to indicate that both are bad or both are good. You can configure this task to function as a Likert scale. To configure this task at the global level, go to Schema → Ontologies → Create → Model chat evaluation → Model selection tasks → Global → Add. To configure this task at the turn level, go to Schema → Ontologies → Create → Model chat evaluation → Model selection tasks → Per turn → Add.
* Foundry now supports the following models:
  * Gemini 2.5 Pro
  * Meta Llama 4.0 Maverick
  * Grok 3
  * OpenAI GPT 4.1
  * OpenAI GPT-4o Transcribe
  * OpenAI GPT-4o mini Transcribe
  * OpenAI GPT-4o mini TTS
  * OpenAI o3
  * OpenAI o4-mini

*Changed*

* In the Data Rows tab, you can now expand column widths and reorder columns in the table.
* Within a project, the "Automation" tab has been renamed to the "Import labels" tab. Here you can find your import jobs used for MAL predictions, ground truth, or benchmarks. There is also helpful recommended code that includes information from the ontology assigned to the project.

*Removed*

* On April 30, 2025, we started introducing changes to the Python SDK that break compatibility with Export v2 non-streamable methods in version 3.66 and earlier. To see which methods are impacted by this change, visit our [Deprecations](/docs/deprecations#export-v2-nonstreamable-methods) page.

*Future*

* On June 30, 2025, we will sunset the [Census integration](/docs/census-integration) due to lack of usage. If you have any questions or concerns, please contact [support@labelbox.com](mailto:support@labelbox.com).

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* We have rolled out an improved experience for managing workspace members and groups to all tiers. To learn more about these changes, read [Manage members and permissions](/docs/manage-members-and-permissions).

*Changed*

* When you select the Workflows tab in a project, you’ll notice the UI has been updated. This new Workflow editor gives you more flexibility when creating and arranging the steps in your project workflow. To learn more, see our docs on [Workflows](/docs/workflows).
* If you have MFA enabled for your organization, you will see a checkbox upon login that offers to “Remember this device for 30 days”.
* Integrating AWS S3 via IAM now requires you to create a custom trust policy instead of selecting the Labelbox AWS account. This change gives you more granular control, making the integration more flexible and customizable. See our docs on [AWS S3 integration](/docs/import-aws-s3-data) to learn more.

*Future*

* On April 30, 2025, we will introduce changes to the Python SDK that will break compatibility with Export v2 non-streamable methods in version 3.66 and earlier. To see which methods will be impacted by this change, visit our [Deprecations page](/docs/deprecations#export-v2-nonstreamable-methods).
* On June 30, 2025, we will sunset the [Census integration](/docs/census-integration) due to lack of usage. If you have any questions or concerns, please contact [support@labelbox.com](mailto:support@labelbox.com).

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* In the image editor, you can now record an audio clip as a response, and the editor will automatically transcribe and save the audio recording as a classification. To learn more, read our Image editor guide.
* In the Services tab, we introduced an NLP search option to make it easier to find candidates with certain skills or specializations. To use the NLP search, go to the Services tab, select the “Natural language” filter, and enter one or more keywords to narrow down the search results.
* When you are labeling code in the Multi-modal chat editor, you now have access to a full VSCode IDE web instance and a remote host environment, allowing you to work on entire code repositories, run CLI tools, execute code, use debuggers, write tests, install additional VSCode extensions, and use Clari copilot to write/edit code blocks. To use this feature, create a project with the Multi-modal chat editor, edit the code block, and select the “Edit in coder” button to open the IDE web instance.
* The following models are now supported in the Model tab (see the “Hosted by Labelbox” section):
  * Anthropic Claude 3.7 Sonnet
  * Anthropic Claude 3.7 Sonnet Think
  * Google Gemini 2.0 Flash
  * Google Gemini 2.0 Flash-Lite
  * Google Gemini 2.0 Pro

*Changed*

* The home page was recently improved to highlight our labeling services and to make our core features easier to find.
* We are gradually rolling out an improved experience for managing workspace members and groups. To ensure a smooth transition, these changes will be rolled out tier by tier. To learn more about these changes, read [New member and group management experience](/docs/new-member-and-group-management-experience).

*Removed*

* On February 13th, we disabled the “Reporting” tab (AKA Enterprise Dashboard) for all customers. The Reporting tab has been replaced by the new Monitor tab.
* On February 28th, we disabled the Catalog cluster view and Smart select features for all customers.

*Future*

* On April 24, 2025, we will sunset Export v2 non-streamable methods for all customers using Python SDK versions 3.66 or earlier.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.
## App

*Added*

* The following models are now hosted and available in the Model tab:
  * OpenAI o1 & o3-mini
  * OpenAI Whisper
  * Google Gemini 2.0 Flash Thinking
  * Google Gemini 2.0 Flash Experimental
  * Amazon Nova Micro, Lite, and Pro
* To increase both the speed and accuracy of extracting text from a PDF, labelers can now draw bounding boxes around text in a document, and the PDF editor will automatically extract the text from the bounding box.
* The Audio editor now automatically transcribes speech to text to make labeling audio files easier.

*Changed*

* The Multi-modal chat editor now includes an improved experience for editing and running code. To learn more, see our [Code runner docs](/docs/code-grammar-assistance#code-editor).
* The AI critic in the Multi-modal chat editor now has an improved experience for getting suggestions on coding tasks.
* The Services tab is now easier to navigate and has an improved experience for filtering results.

*Future*

* On February 13, 2025, we will sunset the “Reporting” tab (AKA Enterprise Dashboard) for all customers. The Reporting tab has been replaced by the new Monitor tab. To learn more about this change, refer to [this guide](/docs/migration-guide-reporting-page-to-monitor).

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The Multi-modal chat editor now has a built-in code runner that detects and runs code blocks. This feature allows labelers and reviewers to execute code directly in the editor and see the results. Code execution results are persisted and included in exports. To learn more, see our [Code runner docs](/docs/code-grammar-assistance#code-runner). This feature is currently in beta.
* Polygons, polylines, bounding boxes, and points now have a z-order and are rendered so that those in front occlude those behind. Their z-order can be changed using a new view in the Objects panel. To learn more, see our docs on [Editor settings](/docs/image-editor#setting-object-positions).
* The new “Polygon snapping” feature allows points on lines, polygons, and bounding boxes to align or attach to polygon edges. You can toggle on polygon snapping in the [Editor settings](/docs/image-editor#editor-settings).
* To encourage labeler understanding of project instructions, labelers are now required to read and acknowledge project instructions before working on a project. If project instructions are modified, labelers must re-read the instructions before they can resume labeling in the project. Enforced labeling instructions **only** affect customers who opt into using Alignerr labeling services for their projects.
* In the PDF editor, you can now create bounding boxes on the page and use the relationship tool to connect the bounding boxes to classifications. This feature is available upon request (email [support@labelbox.com](mailto:support@labelbox.com) to request access).
* In the new labeling services marketplace, you can view Alignerr labeler profiles and select labelers for your next project based on their expertise and availability. To browse Alignerr labeler profiles, select “Services” from the left nav bar. To learn more, see our docs on [Alignerr Connect](/docs/alignerr-connect).

*Changed*

* After a series of bug fixes, the custom roles feature now has an improved user experience. The fixes mostly include changes to make permission settings more reliable.
* You can now download the history of changes made to a user group as a CSV file. To do this, go to Workspace settings > Groups > Export > Export history.
* The latest timer mechanism has now been rolled out to all remaining organizations. This new timer mechanism offers more accurate measurements for labeling, reviewing, and reworking time.
*Future* * On February 13, 2025, we will sunset the “Reporting” tab (AKA Enterprise Dashboard) for all customers. The Reporting tab has been replaced by the new Monitor tab. To learn more about this change, refer to [this guide](/docs/migration-guide-reporting-page-to-monitor). ## Python SDK To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides. Read More **Release notes** ## App *Added* * In-editor suggestions are now available in the Multi-modal chat and Prompt & response editors. You can use this AI-powered tool to double-check your grammar and wording to ensure your prompts are written correctly. This feature is enabled on a project-by-project basis. * The new “step-by-step reasoning” task in the Multi-modal chat editor allows labelers to classify each step in the model response as “correct” or “incorrect”. Then, the model regenerates its response starting from the incorrect segment while preserving the previous responses. To learn more about how this works, read [this guide](https://labelbox.com/blog/multi-step-reasoning-teach-llms-to-think-critically/). * The new “fact-checking” task in the Multi-modal chat editor allows labelers to easily rate any step in the response as “accurate”, “inaccurate”, “disputed”, “unsupported”, etc. * The new “prompt-rating” task in the Multi-modal chat editor allows labelers to select from a pre-set list of toggles to signal that the content should not be labeled for a specific reason (e.g., offensive content, contains PII, not understandable, etc). *Removed* * On November 7th 2024, we removed the native support for YOLO models in Foundry. If you would like to set up your own YOLO model for inferencing, refer to our [custom model integration docs](/docs/custom-model-integration#create-model-integrations-for-bounding-box-and-mask-tasks). * On November 7th 2024, we disabled the model fine-tuning feature, meaning the image fine-tuning is no longer available. 
We may re-enable this feature in the future.
* On November 25th, 2024, we sunset the DICOM editor in our platform for all customers. Labeling DICOM data is no longer supported in Labelbox.

*Future*

* On February 13th, 2025, we will sunset the “Reporting” tab (also known as the Enterprise Dashboard) for all customers. The Reporting tab has been replaced by the new Monitor tab. To learn more about this change, refer to [this guide](/docs/migration-guide-reporting-page-to-monitor).

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* In the video editor, classifications can now be arbitrarily nested. Nested classifications can also be displayed in the timeline.
* When you configure classifications in an ontology, you will see an option to toggle the Likert scale on or off. Enabling the Likert scale option automatically populates labeler classification responses with the first response option listed.
* Now, when you enter a response in the multi-modal chat editor, you’ll see a button for “Get suggestions”. When you enable this AI-powered tool, you will get suggestions (code or grammar) to catch any mistakes you may have overlooked in your response. This feature is currently in beta.
* When you are in review mode, you will now see Consensus scores next to the annotation names in the Tools menu.
* You can now enable email notifications to stay up-to-date on changes related to project assignments, batch updates, and issue activity. To learn more, see our docs on [Notifications](/docs/notifications).

*Fixed*

* The Usage page received some UI improvements and additional filters for better usability. Changes include fixes to LBU calculation for custom dates, graphing capabilities, and LBU display on modal dialogs.
* The toggle-off and backspace issues in the video editor have also been fixed.
*Future*

* On November 7th, 2024, Labelbox will sunset native support for YOLO models in Foundry. If you would like to set up your own YOLO model for inference, refer to our [custom model integration](/docs/custom-model-integration#create-model-integrations-for-bounding-box-and-mask-tasks) docs.
* On November 7th, 2024, Labelbox will disable YOLOv8 for the model fine-tuning feature. This means that image fine-tuning will not be usable. We may re-enable image fine-tuning in the future.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* The new Labelbox Microsoft Entra application allows you to install a verified version of the Labelbox Enterprise application into your Azure tenant. To learn how this works, see [these docs](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/aad.labelbox?tab=Overview).
* The Performance tab in Annotate now contains an option to enable outlier detection metrics. These metrics highlight and bucket labelers who are outliers in key areas for quick remediation.
* Customers who want to connect their own models to Labelbox now have a self-serve, product-integrated UI to bring their own models for LLM, classification, text, and NER use cases. Bounding boxes and masks still require manual onboarding for the time being. To learn more, see our docs on [Custom model integration](/docs/custom-model-integration#create-model-integrations).
* The new Monitor page is now available for Enterprise customers. Monitor contains charts, tables, and filters to display project and member performance across workspaces. It also contains new bulk actions and outlier detection metrics. To learn more, see our [Monitor](/docs/monitor) docs.

*Changed*

* Asset proxy (the feature that hides raw data row URLs from being exposed in the platform) is now supported for all project types.
To request access to this feature, contact our support team.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new) section in the developer guides.

## App

*Added*

* You can now subscribe to email notifications for the following: 1) when labelers are assigned to (or unassigned from) a project, 2) when an issue is created, resolved, or commented on, and 3) when a batch of data rows is added to a project or has been labeled. To manage email notifications, go to your user profile in the app UI.

*Changed*

* The Multimodal chat editor has a new and improved UI. Key changes include: 1) when ranking, selecting, and classifying, the turn remains horizontal, and 2) instead of a left side panel, each turn is self-contained with its own tasks and classifications.
* When you create new workflow tasks, you have more task filters to choose from: Issue category, Dataset, Batch, Metadata, Model prediction, Labeling time, Review time, Natural language, and Label feedback.
* When you add task filters to workflow tasks, you can select "Match: Any" to apply OR logic or select "Match: All" to apply AND logic to the filters.
* You will see an updated flow when requesting labeling services via the UI.

*Removed*

* Export v1 has been sunset for all customers.

## Python SDK

To view the latest updates to our Python SDK, see the [What's new](/reference/whats-new#version-3770-2024-08-09) section in the developer guides.

## App

*Added*

* The multimodal chat editor now supports offline multimodal chat evaluation projects. To learn how this works, see our [Multi-modal chat evaluation docs](/docs/multimodal-chat-evaluation-editor#set-up-offline-multimodal-chat-evaluation-projects).
* The Multimodal chat evaluation editor now supports LaTeX math expressions. To learn more, see our [Multi-modal chat evaluation docs](/docs/multimodal-chat-evaluation-editor#step-2-configure-models).
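The "Match: Any" / "Match: All" setting on workflow task filters described above is plain OR/AND logic applied across the individual filters. A minimal sketch of that semantics in Python — the predicate functions and data-row fields here are hypothetical, for illustration only, and not Labelbox internals:

```python
# Illustrative sketch of "Match: Any" (OR) vs. "Match: All" (AND)
# combination of workflow task filters over a data row.

def matches(filters, data_row, match="all"):
    """Apply a list of predicate functions to a data row.

    match="all" -> AND logic ("Match: All")
    match="any" -> OR logic  ("Match: Any")
    """
    results = (f(data_row) for f in filters)
    return all(results) if match == "all" else any(results)

# Hypothetical filters mirroring two of the UI filter types
filters = [
    lambda row: row["labeling_time"] > 60,        # Labeling time filter
    lambda row: row["batch"] == "batch-2024-08",  # Batch filter
]

row = {"labeling_time": 90, "batch": "batch-2024-01"}
matches(filters, row, match="all")  # False: only one filter passes
matches(filters, row, match="any")  # True: at least one filter passes
```

In the app, the same choice is made once per workflow task; every filter you attach to the task is combined with the selected logic.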
* Enterprise customers will see an updated flow for requesting labeling services in the app UI.
* You can now use the Benchmarks & Consensus tools for prompt & response generation projects.
* You can now use the Benchmarks & Consensus tools for video projects.
* Benchmarks & Consensus are now enabled for the following workflows: sending data rows from Model to Annotate, sending data rows from Annotate to a model experiment, and bulk classification in Catalog.
* In the Data rows tab in Annotate, you will see an improved search and filtering experience for data rows with Benchmarks.
* The audio editor now supports temporal classifications and a timeline, so you can assign classifications to specific points in the audio timeline.
* When you compare model runs in the metrics and cluster view in Model, only the metrics calculations for the data rows that appear in both model runs are displayed.

*Changed*

* You can now configure your projects to use Benchmarks & Consensus simultaneously. If you have a project that has already been set up with Benchmarks, you can enable Consensus later in the project settings. However, once enabled, Consensus cannot be disabled.
* In the Data row browser, you can now set a Consensus label as a Benchmark label. To do so, select a data row with a Consensus label in the Data rows tab and select Add as Benchmark. This will calculate both the Benchmark and Consensus score for the data row.
* The left side panel is now resizable across all editors.
* The default reservation count for the labeling queue has been updated from 10 to 4. Admins can adjust the queue parameters in the project settings.
* The Auto-segment tool in the image editor has been upgraded to use SAM 2.

*Removed*

* The custom editor has been sunset for all customers.
* The following data connector libraries have been archived: labelbase, labelspark, labelpandas, labelsnow, labelbox-bigquery.
To learn more, see our [Deprecations page](/docs/deprecations#data-connector-libraries).
* The External workforce tab has been removed from the project settings.

## Python SDK

The latest version of our Python SDK is v3.76.0. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/develop/libs/labelbox/CHANGELOG.md#version-3760-2024-07-29) for more details on what was added recently.

### Version 3.76.0 (2024-07-29)

*Added*

* Added Project get\_labeling\_service(), request\_labeling\_service(), and get\_labeling\_service\_status()
* Added project and ontology creation for prompt response projects: Client create\_prompt\_response\_generation\_project(), create\_response\_creation\_project()
* Added is\_benchmark\_enabled, is\_consensus\_enabled to Project

*Updated*

* Made Project quality modes a list to allow combining more than one quality mode per project

*Notebooks*

* Added back the export migration guide
* Added correct data param to video notebook

*Other*

* Use connection pool for all HTTP and GraphQL requests

### Version 3.75.1 (2024-07-16)

*Removed*

* Project MEDIA\_TYPE JSON

### Version 3.75.0 (2024-07-10)

*Added*

* Added Project set\_project\_model\_setup\_complete() method
* Added user group management
* Refactored Dataset create\_data\_rows\_sync to use upsert
* Added upload\_type to Project
* Added prompt classification for Python object support
* Aliased `wait_xxx` functions

*Fixed*

* Predictions missing during Catalog slice export
* Prevented adding batches to live chat evaluation projects
* Added missing media types
* Deprecated Project setup\_editor and added Project connect\_ontology
* Bumped dateutil max version
* Bumped rye version
* Updated create ontology for project setup

## App

*Added*

* For video projects, you can now assign free-form text classifications to individual frames or a range of frames.
* The Usage tab in Workspace settings now contains LBU usage per product and data modality.
It also displays usage and project billing costs for Boost Workforce and Boost Workforce Express projects. To learn more, see our docs on [Account details](/docs/view-billing-details#data-row-usage).
* Enterprise and pay-as-you-go customers will see an improved setup flow when configuring a data warehouse integration. When you initiate a new integration in the Integrations tab, you'll see step-by-step instructions to guide you through the setup process. To learn more, see our [Census integration](/docs/census-integration) docs.
* Admins can define their own user roles with any set of permissions that Labelbox provides and allow those roles to be assigned to any user. To learn more, see our docs on [Member roles](/docs/roles-and-permissions#member-roles).
* Benchmarks and Consensus are now supported for projects created using the “Prompt and response generation” editor. This is only supported when you select the “Humans generate responses to uploaded prompts” option.
* Claude 3.5 Sonnet is now available in Foundry.
* Images are now supported for Claude 3 Opus, Claude 3.5 Sonnet, and Claude 3 Haiku in Foundry.

*Changed*

* You can now configure your project settings to use the Benchmarks and Consensus tools at the same time.
* Contract-based customers will see an updated flow when requesting Boost workforce services.
* We made some improvements to our annotation import formats to make them more intuitive. To see the updated import formats, click through the links at the top of the [Import ground truth doc](/docs/import-ground-truth).

*Fixed*

* A fix that blocks concurrent edits to the same label has addressed the platform issues causing duplicate classifications, multiple segmentation groups, and options with deleted questions.
* The root cause of certain data integrity issues, such as missing pixels in raster segmentation masks and multiple/duplicate classifications, has been addressed.
* A fix in our queueing system has boosted Export v2's overall performance, significantly improving export performance at larger scales.

*Removed*

* Claude Instant, Claude 2, and Claude 3 Sonnet have been removed from Foundry due to lack of usage.
* The custom editor has been sunset for all customers. Custom editor projects are no longer accessible via the app and the SDK (exporting is also disabled), and they will no longer appear in your project list.

## Python SDK

The latest version of our Python SDK is v3.74.0. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/develop/libs/labelbox/CHANGELOG.md#version-3740-2024-06-24) for more details on what was added recently.

### Version 3.74.0 (2024-06-24)

*Added*

* Include predictions in export
* Upsert label feedback method Client `upsert_label_feedback()`

*Removed*

* Removed deprecated class `LabelList`

### Version 3.73.0 (2024-06-20)

*Added*

* Conversational data row checks
* UI ontology mode support
* Empty data row validation

*Fixed*

* Numpy semver locked to < 2.0.0

### Version 3.72.2 (2024-06-10)

*Added*

* SLSA provenance generation

### Version 3.72.1 (2024-06-06)

*Fixed*

* Fixed `client.get_project()` for LLM projects
* Throw user-readable errors when creating a custom embedding

## App

*Added*

* Now you can browse video assets (as well as their annotations and predictions) in the detailed view within Catalog, Model, and Annotate.
* Our Foundry product is now generally available. Foundry enables you to use powerful foundation models to prelabel your data and integrate directly with Annotate workflows. To learn more, read our docs on [Foundry](/docs/foundry).
* Now, when you create an API key, you can specify its scope to a particular permission level.
See our docs on [API keys](/reference/create-api-key) to learn more.
* You can now [clone projects](/docs/create-a-project#duplicate-a-project) and their settings (ontology, members, name, tags, issue categories). To do this, go to Annotate, select a project, click on the project name at the top, and select Duplicate project.
* We introduced a new role called tenant admin. The tenant admin role has permission to create new workspaces and invite team members to any workspace. See our docs on [Tenant Admin](/docs/workspaces#tenant-admin) to learn more.
* OpenAI's GPT-4o, as well as Anthropic's Claude 3 Sonnet and Claude 3 Haiku models, are now available in Foundry.
* When you open an experiment from Model, you'll see a new view called "List view". This new view displays each data row as a row in a list. Each row displays a preview of the asset and any annotation and/or prediction information for the data row. See our docs on [Model runs](/docs/model-runs) to learn more.
* Now, in the Data rows tab, you can filter data rows by agreement score at the feature level. To do this, go to the Data rows tab, filter by "Consensus average", and select "Feature". You can further narrow down your results by feature schema. To learn more, see our [Consensus](/docs/consensus) docs.
* We released a new [live multimodal chat evaluation editor](/docs/live-multimodal-chat-evaluation) that enables you to enter a prompt, get a response from multiple models, and use an ontology to rank and select the best model responses.

*Removed*

* On March 16, 2024, we sunset Export v1 for pro, standard, and select enterprise customers.

*Future*

* On June 30, 2024, we will sunset the custom editor for all customers.

## Python SDK

The latest version of our Python SDK is v3.72.0. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/develop/libs/labelbox/CHANGELOG.md#version-3720-2024-06-04) for more details on what was added recently.
### Version 3.72.0 (2024-06-04)

*Added*

* Updated Dataset create\_data\_rows to allow upload of an unlimited number of data rows
* New Dataset methods for IAM integration: add\_iam\_integration, remove\_iam\_integration

*Notebooks*

* Added model evaluation SDK method notebook
* Added quick start notebook geared towards new users

### Version 3.71.0 (2024-05-28)

*Added*

* project.get\_overview() to retrieve project details
* project.clone() to clone projects
* ExportTask.get\_buffered\_stream to replace ExportTask.get\_stream

*Fixed*

* ExportTask.result / ExportTask.errors parsing content incorrectly
* Lack of exceptions related to updating model config

### Version 3.70.0 (2024-05-20)

*Added*

* Added chat model evaluation support:
  * client.create\_model\_config()
  * ModelConfig project\_model\_configs()
  * ModelConfig add\_model\_config()
  * ModelConfig delete\_project\_model\_config()
  * ProjectModelConfig delete()
  * client.create\_model\_evaluation\_project()
* Updated existing methods to support chat model evaluation projects:
  * client.create\_ontology()
  * client.create\_ontology\_from\_feature\_schemas()
* Coco deprecation message

*Fixed*

* Fixed error reporting for client.create\_project()
* Do not retry HTTP 422 errors

*Notebooks*

* Send\_to\_annotate\_from\_catalog functionalities outside Foundry

*Fixed in Notebooks*

* Fixed meta notebook
* Modified queue\_management.ipynb to remove some parameters
* Update\_huggingface.ipynb

## Python SDK

*Added*

* Dark mode (sun/moon icon on the top right-hand side)

*Fixed*

* All example notebooks from our SDK repository are now in the [Python tutorials](/page/tutorials) tab

## App

*Added*

* Our new warehouse integration provides an easy, no-code solution to keep your datasets in Labelbox in sync with the tables in your data warehouse. You can use this integration to connect over 25 different data sources, including BigQuery, Databricks, Snowflake, and Google Sheets.
To learn more, see our [Census integration](/docs/census-integration) docs.
* You can use the new fine-tuning capability (beta) to fine-tune a YOLOv8 object detection model with custom features. To run fine-tuning, you'll need to create a model experiment, import data rows and ground truth, and add an ontology to use for fine-tuning. Then, select the "fine-tune model" button to configure the training job. Docs on this feature are coming soon.
* You can now include custom and auto-generated embeddings when you export your data rows via Export v2. You can find an example of exported embeddings on this page: [Export image annotations](/reference/export-image-annotations#sample-project-export).
* When you create an issue, you now have the option to assign the issue to a category. Issue categories can be viewed and managed in the Issues tab in a project. To learn more, see our docs on [Issues](/docs/issues-comments#create-an-issue).
* The following model + annotation types are now supported in Foundry:
  * Video bounding box detection via the GroundingDINO model
  * Video segmentation mask detection via the GroundingDINO + SAM model
  * Video frame-based classification via the Google Gemini 1.5 Pro (Beta) model
  * Video global classification via the X-CLIP, Gemini Pro Vision, and Gemini 1.5 Pro (Beta) models
* We introduced new ways to create model experiments in the UI: 1) from the Manage selection dropdown in Catalog, you can send selected data rows to a new experiment or an existing experiment; 2) from Model, you can now click Create -> Experiment to create a new experiment; 3) from an existing experiment, you can now click + New Model Run to append data rows to the experiment.
* Llama 3 is now supported in Foundry.

*Changed*

* Our updated home page provides you with key actions, featured reads, and entry points to make it easier to get started on projects.
* For projects that are configured with Boost Workforce Express, labeler instructions are now added as part of the ontology.
We also enabled hyperlinking in the ontology instructions to allow you to link a Google Doc as labeler instructions.
* With the new Catalog search experience, you can quickly search your data in Catalog by typing into the data search filter bar. You no longer need to specify a filter to find the data you are looking for. See [Filters](/docs/search#data-search) for more details.
* The rate limit for Gemini 1.5 Pro in Labelbox Foundry increased from 5 to 600 requests per minute.

*Removed*

* On April 19, we sunset Export v1 for free, edu, and starter customers. To learn more about this deprecation, see our [migration guide](/docs/migration-guide-export-v1-to-export-v2).

*Future*

* On May 15, we will begin sunsetting Export v1 for pro, standard, and enterprise customers. To learn more, see our [migration guide](/docs/migration-guide-export-v1-to-export-v2).
* On June 30, we will sunset the custom editor for all customers.

## Python SDK

The latest version of our Python SDK is v3.69.1. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/develop/libs/labelbox/CHANGELOG.md) for more details on what was added recently.

### Version 3.69.1 (2024-05-01)

*Fixed*

* Fixed a bug with certain types of content not being returned as a result of `ExportTask.result` or `ExportTask.errors`

### Version 3.69.0 (2024-04-25)

*Added*

* Support to export embeddings from the SDK

*Fixed*

* Used OpenCV's headless library in place of OpenCV's default library

### Version 3.68.0 (2024-04-16)

*Added*

* Added support for embeddings
* Introduced the use of 'rye' as a package manager for SDK contributors
* Implemented a unified `create` method for `AnnotationImport`, `MEAPredictionImport`, and `MALPredictionImport`
* Enhanced annotation upload functionality to accept data row IDs, global keys, or external IDs directly for `labelbox.data.annotation_types.label`

*Fixed*

* Ensure items in `dataset.upsert_data_rows` are not empty
* Streamable export fix to report export\_v2 errors as a list of dictionaries, compatible with older releases

### Version 3.67.0 (2024-04-05)

*Added*

* Added `SECURITY.md` file
* Made export\_v2 methods use the streamable backend
* Added support for custom embeddings to Dataset *create data row(s)* methods
* Added ability to upsert data rows via the `dataset.upsert_data_rows()` method
* Added `AssetAttachment` with the ability to `update()` and `delete()`

*Updated*

* Added check for 5000 labels per annotation per data row

*Fixed*

* Errors and failed data rows are included in the `task.result` for `dataset.create_data_rows()`
* Fixed 500 error handling and reporting

*Notebooks*

* Updated import notebook for image data
* Added attachment PDF example, removed requirements around text\_layer\_url
* Included the `get_catalog()` method in the export notebook
* Added workflow status filter to the export\_data notebook for projects
* Added a send predictions to a project demo
* Removed model diagnostic notebooks

## App

*Added*

* You can now filter and sort by custom metrics at the prediction level in the Model runs view. This new filter capability supports autogenerated metrics as well as custom metrics at the prediction level. To learn more, see [Filters](/docs/filtering-and-sorting#filtering-on-custom-metrics-per-prediction-basis).
* When you enter the Schema tab, you'll see a new tab for Embeddings. In this part of the app, you can view all of your autogenerated and custom embeddings for your organization. Here, you can also create custom embeddings via the UI. See [Embeddings](/docs/embeddings) to learn more about the Embeddings subtab.
* You can now add PDFs as attachments to data rows via the `PDF_URL` attachment type.
To view an example of importing a PDF as an attachment, see [Attachments](/reference/attachments).
* If you are a new user and do not have any data in Catalog yet, Labelbox shows a zero-state screen with guidance on how to add data rows.
* If you are a new user and do not have any projects in Annotate yet, Labelbox shows a zero-state screen with guidance on how to create your first project.
* Foundry now includes support for Claude 3 Opus and Google Gemini 1.5 Pro.
* The new inferencing endpoints (beta) enable you to run any Foundry app to generate predictions on your data via a REST API. You can use this REST API to 1) pass raw data without creating data rows or a dataset, 2) pass raw data and specify which dataset to create the data row in, or 3) pass a specific data row that has already been created in Labelbox. To learn more, see our docs on [Foundry apps](/docs/run-foundry-apps#run-app-using-rest-api).

*Changed*

* When you import PDFs as data rows, you no longer need to specify the `text_layer_url`, as Labelbox now automatically generates the text layers when you import PDF assets. This means you no longer need to create the text layers and include them when importing PDF assets.
* We now render image overlay attachments in the order they were created. This means they are no longer randomly ordered when you view them in the editor.
* Now when you click on a relationship edge in the editor, the relationship is selected in the objects panel. If the relationship has a subclassification, the subclassification opens in the left panel. This improvement impacts the text and conversational text editors. To learn more, see our docs on Text relationships.

*Fixed*

* The issue where data rows were stuck in the "To label" queue has been resolved. Data rows that have been labeled or skipped are now reliably moved to the next workflow step.
* The issue causing some assets to not load properly in the data row browser has been resolved.
* We implemented a fix to ensure feature names with underscores are automatically formatted with surrounding spaces for accurate model processing.

*Future*

* On April 20, 2024, we will sunset Export v1 for Free, EDU, and Starter tiers. On April 27, 2024, we will sunset Export v1 for Standard, Pro, and Enterprise tiers. To learn more, see the [Export v1 to v2 migration guide](/docs/migration-guide-export-v1-to-export-v2).
* On June 30, 2024, we will sunset our custom editor for all customers.

## Python SDK

The latest version of our Python SDK is v3.66.0. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/master/CHANGELOG.md#version-366-2024-03-20) for more details on what was added recently.

### Version 3.66.0 (2024-03-20)

*Added*

* Added support for Python 3.11 and 3.12
* Added update method to attachments

*Notebooks*

* Improved notebooks for integration and model diagnostics
* Removed Databricks integration notebooks

*Updated*

* Updated README for clarity and contribution guidelines

*Removed*

* Removed support for Python 3.7, as it has been end-of-life since June 2023

## App

*Added*

* Smart select (beta) in Catalog offers three different ways to curate your data within a dataset: random, ordered, and cluster-based. To access this feature, go to Catalog, select a dataset that is larger than 100 data rows, and select "Smart select".
* Cluster view (beta) in Catalog is a new interface that helps you discover and find similar data rows as well as labeling or model mistakes. See [Cluster view](/docs/cluster-view) for more details.
* Natural language search and similarity search in Catalog now support video and audio assets.
* You can now 1) import custom metrics per prediction and 2) filter/sort by prediction-level custom metrics. These added functionalities make it easier to use our model error analysis tools in the Model product.
To learn more about prediction-level metrics on model runs, see [Filters](/docs/filtering-and-sorting).
* You can now invoke Foundry models from within a project to generate pre-labels. In the project overview, you'll see "suggested models" to help you select the best model for generating predictions. This feature is in beta.
* The new Audit trail feature provides a way for you to view all operations (create, update, and delete) on an annotation. This feature makes monitoring changes to annotations over time easier for auditing purposes.
* The new enhanced video player (beta) in Catalog offers new capabilities to help you curate, evaluate, and visualize video data. These include basic play/pause controls, support for predictions, pre-labels, and ground truth (bounding boxes, polylines, and points), and support for comparing model runs.
* If you export from an image or video labeling project that contains data rows labeled with segmentation masks, you will see a new property in the export file called composite\_mask. Composite masks are a grouping of all segmentation masks on a label. To learn more, see [Export image annotations](/reference/export-image-annotations#mask).
* If you export from a model run that contains segmentation mask predictions, you will see a new property in the export file called composite\_mask. To learn more, see [Export image annotations](/reference/export-image-annotations#mask).
* You can now add a workflow filter based on consensus average. For example, you can use this filter to send data rows with a consensus score over 80% straight to "Done". To do this, create a new workflow step and select consensus average from the workflow filters.
* Boost Express makes it easier for you to request help labeling data. Once you've set up a project, you can request up to fifteen labelers to work on your data. See our docs on [Boost express](/docs/labeling-services) to learn more.
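As a quick illustration of consuming the new composite\_mask property, the sketch below walks an Export v2-style payload and collects the composite-mask URL for each masked object. The field layout (key names, nesting, the sample project ID and URL) is an assumption for illustration only; see [Export image annotations](/reference/export-image-annotations#mask) for the authoritative schema.

```python
# Sketch only: the field names below are assumed, not guaranteed to match
# the real Export v2 schema. Replace `sample_export` with a real export row.
sample_export = {
    "projects": {
        "example-project-id": {  # hypothetical project ID
            "labels": [
                {
                    "annotations": {
                        "objects": [
                            {
                                "name": "car",
                                "composite_mask": {
                                    # hypothetical URL and color
                                    "url": "https://storage.example.com/composite_mask.png",
                                    "color_rgb": [255, 0, 0],
                                },
                            },
                            # non-mask objects carry no composite_mask entry
                            {"name": "corner", "point": {"x": 10, "y": 20}},
                        ]
                    }
                }
            ]
        }
    }
}

def composite_mask_urls(export_row):
    """Collect the composite-mask URL of every masked object on every label."""
    urls = []
    for project in export_row["projects"].values():
        for label in project["labels"]:
            for obj in label["annotations"]["objects"]:
                mask = obj.get("composite_mask")
                if mask:
                    urls.append(mask["url"])
    return urls

composite_mask_urls(sample_export)
# -> ["https://storage.example.com/composite_mask.png"]
```

Because a composite mask groups all segmentation masks on a label into one image, each object's `color_rgb` is what distinguishes that object's pixels within the shared mask.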
* The Billing tab in your Workspace settings now has a button that enables you to easily unsubscribe from Foundry.

*Changed*

* When you select a project in Annotate, you will see a new project overview UI that makes navigating your project easier.
* We have disabled the ability to download videos from preview mode for all users.
* When you are in a project, you will now see a "Start" button with a dropdown menu containing options for labeling, reviewing, and any other tasks you have configured for your project.

*Fixed*

* Google SSO users that have a profile picture on their account can now log in as expected.
* The issue causing large numbers of data rows to be stuck in the “To Label” task has been fixed.
* The Enterprise dashboard now displays the correct information for your organization.
* In the Data row details panel, the row data can no longer be displayed with the security token.

*Future*

* Starting in mid-April, we will sunset Export v1 on a rolling basis. To view the sunset date for your tier, see the [Export v1 to v2 migration guide](/docs/migration-guide-export-v1-to-export-v2).

## Python SDK

The latest version of our Python SDK is v3.65.0. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/master/CHANGELOG.md#version-3650-2024-03-05) for more details on what was added recently.

### Version 3.65.0 (2024-03-05)

*Notes*

* Rerelease of 3.64.0

### Version 3.64.0 (2024-02-29)

*Added*

* `Client.get_catalog`: added catalog schema class. Catalog exports can now be made without creating a slice first
* `last_activity_at` filter added to export\_v2, allowing users to specify a datetime window without a slice

*Removed*

* Review-related WebhookDataSource topics

*Notebooks*

* Added get\_catalog notebook
* Updated custom metrics notebook
* Updated notebooks for video and image annotation import

### Version 3.63.0 (2024-02-19)

*Added*

* Ability for users to install and use the SDK with pydantic v2,
while still maintaining support for pydantic v1
* `ModelRun` `export()` and `export_v2()`: added model\_run\_details to support splits

*Notebooks*

* Added composite mask notebook

### Version 3.62.0 (2024-02-12)

*Added*

* Support custom metrics for predictions (all applicable annotation classes)
* `FoundryClient.run_app`: added data\_row identifier validation for running a Foundry app
* `Client.get_error_status_code`: default to a 500 error if a server error is unparseable instead of throwing an exception

*Updated*

* `DataRowMetadata`, `DataRowMetadataBatchResponse`, `_UpsertBatchDataRowMetadata`: made data\_row\_id and global\_key optional in all schema types

*Fixed*

* `ExportTask.__str__`: fixed returned type in ExportTask instance representation

*Removed*

* `Project.upsert_review_queue`

*Notebooks*

* Updated notebooks to new export methods
* Added model slice notebook
* Added support for annotation import with image bytes

## App

*Added*

* When you select a project in Annotate, you will now see a count for “Skipped” data rows in the project overview.
* The text search filter in Catalog now supports text attachments.
* The new Cluster view (beta) in Catalog is a visual tool that allows you to explore relationships between data rows, identify edge cases and outliers, select for pre-labeling or human review, and quickly classify large datasets in bulk. Read our docs on [Cluster view (beta)](/docs/cluster-view) to learn more.
* The new Cloud bucket sync (beta) provides a simple way for you to sync the data in your cloud buckets to Labelbox (supported for AWS S3 buckets, Google Cloud Storage, and Microsoft Azure Blob Storage). You must have IAM delegated access configured in order to use cloud bucket sync (beta) in Catalog.
* Foundry now has the following improvements: * GPT-4V now supports PDF files * Gemini Pro Vision supports video classification use cases * Improved speed for preview mode for GPT-like models and LLaVA/Llama * Increased the speed of the submit job mode *Changed* * We removed the technical limitation that prevented customers from exporting over 700K data rows at a time. Read our docs on [streamable exports](/reference/export-overview#streamable-exports) and [Limits](/docs/limits#export) to learn more about the new export limits. * Our new flat-rate LBU pricing system makes understanding your usage in Labelbox easier. To learn more about the recent LBU pricing changes, see our [Manage account and billing](/docs/billing) docs. * When you select a project in Annotate, you will see a new project overview UI that shows labeling progress, workflow tasks, and other key configurations for your project. * We made some improvements to increase the reliability of routing data rows within Annotate workflows. This improvement mostly helps large-scale organizations that rely heavily on review workflows and workflow history for their operations. *Fixed* * When you update the metadata field on a data row, the change is now reflected in the “Last activity” filter in Catalog. * Creating multiple relationships between points and other vector annotations in the editor now works as expected. * The issue where tracked bounding boxes appear outside the editor canvas has been fixed. * The issue of the editor not allowing you to unhide a single annotation when the whole annotation group is hidden has been fixed. You can now hide/unhide individual annotations as expected. * The hotkeys for zooming in and out of an asset in the editor have been fixed. * Now, when a user navigates back to a skipped data row in the editor, the user can’t navigate to the next asset without filling in all required fields or discarding the changes. 
* The sequence of undoing/redoing annotations and then creating new annotations now works as expected. *Future* * Starting in April, we will sunset export v1 on a rolling basis. To view the sunset date for your tier, see the [Export v1 to v2 migration guide](/docs/migration-guide-export-v1-to-export-v2). ## Python SDK The latest version of our Python SDK is v3.61.2. See our [full changelog on GitHub](https://github.com/Labelbox/labelbox-python/blob/master/CHANGELOG.md#version-3612-2024-01-29) for more details on what was added recently. ### Version 3.61.2 (2024-01-29) *Added* * `ModelSlice.get_data_row_identifiers` for Foundry data rows *Fixed* * `ModelSlice.get_data_row_identifiers` scoping by model run id ### Version 3.61.1 (2024-01-25) *Fixed* * Removed export API limit (5000) ### Version 3.61.0 (2024-01-22) *Added* * `ModelSlice.get_data_row_identifiers` * Fetches all data row ids and global keys for the model slice * NOTE: Foundry model slices are not supported yet *Updated* * Updated exports v1 deprecation date to April 30th, 2024 * Remove `streamable` param from export\_v2 methods ### Version 3.60.0 (2024-01-17) *Added* * Get resource tags from a project * Method to CatalogSlice to get data row identifiers (both uids and global keys) * Added deprecation notice for the `upsert_review_queue` method in project *Notebooks* * Update notebook for Project `move_data_rows_to_task_queue` * Added notebook for model foundry * Added notebook for migrating from Exports V1 to V2 ### Version 3.59.0 (2024-01-05) *Added* * Support `set_labeling_parameter_overrides` for global keys * Support `bulk_delete` of data row metadata for global keys * Support `bulk_export` of data row metadata for global keys *Fixed* * Stop overwriting class annotations on prediction upload * Prevent users from uploading video annotations over the API limit (5000) * Make description optional for foundry app *Notebooks* * Update notebooks for Project `set_labeling_parameter_overrides` to add support for 
global keys ## App *Added* * Foundry is now available to all users. Foundry enables you to use foundational models to predict annotations for your data and to create model runs to compare and diagnose the behavior of different models against your data and requirements. To learn more, read our [docs](/docs/foundry). * In the Catalog, there is now an easy way to send predictions straight to Annotate. Once you’ve generated predictions with Foundry, you can send them to Annotate either as prelabels for human review or as annotations associated with a specific labeling step. To learn more, read these [docs](/docs/foundry-annotate-predictions). * Catalog similarity search now works on up to 1 billion data rows at a time. * We've updated our Create dataset workflow to highlight the following import options: Python SDK script, Cloud buckets integration, Data warehouse sync, or local file upload. Go to Catalog and select +New to see the new import flow. *Fixed* * When you navigate to Model → Create → Experiment, the Experiment page now opens as expected. * For ontologies containing required radio and/or checklist per-message classifications, if a user attempts to remove the classification answer, they will be shown an error message and the classification value will be restored. * In the Data Rows tab, if a user inserts a value for Media attribute → Attribute value → Duration, the filter will now be applied correctly and the correct results will be shown. * Undo/redo behavior in the editor is limited to the current asset only. * When an Issue has resolved status, the Resolve button now has “Reopen” text instead, as it reopens a resolved Issue. ## Python SDK The latest version of our Python SDK is v3.58.1. See our [full changelog](https://github.com/Labelbox/labelbox-python/blob/master/CHANGELOG.md#version-3581-2023-12-15) in GitHub for more details on what was added recently. 
### Version 3.58.1 (2023-12-15) *Added* * Support to export all projects and all model runs to `export_v2` for a `dataset` and a `slice` *Notebooks* * Update exports v2 notebook to include methods that return `ExportTask` ### Version 3.58.0 (2023-12-11) *Added* * `ontology_id` to the model app instantiation * LLM data generation label types * `run_foundry_app` to support running model foundry apps * Two methods for sending data rows to any workflow task in a project that can also include predictions from a model run or annotations from a different project *Fixed* * Documentation index for identifiables *Removed* * `Project.datasets` and `Datasets.projects` methods as they have been deprecated *Notebooks* * Added notebooks for human labeling (GT/MAL/MEA) + data generation (GT/MAL) * Remove relationship annotations from text and conversational imports # Subscribe to the Labelbox release notes Source: https://docs.labelbox.com/changelog/subscribe To subscribe to the Labelbox release notes, please fill out the [subscription form](https://learn.labelbox.com/releasenotessubscription/). # Access, storage & security Source: https://docs.labelbox.com/docs/access-storage Information on account and data accessibility, storage, and security. This guide provides information on your options for storing data when using Labelbox, as well as details on our security and privacy measures. ## Data storage configurations You have three options for connecting your data to Labelbox. The method you choose determines where your assets are stored. * **IAM delegated access (recommended)**: This method lets you host your data in your own cloud storage ([AWS S3](/docs/import-aws-s3-data), [Google Cloud Storage](/docs/using-google-cloud-storage), or [Azure Blob Storage](/docs/microsoft-azure-blob-storage)) and grant Labelbox access using native Identity and Access Management (IAM) roles. 
* **Pre-signed URLs**: You can create a JSON file containing pre-signed or public URLs that point to files in your own cloud storage. You are responsible for [generating these signed URLs](/docs/signed-urls). Once you have the JSON file, you can upload it to Labelbox. * **Direct file upload**: If you choose to upload your data directly, it will be stored in private buckets on Labelbox's Google Cloud Services. Labelbox will have access to your data to generate signed URLs for rendering assets in the browser. The standard expiration value for these signed URLs is one day. **A note on data location and performance** To ensure the best performance and fastest loading times for your labeling team, we recommend that you host your cloud data in a location that is geographically close to your team members. Labelbox uses Google Cloud Services for cloud storage and all data is stored in the US. If your data is hosted on our servers, we use a CDN that serves content from locations as geographically close to your users as possible. Labelbox does not store data in the EU, even upon request. ## Security and compliance We are committed to ensuring that your data is secure and that we meet our obligations under privacy and security laws. **Data encryption**\ All data hosted by Labelbox, including labeled data, assets, and private user information, is encrypted at rest using AES-256. We use Google Cloud for storage, which means your data is also encrypted server-side using GCP's default encryption keys. Data is automatically decrypted when read by an authorized user. **User authentication**\ To ensure that only authorized users can access your data, we use Auth0 for authentication. We also support [multi-factor authentication (MFA)](/docs/multifactor-authentication) and [Single Sign-On (SSO)](/docs/single-sign-on) for Enterprise accounts. 
**Regulatory compliance**\ Labelbox has a comprehensive privacy program to meet our obligations under regulations such as [CCPA](https://en.wikipedia.org/wiki/California_Consumer_Privacy_Act), [GDPR](https://en.wikipedia.org/wiki/General_Data_Protection_Regulation), [SOC 2](https://en.wikipedia.org/wiki/System_and_Organization_Controls) Type II, and [HIPAA](https://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act). For more details, please see our [Privacy FAQ](https://labelbox.com/company/privacy-faq). ## Data privacy and portability **Data usage**\ Labelbox does not sell customer or end-user personal data. We share personal data with third-party service providers for business purposes only, but we do not share this information for monetary value or other valuable consideration. For more information, please see our [Privacy Notice](/page/privacy-notice). **Data portability**\ Upon request, we can export your data and provide it to you. We can also permanently delete all of your data from our servers. ## More information # AI critic Source: https://docs.labelbox.com/docs/ai-critic AI Critic helps you ensure the quality of your data by allowing you to define validation rules using natural language. The AI is contextually aware of all elements on the data row—including messages, features, and rubrics—enabling you to create highly specific critiques. AI critic is supported for the Multi-modal chat editor, the Audio editor, and the Video editor. ## How to create a new critic 1. Go to your **Project settings** → **Advanced** and enable **AI critic.** You can also choose which Foundry model to power the AI critic by setting the **AI critic model**. 2. In the editor, select the **AI critic setup** icon in the top navbar. Then, click **Add new critic**. 3. Add a short descriptive **title** for your critic. Then, in the text box, **describe your critic** using natural language. 
Define which elements you want to critique and the specific criteria they must meet. The more specific your description, the better the critic performs. For the best results, make sure your prompt references these element names: | Element | Description | | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Actor | A participant in the conversation - either a "human" (the prompter) or a "model" (an AI responder). Each actor has metadata like a display name or model configuration name. | | Checklist classification | A multi-select classification where labelers choose one or more options from a list. Like radios, options can contain nested sub-questions. | | Display rules | Conditional visibility logic that controls when a classification appears. A criterion with display rules only shows when other criteria have specific values, enabling branching evaluation workflows. | | Feature | The actual value a labeler selected or entered for a classification. Contains the answer data, including which option was chosen (for radio/checklist) or what text was entered. | | Feature schema | The definition or blueprint for a classification. Describes its type (radio, checkbox, text), available options, scope, and display rules. Think of it as the question template. | | Index data | Binds a scoped classification to a specific element in the conversation. For example, a response-message-scoped classification's index data contains the ID of the response it applies to. | | Prompt message | A message sent by the human participant that initiates a conversation turn. In multi-turn conversations, each new prompt builds on the previous exchange. | | Radio classification | A single-select classification where labelers choose exactly one option from a list. 
Options can have nested sub-questions that appear when selected. | | Response message | A message generated by an AI model in reply to a prompt. In multi-model evaluation, a single prompt can have multiple responses from different models. | | Rubric criterion | A structured evaluation question used to assess model responses. Can be a radio (single-select), checkbox (multi-select), or text (free-form) question. Each criterion is associated with a specific prompt message and evaluates the responses to that prompt. | | Rubric group | A container that organizes related rubric criteria under a common header. Groups can have min/max constraints on how many criteria they contain. | | Scope | Determines where a classification applies. "Global" applies to the entire conversation. Other scopes target a specific prompt message, response message, rubric criterion, or turn. | | Text classification | A free-form text input annotation where labelers enter arbitrary text. | | Turn | A complete prompt-response exchange in the conversation. Each turn contains one prompt message and one or more response messages from model actors. | 4. Set the **Enforcement**. This setting determines what happens when a critique fails. Select from the following options. * **Block**: Prevents the user from submitting the label until the failing critique is resolved. This is the strictest quality-control option. * **Acknowledge**: When the user clicks submit, a notification appears detailing the quality failures. The user must acknowledge the message to proceed, but they are not blocked from submitting. * **None**: Displays the critique result as a visual aid directly in the editor but does not interfere with the submission process. 5. Click **Update Critic** to save it. The AI will immediately begin critiquing elements based on your new rule. ## How to write effective critiques You can create critiques for various elements within the editor. 
The AI understands the context and relationships between different elements, such as a classification and the message it is attached to. To verify that a classification's value is correct based on its label and the message it's associated with, you can use a prompt like this: `Look at the instructions or label for each classification. Make sure the value for the classification is correct based on its label and all of the contextual information. If it's a scoped classification, make sure to read the associated message in order to make your critique.` The AI critic will then: 1. Target all classifications in the editor. 2. Read the classification's label. 3. If the classification is attached to a message, it will read the message content. 4. Determine if the selected value is appropriate based on the context. ### Example prompts * *"All responses must be at least 3 sentences long and written in formal English. Responses that are too short or use casual language should fail."* * *"Each response must directly address the question asked in the prompt message. If the response is off-topic or does not answer the question, it should fail."* * *"For all checklist classifications, ensure that no contradictory options have been selected together."* * *"For all radio classifications scoped to response messages, verify the selected option accurately reflects the tone of the response."* * *"Review the entire conversation holistically. The overall exchange must feel natural and coherent. Flag data rows where the conversation feels disjointed or artificially constructed."* * *"Ensure that no personally identifiable information (PII) appears anywhere in the conversation — in prompts, responses, or classifications."* ### Important considerations **Be specific**: vague prompts produce inconsistent results. **Use standard vocabulary** — terms like "turn," "rubric criterion," "global scope," "response message scope" help the AI target the right elements. 
**Experiment** with your prompts — writing a good critic prompt is an art; test different phrasings and check results in the editor to find what works. # Alignerr Source: https://docs.labelbox.com/docs/alignerr # Overview Source: https://docs.labelbox.com/docs/annotate-overview Collaboratively annotate data with the internal team, your own vendor, or Labelbox data labeling service. Annotate is the data labeling platform within Labelbox. It allows your organization to label data with any human workforce at any scale. ## How Annotate works When it comes to deciding how to label your data, you have the following options: * Outsource this task to a labeling service — these external teams receive training on the specific labeling tasks required and quickly proceed to label large datasets (see [Workforce](/docs/labeling-services)). * Import your model predictions as pre-labels to speed up the labeling process (see [Model-assisted labeling](/docs/model-assisted-labeling)). * Rely on your own internal team of labelers to label your dataset (see [Labeling editors](/docs/label-data)). Regardless of your labeling method, Labelbox Annotate is a central place where you can manage all your labeling projects, customize your labeling & quality workflows, and monitor your labeling team's performance. ## Customizable labeling editor The editor is the labeling interface, purposefully designed to be highly configurable, and the primary tool for creating, viewing, and editing annotations. The labeling editor supports the following media types out of the box: ## Model Assisted Labeling with Foundry You can use Foundry to apply tools or classifications to data rows as an auto-labeling feature. To set this up, the project needs data to label and an editor with a configured ontology. Once that's in place, you can enable the Model Assisted Labeling tool and configure an LLM to create annotations. 
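To make the pre-label concept concrete, here is a minimal, standard-library-only sketch of a single NDJSON-style radio classification pre-label of the kind a model or Foundry run might produce. The feature name `quality`, the answer `good`, and the global key are hypothetical placeholders; the authoritative payload shapes live in the annotation-import reference docs, and in practice you would upload such payloads through the Python SDK's MAL import.

```python
import json
import uuid

# Hypothetical pre-label: a radio classification answer attached to a data
# row by its global key. The feature name must match a feature in the
# project ontology; all values below are placeholders for illustration.
prelabel = {
    "uuid": str(uuid.uuid4()),                 # unique ID for this annotation
    "name": "quality",                         # ontology feature name (hypothetical)
    "answer": {"name": "good"},                # radio option to pre-select
    "dataRow": {"globalKey": "my-image-001"},  # placeholder global key
}

# Each pre-label is one line of NDJSON in an import file.
ndjson_line = json.dumps(prelabel)
print(ndjson_line)
```

Labelers then see the pre-selected answer in the editor and only need to confirm or correct it, which is where the speed-up comes from.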
## Customizable labeling & review settings You can set up customized review steps based on your chosen quality strategy in your project's [Workflow](/docs/workflows) tab. As you work with large, complex projects, having to review all labeled data rows becomes increasingly time-consuming and expensive. You can leverage workflows to create a highly customizable, step-by-step review pipeline to drive efficiency and automation into your review process. To learn more, read [Workflows](/docs/workflows). Additionally, you can leverage the following quality tools: ## Labeling workforce Powered by Labelbox’s data engine, you can use Labelbox labeling services to collaborate with an external labeling workforce in real time and produce high-quality data while leveraging AI and automation techniques to keep human labeling costs at a minimum. To learn more, see [Labeling services](/docs/labeling-services). ## Getting started with Annotate # Send project updates Source: https://docs.labelbox.com/docs/annotate-project-updates Learn how to send mass communication to members in your organization or external orgs with project updates. Project updates in Labelbox are a great way to keep everyone on your team informed. This feature allows you to communicate and collaborate directly within the Labelbox platform, which is especially helpful when working with an external workforce. You can find the **Updates** section in the **Notifications** tab. This is where you can coordinate with all stakeholders, including your team, a labeling service admin, and the labeling workforce. To keep things organized, Labelbox allows you to create different types of updates for common scenarios. This makes it easier to find important information later. ## How to create an update Follow these steps to create a project update: 1. **Open a new update**: Select a project and go to **Notifications** → **Updates**. Click **+ New update**. 2. 
**Select the update type**: Select the type of update from the dropdown menu and add a message to provide more details. The text field for creating and editing updates supports markdown, allowing you to easily format your text. Here are the types of updates you can choose from: | Update type | What it does | | -------------------- | ---------------------------------------------------------------------------- | | Progress update | Shares updates on the project's progress. | | ETA updated | Communicates any changes to the project's expected completion date. | | Instructions updated | Notifies the labeling team of any changes to the project instructions. | | Issue | Informs the labeling team about any labeling issues you find in the project. | | General | For any communication that doesn't fit into the other categories. | 3. **Add a message**: You may optionally add a message to make your update more descriptive. Markdown formatting is supported in your message description. 4. **Share with**: You can specify which organizations will receive this update. * **All**: This will make it available for all organizations added to the project (it will enable this update for external organizations added in the future) * **Internal only**: This will always make it private for your organization, meaning no other organizations (even those added later) will have access to this update. * **Specific organization**: You can select specific organizations that will have access to it (your organization is always included). This option is only available if your project is shared with external orgs. 5. **Notify with emails**: To support engagement, you may select to notify users of the update via email. Check the box to enable email notifications. 6. **Add attachments**: You may optionally include local files in your update. Click **Add** under **Attachments** to do so. 7. When you are done, click **Create update**. 
For easy retrieval, you can filter your updates in the **Notifications** tab with two filters: * **Author:** filters by the email address of the update's author * **Type:** filters by the update type Once an update is created, you can reply to it; each reply is added to the update's thread. Use replies to close the thread or to ask additional clarifying questions. ## Automatic email notifications Once you are added as a project admin, you will automatically receive email notifications for every new update. These emails will come from [support@labelbox.com](mailto:support@labelbox.com), and the subject line will include the project name, making it easy to sort notifications from different projects into unique email threads. # Audio Source: https://docs.labelbox.com/docs/audio-editor Guide for labeling audio data. With the audio editor, you can add annotations to audio files, like classifying natural language conversations and music, to train conversational AI and audio-based ML models. The editor supports automatic speech-to-text recognition with the Whisper model, enabling you to transcribe any audio segment. ## Set up audio annotation projects To set up an audio annotation project: 1. [Create an audio dataset](/docs/datasets-datarows). 2. On the [Annotate projects page](https://app.labelbox.com/projects), click the **+ New project** button. 3. Select **Audio**. Provide a **name** and an optional **description** for your project. 4. Click **Save**. The system then creates the project and redirects you to the project overview page. 5. Click **Add data**. Then select your audio dataset. Click **Sample** to sample your dataset, or you can manually select data rows and click **Queue batch**. ### Data row size limit To view the maximum size allowed for a data row, see [limits](/docs/limits). 
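Step 1 above (creating the audio dataset) can also be done programmatically: each data row is essentially a `row_data` URL pointing at an audio file plus an optional `global_key` for later lookup. Here is a minimal, standard-library-only sketch of that payload — the URLs and keys are placeholders, and in practice the list would be passed to the Python SDK's `Dataset.create_data_rows`:

```python
import json

# Hypothetical audio data rows: each needs a "row_data" URL pointing at an
# audio file, plus a unique global key. All values are placeholders.
audio_rows = [
    {"row_data": "https://storage.example.com/audio/interview-001.mp3",
     "global_key": "interview-001"},
    {"row_data": "https://storage.example.com/audio/interview-002.mp3",
     "global_key": "interview-002"},
]

# Serialize for inspection; the SDK accepts the list of dicts directly.
print(json.dumps(audio_rows, indent=2))
```

Hosting the files in your own cloud storage (or behind signed URLs) keeps the assets under your control while letting the editor stream them for labeling.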
## Set up ontologies After setting up an audio annotation project, you can [add an ontology](/docs/labelbox-ontology#create-a-new-ontology) based on how you want to label the data. The audio editor supports the following annotation types that you can include in your ontology: | Feature | Import annotations | Export annotations | | --------------------------------- | ---------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | | **Radio classification** | [See payload](/reference/import-audio-annotations#radio-single-choice) | [See payload](/reference/export-audio-annotations#classification---radio) | | **Checklist classification** | [See payload](/reference/import-audio-annotations#checklist-multiple-choice) | [See payload](/reference/export-audio-annotations#classification---checklist) | | **Free-form text classification** | [See payload](/reference/import-audio-annotations#free-form-text) | [See payload](/reference/export-audio-annotations#classification---free-form-text) | ### Classification scopes You can apply classifications as **global classifications** at the file level, **temporal classifications** at the frame level, or **nested classifications** under other annotations. ## Use the audio editor After adding data and setting up an ontology for your audio annotation project, you can add labels to data rows using the audio editor. Each data row displays in the editor with: * **A waveform** visualizing the pattern of sound pressure variation. * **A spectrogram** showing the range of sound frequencies and their strengths over time. * **A timeline** of audio split into 500-millisecond intervals by default and a **Timeline Resolution** slider that allows you to adjust the time intervals on the timeline. * **Basic player controls**, such as the play/pause button, back/forward 10-second buttons, and the playback speed. 
You can also click anywhere on the waveform to instantly move to your desired location. To add a global classification, select the classification and enter the value. To add a temporal classification, select the classification, choose the interval on the timeline or waveform for when the classification starts, and add the classification value. You will see a circle representing the classification value on the timeline. ### Timeline resolution differences for labels If you set a lower resolution with the timeline resolution slider, the classification label you add may not align exactly with the current timeline resolution. This indicates that the classification was placed at a timestamp with a higher resolution than the one currently being used. You can adjust the timeline resolution to a higher resolution to see the exact position of the classification. ### Enable speech recognition The in-editor automatic speech-to-text support allows you to recognize and extract text from audio segments using free-form text classifications. To enable it: 1. When [creating the ontology](/docs/labelbox-ontology#step-1-create), add a temporal **Text** classification feature. 2. Select the temporal text classification, and then select a starting frame on the timeline. 3. Click **START TRANSCRIBING**. 4. Click **END TRANSCRIBING** at your desired ending frame. If the system detects speech, it automatically generates a transcript in the text annotation. 
### Keyboard shortcuts | Function | Hotkey | Description | | -------------------------------- | ----------------- | ------------------------------------------------- | | Play/Pause | `Space` | Play or pause the audio playback | | Move backward one frame | `←` | Move backward one frame | | Move forward one frame | `→` | Move forward one frame | | Select frames | `Shift` + `Mouse` | Select frames for adding temporal classifications | | Advance to the previous keyframe | `⇧` + `←` | Advance to the previous keyframe | | Advance to the next keyframe | `⇧` + `→` | Advance to the next keyframe | | Jump to objects | `Down` | Jump to objects | | Next object | `Down` | Move to the next object | | Previous object | `Up` | Move to the previous object | | Toggle | `⌘` + `/` | Toggle the keyboard shortcuts menu | # Monitor activity with cloud audit logs Source: https://docs.labelbox.com/docs/audit-logs How to configure data access audit logs. This guide demonstrates how to use your cloud provider's native logging tools to get full transparency into how and when Labelbox accesses your data via IAM Delegated Access. By monitoring these logs, you can maintain a complete, unchangeable record of every interaction, which is essential for security compliance and internal audits. ## Gain visibility through logs When you grant Labelbox access using the IAM Delegated Access method, every action our platform takes is authenticated by your cloud provider. Every time Labelbox temporarily assumes the IAM role to access a file for labeling, your cloud provider records that event in an audit log. These logs answer critical security questions: * **Who** made the request? (The specific Labelbox IAM role) * **What** resource was accessed? (The specific file/object in your bucket) * **When** did the access occur? (A precise timestamp) * **From where** was the request made? 
(The source IP address, which will be one of Labelbox's servers) By analyzing these logs, you can have complete confidence that Labelbox is only accessing data as needed for your users to perform their labeling tasks. ## Step-by-step monitoring guides One of the benefits of using IAM delegated access is that security-conscious teams can choose to access audit logs of when their raw data is read by Labelbox through their cloud provider. # Send a batch to a labeling project Source: https://docs.labelbox.com/docs/batches Instructions for sending a batch of data rows from Catalog to a labeling project via the app UI. In Labelbox, a "batch" is a group of data rows that you send from your Catalog to a project for labeling. Using batches gives you more control and flexibility over your labeling workflow. ## How to send a batch to a labeling project Follow these steps to send a group of data rows to a labeling project: 1. Navigate to the **Catalog** and use the filters to select the data rows you want to include in your batch. 2. Once you have selected the data rows, click the **Manage selection** dropdown menu and select **Send to Annotate**. 3. In the **Configure batch** window, you will need to: 1. Choose the project you want to send the batch to. 2. Set a priority for the batch. A priority of 1 is the highest and 5 is the lowest. This will determine the order the data rows are placed in the labeling queue. 3. Enable or disable **Consensus** for the batch. If you enable Consensus, you will need to set a **% Coverage** to indicate the percentage of data rows that will be queued for labeling by multiple labelers, and a **# Labels** value to indicate how many labels can be added to the data row. 4. Click **Submit**. Any data rows that have already been submitted to the labeling project will be excluded from the batch. **Limits** See [this page](/docs/limits) to learn the limits for sending batches to a project. 
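To make the priority semantics concrete, here is a small, standard-library-only sketch of how priority values order batches in the labeling queue (1 = highest, 5 = lowest). The batch names are hypothetical and the real queueing happens server-side; batches can also be created programmatically via the Python SDK's `Project.create_batch`.

```python
# Hypothetical batches with the priority values described above
# (1 = highest priority, 5 = lowest).
batches = [
    {"name": "edge-cases", "priority": 1},
    {"name": "routine-backlog", "priority": 5},
    {"name": "new-upload", "priority": 3},
]

# Lower priority number = served to labelers first.
queue_order = sorted(batches, key=lambda b: b["priority"])
print([b["name"] for b in queue_order])
# -> ['edge-cases', 'new-upload', 'routine-backlog']
```

In other words, data rows from a priority-1 batch are surfaced to labelers before those from priority-3 and priority-5 batches.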
## How to create a batch by sampling Sampling is a useful technique for creating a batch when you are working with a large amount of data. Instead of manually selecting data rows, you can use sampling to create a batch of a specific size. 1. In the **Catalog**, use the filters to find the data rows you want to sample from. 2. Click the **Sample** button in the top right corner. 3. In the batch creation window, you can choose how many data rows to sample and the sampling method. Labelbox supports two sampling methods: * **Random**: This method will randomly select the specified number of data rows from your filtered results. * **Ordered**: This method will select the specified number of data rows based on the sorted order of the results. You can sort the results by the **Created At** timestamp in ascending or descending order. The sequence generated by ordered sampling does not influence the sequence of the data rows in the labeling queue. 4. Fill out the project, batch name, and priority details. 5. Click **Submit batch**. ## How to send predictions as pre-labels To speed up the labeling process, you can include model predictions as pre-labels in your batch. 1. In the batch creation window, toggle the **Include predictions** option on. 2. From the dropdown menu, select the model run and the specific predictions you want to include in the batch. 3. Confirm that the model run is compatible with the labeling project. For the predictions to be compatible, the features in the model run ontology and the labeling project ontology must be the same. If the model run and the labeling project do not share any features, you will not be able to send the predictions as pre-labels. 4. Click **Submit batch**. ## Important considerations * **Media type compatibility**: You can only send a batch to a project that has the same media type. For example, you cannot send a batch of text data to a project that is set up for labeling images. 
* **Appending to batches**: Once a batch has been sent to a project, you cannot add more data rows to it. * **Data rows in multiple batches**: A data row cannot be in more than one batch in the same project at the same time. * **Sharing batches**: A batch cannot be shared between projects. However, you can create a new batch with the same data rows and send it to a different project. # Benchmark Source: https://docs.labelbox.com/docs/benchmark Learn how to set up benchmarks to analyze the quality of your labels. Benchmarks serve as the gold standard for other labels. You can designate specific data rows with annotations as benchmarks, and all other annotations on these data rows are automatically compared to these benchmark reference labels to calculate benchmark agreement scores. ### Benchmarks in the queue Labelbox ensures that the first five data rows in your labeling queue are benchmark labels. After these initial five, the order of the remaining benchmarks is not guaranteed. Beyond the first five, there is a 10% chance that any subsequent data row you encounter will be a benchmark label. If you have fewer than five benchmarks, they are the first data rows in your queue, and no additional benchmarks appear unless you add more. 
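The queue behavior described above can be sketched in a few lines (an illustrative simulation, not Labelbox's actual scheduler; `build_label_queue` is a hypothetical helper):

```python
import random

def build_label_queue(benchmarks, regular, seed=42):
    """Serve up to five benchmarks first; afterwards each remaining benchmark
    has roughly a 10% chance of being served before the next regular row."""
    rng = random.Random(seed)
    pending = list(benchmarks)
    queue, pending = pending[:5], pending[5:]
    for row in regular:
        if pending and rng.random() < 0.10:
            queue.append(pending.pop(0))
        queue.append(row)
    return queue

queue = build_label_queue([f"bench-{i}" for i in range(8)],
                          [f"row-{i}" for i in range(50)])
assert queue[:5] == [f"bench-{i}" for i in range(5)]  # first five are benchmarks
```

With fewer than five benchmarks, the `pending[:5]` slice simply serves them all up front, matching the behavior described above.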
## Supported types Currently, benchmarking is supported for the following asset and annotation types: | Asset type | Bounding box | Polygon | Polyline | Point | Segmentation mask | Entity | Radio | Checklist | | ------------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | | Images | | | | | | N/A | | | | Videos | | - | | | - | N/A | | | | Audio | N/A | N/A | N/A | N/A | N/A | N/A | | | | Text | N/A | N/A | N/A | N/A | N/A | | | | | Tiled imagery | | | | | - | N/A | | | | Documents | | N/A | N/A | N/A | N/A | | | | | HTML | N/A | N/A | N/A | N/A | N/A | N/A | | | | Conversational text | N/A | N/A | N/A | N/A | N/A | | | | | Human-generated responses | N/A | N/A | N/A | N/A | N/A | N/A | | | ## Set up benchmarks To set an individual data row as a benchmark reference: 1. In the editor, label the data row and click **Submit**. 2. Navigate back to the project homepage. 3. Go to the **Data rows** tab. 4. Select the labeled data row from the list. This will open the Data row browser. 5. Click on the three dots next to the data row and select **Add as benchmark** from the dropdown. To bulk-assign labeled data rows as benchmark references: 1. On the **Data rows** tab, select the data rows that you want to set as benchmarks. 2. Click the **selected** dropdown and select **Assign labels as benchmarks**. 3. If you have permissions to access benchmark scores, you can select from the following two options: * **Infer automatically**: Set the labels created within the benchmark quality mode as benchmark references. If the project also uses consensus as the quality mode, this option also sets consensus winners as benchmark references. * **Target label by score**: Select or set a range of benchmark scores to set labeled data rows matching the filter as benchmark references. 
If you don't have permissions to access benchmark scores or if no benchmark scores are available, you can only see and use the **Infer automatically** option. 4. Click **Submit**. Once a label is designated as a benchmark, the data row is automatically moved to **Done**. Benchmarked data rows will be served to all labelers in a project. Benchmarked data rows can't be moved to any other step unless the benchmark is removed. For a benchmark agreement to be calculated, one benchmark-reference label and at least one non-benchmark label need to be on the data row. Whenever annotations are added or updated, the benchmark agreement is recalculated as long as at least one non-benchmark label exists on the data row. ## Search and filter data using benchmark scores The **Benchmark agreement** filter helps you find qualified data rows based on benchmark scores. You can apply this filter in the following locations: * The [Data Rows](/docs/data-rows-activity) tab * The [Workflow](/docs/workflows) tab * The [Catalog](/docs/search) page When using the filter, you can configure the following options: * **Scope**: Specify the type of agreement to measure: * **Feature-level** measures the alignment between annotators' labels and the predefined benchmark reference labels for each data row. If you select this option, further specify one or more feature schemas in the ontology using the dropdown menu. * **Label-level** evaluates the overall agreement of all annotations within a single data row compared to the benchmark reference label. * **Calculation**: Choose whether to calculate the agreement as an absolute or average score. * **Range (0-1)**: Set the score range from 0 to 1, where 0 indicates no agreement with the benchmark reference label and 1 represents complete agreement. # A guide to Labelbox's LBUs Source: https://docs.labelbox.com/docs/billing Learn how to perform common billing tasks and how Labelbox charges work. ## What are Labelbox Units (LBUs)?
A Labelbox Unit (LBU) is a normalized unit of data in the Labelbox platform. Think of it as a credit that you consume as you use Labelbox's products. Each LBU represents a specific amount of work done, which can involve one or more data rows (e.g., images, text files, videos). Since each of Labelbox's products—Catalog, Annotate, and Model—is designed for different tasks, the way you consume LBUs will vary depending on the product you're using, the type and amount of data you're working with, and the specific actions you take. ## How LBU consumption is calculated The table below shows how many LBUs are consumed for different types of data across the three Labelbox products. | Data row asset type | Catalog (monthly) | Annotate (one-time) | Model (one-time) | | ----------------------------------------- | ----------------------------------------------- | ----------------------------------------- | ---------------------------------------------- | | Image | 1 LBU per 60 data rows | 1 LBU per data row | 1 LBU per 5 data rows | | Text | 1 LBU per 60 data rows | 1 LBU per data row | 1 LBU per 5 data rows | | Chat (Conversation) & Offline Multi Modal | 1 LBU per 60 data rows | 1 LBU per data row | 1 LBU per 5 data rows | | Audio | 1 LBU per 60 data rows | 1 LBU per data row | 1 LBU per 5 data rows | | Document (PDF), per page | 1 LBU per 60 data rows + 1 LBU per 60 pages | 1 LBU per page | 1 LBU per 5 pages | | Video | 1 LBU per 60 data rows + 1 LBU per 5,000 frames | 1 LBU per data row + 1 LBU per 150 frames | 1 LBU per 5 data rows + 1 LBU per 1,500 frames | | Geospatial & Medical Tiled Imagery | 1 LBU per 6 data rows | 4 LBU per data row | 1 LBU per data row | | Live Multi Modal/LLM | 1 LBU per 60 data rows | 20 LBU per data row | 1 LBU per 5 data rows | Free accounts have 500 free LBU credits each month. If you reach this limit, you can access and export your data. However, you cannot add data rows, labels, or predictions until the next billing period. 
(You can [**upgrade your account**](https://docs.labelbox.com/docs/update-account-plan#upgrade-current-plan) at any time.) ## How LBUs are consumed by each product Different actions within Labelbox consume LBUs. Here's how consumption works for each Labelbox product: ### Catalog consumption In **Catalog**, you consume LBUs when you upload and store data, regardless of the data type. To help you avoid accidental charges, there is a *seven-day grace period* that starts when you first upload data. If you remove any unused data rows before this grace period ends, you won't be charged any LBUs for them. However, if you take any action that adds a label to a data row (like annotating it), that row will consume LBUs, even if it's within the grace period. Once the seven-day grace period is over, all data rows in your Catalog will start consuming LBUs. Here’s an example: * You upload 600 images to a dataset. * For the first 7 days, no LBUs are consumed. You can delete any of these images without being charged. * If you label 60 of those images during the grace period, you will consume 1 Catalog LBU for the current billing period, even if you delete them later. * Once the grace period ends, the remaining 540 data rows will consume 9 LBUs for the current billing period. * If you then decide to delete 300 rows after the grace period, you will still be charged a total of 10 Catalog LBUs for the current billing period. In the next billing period, your LBU consumption will drop to 5 LBUs to reflect the new data row count. ### Annotate consumption In **Annotate**, you consume LBUs when you submit labels to a project. These are *one-time charges*. For videos, LBUs are consumed for the entire video as a single unit, no matter how many frames you label. If you label just one frame, the entire video is charged once. If you go back and label more frames in the same video later, you won't be charged any additional LBUs. 
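The Catalog grace-period example above reduces to simple arithmetic. A minimal sketch (assuming fractional LBU counts round up to whole LBUs, which the docs don't state explicitly; `catalog_lbus` is a hypothetical helper):

```python
import math

def catalog_lbus(data_rows: int, rows_per_lbu: int = 60) -> int:
    """Monthly Catalog charge for image data rows: 1 LBU per 60 data rows."""
    return math.ceil(data_rows / rows_per_lbu)

# Upload 600 images; label 60 of them during the 7-day grace period.
in_grace = catalog_lbus(60)       # labeled rows are charged despite the grace period
after_grace = catalog_lbus(540)   # remaining rows once the grace period ends
first_period = in_grace + after_grace   # total for the current billing period
next_period = catalog_lbus(300)   # after deleting 300 rows post-grace
print(in_grace, after_grace, first_period, next_period)  # 1 9 10 5
```

These values match the worked example: 1 LBU for the rows labeled in grace, 9 LBUs after grace, 10 LBUs total for the period, and 5 LBUs the next period.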
**Skipped data rows** For Annotate projects, "skipped" data rows are considered labeled and will contribute to LBU consumption. ### Model consumption In **Model**, you consume LBUs when you do any of the following: * Add data rows to a model run. * Use Foundry to create and submit model predictions. * Use data rows for model-assisted labeling (MAL), either through the app or the SDK. These are *one-time charges* for each data row. Once a data row has been used in Model and consumed LBUs, you can use it again for other Model tasks without incurring additional LBU charges. However, new data rows will consume additional Model LBUs. Be aware that Foundry may incur additional inference costs based on the number of rows processed, the complexity of the task, the model used, and other factors. To learn more, see the **Charges for add-on services** section below. ## Examples of LBU consumption Here are a couple of examples to illustrate how LBU consumption works in common workflows. **Example 1: Data labeling with model-assisted pre-labels** Let's say you need to label 1,000 images, and you use Foundry to generate pre-labels for half of them (500 images). Here's how you would be charged: * **Catalog**: 17 LBUs per month to store the 1,000 data rows. * **Model**: 100 LBUs (one-time charge) for generating predictions on 500 of the data rows. * **Annotate**: 1,000 LBUs (one-time charge) for the 1,000 labeled data rows. If this activity occurred within a single billing period, you would be charged 1,117 LBU once and 17 LBU each month the data rows remain in Catalog. **Example 2: Data curation and labeling with enrichment** Now, imagine you have 60,000 images that all have Foundry predictions, and you end up labeling 10,000 of them. Here’s the breakdown: * **Catalog**: 1,000 LBUs per month to store the 60,000 data rows (or 12,000 LBU annually). * **Model**: 12,000 LBUs (one-time charge) for the 60,000 data rows with predictions.
* **Annotate**: 10,000 LBUs (one-time charge) for the 10,000 labeled data rows. If this activity occurred within the same billing period, you would be charged 23,000 LBU once and 1,000 LBU each month the data remains in **Catalog**. ## Charges for add-on services Your account may also be billed for add-on services. These charges are billed separately from your subscriptions and may require you to add a credit card to your account. Available add-ons include: * **Foundry**: This service uses foundational models to predict labels and enrich your data. When you use Foundry, you will incur *inference costs*, which are based on the amount of data processed, the model used, and the complexity of the task. * **Labeling services**: You can connect with a team of professional labelers with specialized knowledge to annotate and review large amounts of data for you. * **Single sign-on (SSO)**: This allows Enterprise teams to use their company's single sign-on service to sign in to Labelbox. * **HIPAA compliance**: This helps Enterprise subscribers comply with the Health Insurance Portability and Accountability Act of 1996. If you are an Enterprise customer, you should contact your technical account manager to verify the terms, payment details, and other information about your add-ons. ## Where can I find more information? See [Plans and pricing](https://labelbox.com/pricing/) for current pricing, an LBU estimate calculator, and frequently asked questions. # Bulk classification for zero-shot learning Source: https://docs.labelbox.com/docs/bulk-classification You can use bulk classification in Labelbox to perform zero-shot learning. After curating a subset of data that shares common traits, you can classify them all at once, directly from the Catalog. ## What is zero-shot learning? Zero-shot learning (ZSL) is a machine learning technique that allows an AI model to recognize and categorize objects or concepts it has never seen before. 
In a typical supervised learning model, an AI is trained on a large dataset of labeled examples. For instance, a model might be trained on thousands of images of cats and dogs, with each image labeled as either "cat" or "dog." The model then learns to distinguish between cats and dogs based on the patterns it observes in the training data. Zero-shot learning, on the other hand, does not require any labeled examples of the categories it is being asked to classify. Instead, it relies on auxiliary information, such as textual descriptions or attributes, to make predictions. For example, a zero-shot learning model could be told that a "zebra" is a "striped horse." Even if the model has never seen a zebra before, it can use its existing knowledge of horses and stripes to recognize a zebra in an image. Labelbox's bulk classification using zero-shot learning is a great way to speed up your classification projects, as you can quickly generate classifications without having to label each asset one-by-one. This allows you to integrate off-the-shelf neural networks as zero-shot classifiers. ## How to bulk classify data Here is a step-by-step guide on how to bulk classify data in Labelbox. 1. **Select a subset of data to classify:** Use the search features in Labelbox to select a subset of data. You can select the top results of a natural language search or all assets that look similar to each other using similarity search. 2. **Add a classification:** Click on the number of items selected in the top right corner and choose **Add classification**. Add Classification 3. **Pick the destination labeling project:** Select the destination labeling project from the dropdown menu. Only projects with a global classification question in their ontology will appear. 4. **Provide classification values:** Answer the classification questions that appear. These values will apply to all data rows in the bulk classification job. 
You must answer all required classifications and subclassifications. You can search for a classification question by typing its name. 5. **Specify the workflow step:** Choose which step of the labeling and review workflow the data rows should be sent to. For example, if you pick the **Initial labeling task**, then the classifications will be sent as [*pre-labels*](/docs/model-assisted-labeling). If you pick any other task - such as **Rework**, **Initial review task**, or **Done** - then the classifications will be sent as [*labels*](/docs/import-ground-truth). 6. **Include or exclude data rows that already have labels:** If a data row already has a label in the destination project, you can either *overwrite* the existing label or *exclude* the data row from the job. * **Overwrite previous value with new value**: Only classification questions that you answer -- as part of the bulk classification job -- will overwrite already-existing classifications. * **Exclude from job**: Excludes these data rows from the bulk classification job to preserve the already-existing labels. 7. **Submit the bulk classification:** Click **Submit batch** to launch the job. It may take a few moments for the data to be sent to the labeling project. 8. **Track the progress:** You can track the progress of the job in the notification panel and will be notified by a pop-up message when it's complete. ## Limitations There are a few limitations to keep in mind when using bulk classification: * There are limits to the number of data rows you can classify at once. You can work around this by classifying data in chunks. For example, you can leverage the **Annotation** filter with the attributes **is not one of** and \[your classification question] to filter out already classified data rows and proceed with the next bulk classification job. * You cannot use bulk classification with a consensus project. * Benchmark data rows are excluded from bulk classification jobs. 
Since benchmark labels serve as a gold standard, they should not be overwritten by bulk classifications. The data rows are sent to the labeling project as a [batch](/docs/batches). You can edit the batch name in the top-left corner, as well as set the data row priority in the **Advanced settings**. # Search and view compatibility Source: https://docs.labelbox.com/docs/catalog-search-compatibility A summary of search and view capabilities available in Catalog by data type. Below are the search filters that Catalog supports out of the box. You can extend the usability of these core offerings by uploading your own [custom metadata](/docs/datarow-metadata#custom-fields) and [embeddings](/docs/similarity#supported-embeddings), then combining them with the supported filters below to accomplish your search and curation objectives. ## Supported search filters (basic) To learn more about the supported filters, see [Filters](/docs/search). | Asset type | Annotation | Dataset | Metadata | Project | Media attribute | Batch | Data row | | ------------------ | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | | **Image** | | | | | | | | | **Video** | | | | | | | | | **Text** | | | | | Only mime type | | | | **HTML** | | | | | Only mime type | | | | **Document** | | | | | Only mime type | | | | **Tiled imagery** | | | | | Only mime type | | | | **Audio** | | | | | | | | | **Conversational** | | | | | Only mime type | | | ### Enrich your data with custom metadata As noted above, Labelbox supports metadata on any data type. To learn how to use custom metadata and these Catalog filters to enrich your data, read this blog post on [how to make your data queryable using foundation models](https://labelbox.com/blog/make-videos-queryable-using-foundation-models/).
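To make the filter model concrete, here is an illustrative, self-contained sketch of how metadata key/value filters narrow a set of data rows (Catalog evaluates filters server-side; the sample rows and the `filter_rows` helper below are hypothetical):

```python
# Hypothetical data rows carrying custom metadata, as you might upload to Catalog.
data_rows = [
    {"id": "dr-1", "media": "image", "metadata": {"split": "train", "city": "Oslo"}},
    {"id": "dr-2", "media": "image", "metadata": {"split": "test", "city": "Oslo"}},
    {"id": "dr-3", "media": "video", "metadata": {"split": "train"}},
]

def filter_rows(rows, media_type=None, **metadata):
    """Keep rows matching the media type and every metadata key/value given."""
    out = []
    for row in rows:
        if media_type and row["media"] != media_type:
            continue
        if all(row["metadata"].get(k) == v for k, v in metadata.items()):
            out.append(row)
    return out

matches = filter_rows(data_rows, media_type="image", split="train")
print([r["id"] for r in matches])  # ['dr-1']
```

Combining a media-type filter with metadata conditions, as above, mirrors how you would stack Catalog filters to curate a subset for labeling.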
## Supported search filters (advanced) | Asset type | [Find text](/docs/find-text) | [Natural language](/docs/natural-language-search) | | ------------------ | ---------------------------- | ------------------------------------------------- | | **Image** | - | \* | | **Video** | - | (beta, add-on feature) | | **Text** | | \*\* | | **HTML** | | \*\* | | **Document** | | \*\*\* | | **Tiled imagery** | - | \* | | **Audio** | - | - | | **Conversational** | | \*\* | \* Uses the off-the-shelf [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) vision model (512 dimensions). \*\* Uses the off-the-shelf [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) text model (768 dimensions), based on the first 64k characters. \*\*\* Users can pick between off-the-shelf [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) vision model (512 dimensions) and [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) text model (768 dimensions, based on the first 64k characters). ## Similarity (embeddings) | Asset type | Off-the-shelf embeddings | Custom embeddings | | ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- | | **Image** | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **Video** | [Google Gemini Pro Vision](https://ai.google.dev/gemini-api/docs/models/gemini) . First two (2) minutes of content is embedded. Audio signal is not used currently. This is a paid add-on feature available upon request. | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. 
| | **Text** | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **HTML** | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **Document** | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) and [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **Tiled imagery** | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **Audio** | Audio is transcribed to text. [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | | **Conversational** | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) | Up to 2048 dimensions per embedding; up to 100 custom embeddings per workspace. | ### Enhance similarity search with custom embeddings As noted above, Labelbox supports custom embeddings on any data type. Powerful embeddings can be generated using foundational models and easily uploaded to Labelbox. You can then use these embeddings in combination with any of the above filters to accomplish your data search goals. For an example of how to get started, check out this guide on [how to generate custom embeddings using foundational models and upload them to Labelbox](https://labelbox.com/guides/using-labelbox-and-hugging-face-to-generate-custom-embeddings-and-curate-impactful-data/).
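Similarity search ultimately compares embedding vectors, and cosine similarity is the standard measure for that. A minimal illustrative sketch (not Labelbox's internal implementation; real embeddings have hundreds to 2,048 dimensions, kept short here for readability):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors: 1.0 means the same
    direction (very similar), 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.4]
print(cosine_similarity(query, [1.8, 0.2, 0.8]))  # close to 1.0: a scaled copy
```

Custom embeddings uploaded to Labelbox plug into the same idea: data rows whose vectors lie nearest the query surface as the most similar results.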
## Options for viewing your data | Asset type | Thumbnail view | Detail view | Annotations overlay (thumbnail) | Annotations overlay (detail) | | ------------------ | --------------------- | --------------------- | ------------------------------- | ---------------------------- | | **Image** | | | | | | **Video** | | | - | Classifications only | | **Text** | | | | | | **HTML** | | | - | | | **Document** | | | Classifications only | Classifications only | | **Tiled imagery** | | | | | | **Audio** | - | - | - | | | **Conversational** | | | | | # Import data Source: https://docs.labelbox.com/docs/connect-to-cloud-storage Learn how to import your data by connecting your cloud storage to Labelbox. ## Overview To start labeling your data, you first need to grant our platform secure access to the files stored in your private cloud (AWS, GCP, or Azure). This guide explains the two methods for connecting your data, helping you choose the best one for your project's security and workflow needs. The two methods are * **IAM Delegated Access**: A robust, long-term connection method. * **Signed URLs**: A flexible method using temporary, secure links to your data. **Our Recommendation**: For most use cases, especially long-term projects, we recommend **IAM Delegated Access** for its superior security and lower maintenance. | Feature | IAM Delegated Access | Signed URLs | | ---------------- | ------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- | | Setup complexity | **High.** Requires a one-time configuration within your cloud provider's IAM console. | **Low.** No initial cloud configuration is needed in Labelbox. | | Maintenance | **Low.** "Set it and forget it." Works for all data in the configured location. | **High.** Requires a continuously running service on your end to generate new URLs. 
| | Data Freshness | **Real-time.** New data added to your bucket is immediately available for labeling. | **Delayed.** New data requires new signed URLs to be generated and uploaded to Labelbox. | | Ideal for | Long-term projects, enterprise-scale data operations, and stringent security environments. | Quick-start projects, proof-of-concepts, or when you cannot create IAM roles. | *** ## IAM delegated access IAM (Identity and Access Management) Delegated Access is the most secure and scalable method for connecting your data. You create a trust relationship by setting up a dedicated role within your own cloud account that Labelbox is permitted to assume. This gives Labelbox temporary, read-only credentials to access your data when your users are labeling. ### How it works 1. **You**: Create an IAM role in your AWS, GCP, or Azure account that has read-only permissions to your data bucket. 2. **You**: Provide Labelbox with the unique identifier (ARN/ID) of that role. 3. **Labelbox**: When a user needs to view an image or document, Labelbox uses the provided identifier to request temporary access credentials from your cloud provider. 4. **Your Cloud Provider**: Validates the request and grants Labelbox a short-lived token to access only the specified data. ### Key advantages * **Superior Security**: Your secret keys are never shared with Labelbox. Access is easily auditable and can be revoked at any time from your cloud console. * **Low Maintenance**: After the initial setup, you never have to worry about managing credentials or access again. It just works. * **Simplified Workflow**: Data scientists and annotators can easily browse and import data without needing to handle URLs. ### Step-by-step guides *** ## Signed URLs A signed URL is a web link that provides temporary access to a specific file in your storage bucket. Each URL is "signed" with cryptographic keys that validate the request and expire after a set time (e.g., 7 days). 
You are responsible for generating these URLs and providing them to Labelbox. ### How it works 1. **You**: Write and run a script or service that generates a unique signed URL for each data asset you want to label. 2. **You**: Create a JSON file containing these URLs and upload it to Labelbox. 3. **Labelbox**: When a user accesses a task, Labelbox uses the corresponding signed URL from your JSON file to fetch and display the data. 4. **Your Cloud Provider**: Validates the signature on the URL and serves the file. Access is denied if the URL has expired. ### Key advantages * **Fast to Start**: Bypasses the need for complex IAM configuration, making it ideal for quick tests or proof-of-concepts. * **Granular Control**: You have explicit, file-level control over what data is accessible and for how long. ### Step-by-step guide *** ## Automate imports with the Python SDK Labelbox provides a [Python SDK](/reference/install-python-sdk) to help automate data setup. You can download sample code from the app or use the online docs to learn more. To download samples from the app: 1. From the dataset default screen, select **Use Python SDK**. 2. From the **Create data rows** prompt, select the tab appropriate for your data type. 3. Use the **Copy** button to copy the code to the clipboard or the **Download** button to save it locally. Once you have a copy of the sample script, you can customize it for your needs. More information is available for each supported data type, including: *** ## Upload local files To upload local files directly to the Labelbox platform, go to **Catalog**, click **+New**, then select **Choose files to upload**. **Direct upload not recommended** Uploading your files to Labelbox is NOT recommended. We recommend using IAM delegated access or Signed URLs instead (see sections above). # Consensus Source: https://docs.labelbox.com/docs/consensus Learn how to set up consensus scoring to analyze the quality of your labels.
Consensus represents the agreement among your labeling workforce. Consensus agreement scores are calculated in real-time for features and labels with multiple annotations by different labelers. Whenever an annotation is created, updated, or deleted, the consensus score is recalculated for data rows with two or more labels. ## Supported types Currently, consensus scoring is supported for the following asset and annotation types: | Asset type | Bounding box | Polygon | Polyline | Point | Segmentation mask | Entity | Relationship | Radio | Checklist | Free-form text | | ------------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | | Image | | | | | | N/A | N/A | | | - | | Video | | - | | | - | N/A | N/A | | | - | | Text | N/A | N/A | N/A | N/A | N/A | | - | | | - | | Chat | N/A | N/A | N/A | N/A | N/A | | - | | | - | | Audio | N/A | N/A | N/A | N/A | N/A | N/A | N/A | | | - | | Geospatial | | | | | N/A | | N/A | | | - | | Documents | | N/A | N/A | N/A | N/A | N/A | N/A | | N/A | - | | HTML | N/A | N/A | N/A | N/A | N/A | N/A | | | N/A | - | | Human-generated responses | N/A | N/A | N/A | N/A | N/A | N/A | | | N/A | | ## Set up consensus scoring When adding data rows to an **Annotate** project, use the **Queue batch** option to enable consensus scoring and configure additional settings. You can't change these settings after submission. | Consensus setting | Description | | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | | **Data row priority** | The position in the labeling queue where these data rows will be slotted, based on priority.
| | **% coverage** | The percentage of the data rows in the batch that will enter the labeling queue as consensus data rows for multi-labeling. Defaults to 0. | | **# labels** | The number of labels to collect for each consensus data row. Defaults to 2. Must be less than or equal to the number of labelers on the project. | ### Consensus calculation can take up to five minutes ## Select consensus winners After a data row is labeled and enters the review stage, the first set of annotations entered for a data row represents consensus by default. Reviewers can reassign consensus to another set of annotations once the data row has more than one label. If your data row has been labeled more than once, you'll view all of the label entries on that data row in the data row browser. The following example shows a data row with two sets of labels. The green trophy icon indicates that the first set of annotations is considered "consensus." To change consensus, click the trophy icon next to the preferred annotations. ### Recalculation of consensus agreement scores The consensus score reflects agreement among labelers, so changing the winning label might lead to a recalculation of the score based on the new consensus. ### Set consensus winners as benchmark references You can designate consensus winners as benchmarks. See [Set up benchmarks](/docs/benchmark#set-up-benchmarks). ## Search and filter data using consensus scores The **Consensus agreement** filter helps you find qualified data rows based on consensus scores. You can apply this filter in the following locations: * The [Data Rows](/docs/data-rows-activity) tab * The [Workflow](/docs/workflows) tab * The [Catalog](/docs/search) page When using the filter, you can configure the following options: * **Scope**: Specify the type of agreement to measure: * **Feature-level** measures the agreement on a specific feature schema in the ontology for each data row. 
If you select this option, further specify one or more feature schemas in the ontology using the dropdown menu. * **Label-level** evaluates the overall agreement across all annotations within a single data row. * **Calculation**: Choose whether to calculate the agreement as an absolute or average score. * **Range (0-1)**: Set the score range from 0 to 1, where 0 indicates no agreement among annotators and 1 indicates complete agreement. # Customer support Source: https://docs.labelbox.com/docs/contacting-customer-support How to work with Labelbox support to report issues and feedback. The Labelbox Support Team is available to help you troubleshoot issues and get the most out of the platform. ## Support availability Our support team is available from 9:00 AM to 8:00 PM Eastern Time, Monday through Friday, excluding major U.S. holidays. ## Status page Before submitting a support ticket, please check our [Status page](https://status.labelbox.com/ "https://status.labelbox.com/") to see if there are any ongoing incidents or scheduled maintenance that might be affecting the platform. This can often provide immediate answers and updates regarding widespread issues. ## Labelbox support bot Our AI-powered support bot in the Labelbox platform is your first stop for quick questions and information retrieval. We recommend using the bot to scan our documentation, look up SDK functions, or learn about different data export methods. To get the most accurate answers, it's best to ask clear and specific questions. For example, instead of asking a general question like "How do I script?", try asking something more precise, such as "What is the SDK function to create a new project?" Well-formulated questions will help the bot give you the most insightful response. Additionally, you can use the **Ask a question** chat bot from anywhere in the *Labelbox docs*. 
This AI-powered chat bot allows you to quickly retrieve specific content from our docs (e.g., SDK functions, export methods, etc.). ## How to file a support ticket To file a support ticket, you'll need to use the Labelbox support portal. Access to the portal is by invitation only. If you don't have an account, please contact your sales representative. To help our support team resolve your issue as quickly as possible, please include the following information in your ticket: | Field | Description | | ---------------------- | ----------- | | On behalf of | The email address entered in this field should belong to the primary stakeholder of the concern being reported. The ticket will be associated with the Jira portal identity for this user and the email address should be associated with a Labelbox account. Other users can be added as `Request participants` after the creation of the ticket in order to view and participate in the thread. | | Affected email address | The email address(es) entered in this field should represent the Labelbox account(s) in which the relevant issue is occurring. Typically, the entry here will be the same as the above field, but this is not always the case when reporting a team-related concern. 
| | Issue urgency | Select the urgency level that best describes your issue.
  • **Critical**: *Labelbox will not load or I cannot log in.* (Example: Labelbox will not load for any team members; team members are unable to log in).
  • **High**: *I'm mostly blocked from using Labelbox.* (Example: I cannot access a specific feature or functionality of Labelbox).
  • **Medium**: *I'm finding it difficult to use Labelbox.* (Example: The pen tool is not working properly for a large number of team members).
  • **Low**: *I'm running into minor issues using Labelbox.* (Example: The usage report on the Account page is not working properly).
  • **No Urgency**: Utilize this option for general questions or any concern that is not a bug.
| Additionally, you will need to include the following information in your ticket: ### Workspace ID Click your initials in the left panel, select **Workspace settings**, and then copy the **Workspace ID**. ### Project ID Open a project and copy the ID from the URL. The Project ID will be the string of characters after `projects/` and before the next `/`. ### Dataset ID Go to **Catalog**, select your dataset, and copy the ID from the URL. The Dataset ID will be the string of characters after `catalog/dataset/`. ### Data Row ID and Label ID Go to **Catalog**, click on a data row to open the detailed view, and find the Data row ID on the right side of the panel. You can also find the data row ID and label ID by selecting a project and navigating to the **Data Rows** tab. Click on any data row to open the **Data browser**. * In the URL, the data row ID will be located immediately after `data-rows/` and span until the next `/` in the URL (*in red*), followed by the label ID (*in green*). * The label ID will appear in the left panel (*in green*). Data rows that have not been labeled will not have a label ID. Asset ID and Data row ID are used synonymously and refer to the Labelbox-generated IDs assigned to each data row in an organization. A Label ID is only generated once annotations have been submitted on a data row. Keep in mind, when utilizing Benchmarks or Consensus, there may be more than one Label ID associated with a single data row. ## How to view your support tickets You can view your support tickets in two ways: 1. **From your email:** After you create a ticket, the email address entered in the **on behalf of** field will receive an email from Labelbox's Customer Help Portal. The subject of the email will follow the format `LS-#### Summary`, where `Summary` is the summary you entered when filing the ticket. When you receive the email, click on **View request** to open the ticket in the Customer Help Portal. 
If someone else raised the ticket on your behalf and you have never used the portal, you may need to complete a brief sign-up procedure. 2. **From the support portal:** Log in to the customer support portal and click **Requests** in the top right corner to see all tickets created by you and your team. Once here, you can filter your tickets, monitor their status, and select a ticket directly from the portal. ## Best practices for getting a fast resolution Here are some tips to help us resolve your tickets as quickly as possible: * **Include attachments:** Screenshots, screen recordings, JSON files, Python snippets, or any other resources that provide context are always helpful. * **Create separate tickets for separate issues:** If you have multiple, unrelated questions and concerns, please open separate tickets. When possible, please refrain from raising a new issue on an existing ticket. Keeping issues separate helps us reach resolutions faster as a team and allows us to better monitor trends so that we can continue to improve our Support overall. * **Comment to re-open a ticket:** If a ticket has been closed by the Support team and the same issue emerges -- or perhaps you have more feedback to add -- you can always comment on the ticket and it will automatically re-open. * **Use the customer support portal:** Communication on tickets will be clearer, more efficient, and easier to digest on the Jira portal compared to over email. In addition, you can use the portal to view all tickets raised by your team, which can illuminate trends and enable the consolidation of your efforts. # Conversational text Source: https://docs.labelbox.com/docs/conversational-editor Guide for conversation or thread-based text data. When creating a project, select **Conversations**.

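Before setting up the import, it can help to see the shape of a conversational text asset. The sketch below is a minimal, hedged example assembled from fields mentioned in this guide (such as `canLabel`); treat the field names and values as illustrative assumptions and verify them against the conversational text import reference.

```python
import json

# Minimal sketch of a conversational text asset. Field names (messageId,
# canLabel, etc.) follow the ones referenced in this guide, but the
# authoritative schema is the conversational text import reference.
conversation = {
    "type": "application/vnd.labelbox.conversational",  # assumed asset type tag
    "version": 1,
    "messages": [
        {
            "messageId": "msg-0",
            "content": "Hi! How can I help you today?",
            "user": {"userId": "agent-1", "name": "Agent"},
            "align": "left",
            "canLabel": False,  # renders with a grey background; not annotatable
        },
        {
            "messageId": "msg-1",
            "content": "I'd like to update my shipping address.",
            "user": {"userId": "customer-7", "name": "Customer"},
            "align": "right",
            "canLabel": True,  # renders with a white background; annotatable
        },
    ],
}

# Serialize the asset; the hosted file's URL becomes the data row's row_data.
payload = json.dumps(conversation, indent=2)
```

Only messages marked `canLabel: true` can receive annotations in the editor, which is why this sketch marks the agent's message as non-labelable.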
## Import conversational text To learn how to import conversational text data, visit our documentation on [importing conversational text](/reference/import-conversational-text-data). ### Data row size limit To view the maximum size allowed for a data row, visit our [limits](/docs/limits) page. ## Supported annotation types Below are all of the annotation types you may include in your ontology when you are labeling conversational or thread data. Classification-type annotations can be applied globally and/or nested within an object-type annotation. | Feature | Import annotations | Export annotations | | --- | --- | --- | | **Entity** | [See payload](/reference/import-conversational-text-annotations#entity-message-based) | [See payload](/reference/export-conversational-text-annotations#text-entity-named-entity) | | **Relationships** | [See payload](/reference/import-conversational-text-annotations#relationship-with-entity-message-based) | [See payload](/reference/export-conversational-text-annotations#relationship) | | **Radio classification** (Global or message-based) | [See payload](/reference/import-conversational-text-annotations#classification-radio-single-choice-message-based) | [See payload](/reference/export-conversational-text-annotations#classification---radio) | | **Checklist classification** (Global) | [See payload](/reference/import-conversational-text-annotations#classification-checklist-multi-choice-message-based) | [See payload](/reference/export-conversational-text-annotations#classification---checklist) | | **Free-form text classification** (Global) | [See payload](/reference/import-conversational-text-annotations#classification-free-form-text-message-based) | [See payload](/reference/export-conversational-text-annotations#classification---free-form-text) | ### Entity To create an entity, choose the entity tool in your ontology and select the text string by clicking the desired starting character and dragging to select a sequence of characters in the unstructured text. Please note that in both the conversational and thread-based UI, the annotator will only be able to label messages that have been marked as `canLabel` in the import file. The messages that can be annotated will have a white background while those that cannot be annotated will have a grey background. ### Message-based radio classifications One unique feature of our conversation editor is the ability to label specific messages in a conversation with a radio classification value. This enables you to annotate messages with values such as intent or user sentiment. To configure your radio classification as a message-based classification, you must configure your radio classification task as a **Frame/pixel-based** classification value during ontology configuration. If this value is not configured correctly, the radio classification will apply to the entire conversation rather than a single message. ### Relationship With annotation relationships, you can create and define relationships between entity annotations in the conversational text editor. You can then use these annotation relationships to consolidate labeling workflows and potentially reduce the number of language models needed. Follow these steps to create a relationship between two entity annotations: 1. In the editor, create two entity annotations. 2. Select the relationship annotation from the Tools menu, then click one of the entity annotations. Move your cursor to the other entity and click it to create the relationship. 3. Add an optional subclassification to the relationship. 
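As a sketch of how a message-based radio classification travels through the import pipeline, the following builds one NDJSON annotation line. The feature name `user_sentiment`, the option `positive`, and the global key are hypothetical; check the exact schema against the import payload links in the table above.

```python
import json
import uuid

# Hedged sketch of a message-based radio annotation in NDJSON import style.
# "user_sentiment" / "positive" are hypothetical ontology names; verify the
# exact field names against the conversational text annotation import docs.
annotation = {
    "uuid": str(uuid.uuid4()),                 # unique ID for this annotation
    "dataRow": {"globalKey": "conversation-001"},
    "name": "user_sentiment",                  # radio classification in the ontology
    "answer": {"name": "positive"},            # the selected radio option
    "messageId": "msg-1",                      # scopes the answer to one message
}

# NDJSON means one JSON object per line.
ndjson_line = json.dumps(annotation)
```

Per the scope discussion above, a classification that is not scoped to a message applies to the entire conversation instead.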
## Text-specific hotkeys | Function | Hotkey | Description | | --- | --- | --- | | Create a relationship | `Option` + Click (source) + Click (target) | Select an entity to be the source of a relationship, then connect it to another entity to be the target. | # Create a model run Source: https://docs.labelbox.com/docs/create-a-model-run This guide will walk you through the process of creating your first model run in Labelbox. By the end of this tutorial, you will have a new model run configured with your model's predictions, ready for analysis. ## Before you start * **Import data rows**: You’ll need a set of data rows to attach the predictions to. If you do not already have a set of data rows in Labelbox, you’ll need to import that data first. * **Create an ontology**: In order to create a model run, you’ll need to specify the ontology (also called taxonomy) that corresponds to the set of predictions. You may want to re-use an ontology that already exists in Labelbox (e.g., an ontology already used for a labeling project). Or, you may want to use an ontology for your model predictions that does not exist in Labelbox yet. In the latter case, you’ll need to create an ontology. ## Step 1: Create an experiment and your first model run An **experiment** is the top-level container for a specific modeling task (e.g., "Detecting Defects in Solar Panels"). Within that experiment, each training iteration is tracked as a **model run**. A model run holds a specific set of predictions and the model configuration used to generate them. You have two primary ways to create an experiment: * **Option A: Start from Catalog** 1. Go to the **Catalog** tab. 2. Use the filters to select the dataset or a subset of data rows you want to use for this experiment. 3. 
Click the **Manage selection** button at the bottom of the screen. 4. Select **New experiment** from the action menu. * **Option B: Start from the Model tab** 1. Navigate to the **Model** tab. 2. Click the **+ Create** button in the top right and select **Experiment**. 3. Select the batch of data rows you wish to include in this experiment. Once you create the experiment, you will be immediately prompted to configure its first model run. Give your model run a descriptive name that will help you identify it later, such as `YOLOv8-baseline-v1` or `ResNet50-initial-training`. Then, specify the ontology to use for the model run. If you are iterating multiple model experiments on a machine learning task, the best practice is to put your model runs under the same experiment. This allows you to visualize and compare the performance of the different model runs. Next, you will see several optional but highly recommended steps for configuring your model run. While you can adjust these settings at any time, configuring them now will significantly streamline your analysis workflow. ### Optional step: Include ground truth labels To automatically calculate performance metrics like precision, recall, and IoU, Labelbox needs to compare your model's predictions against a "source of truth". This step allows you to link your model run to a Labelbox Project that contains your ground truth annotations. * **Why this is important:** Without ground truth, you can visualize your model's predictions, but you cannot quantitatively score its performance or identify where it is correct or incorrect. * **How to do it:** Simply select the project containing the relevant, reviewed labels from the dropdown menu. If you don't have labeled data yet or if it resides in a different project, you can skip this for now and link it later from the model run settings. 
### Optional step: Include existing model predictions If you have already generated a set of predictions from your model, you can associate them with this run immediately. * **Why this is important:** This step populates your model run with your model's output, making it ready for analysis as soon as you finish the setup. * **How to do it:** The primary method for uploading predictions is via our Python SDK. In this initial setup screen, you can select an existing upload or choose to upload them in the next step. The detailed, step-by-step guide for generating and uploading this payload is covered in the next section of this tutorial. ### Optional step: Define data splits A fundamental practice in machine learning is to segment your data into `Training`, `Validation`, and `Test` sets. This helps you evaluate your model's ability to generalize to new, unseen data. * **Why this is important:** Creating splits allows you to analyze your model's performance on your validation or test data separately from the data it was trained on. This is critical for diagnosing overfitting and ensuring your model will perform well in the real world. * **How to do it:** You have two flexible options for creating splits: 1. **Split by percentage:** Easily divide your data by specifying a percentage for each split (e.g., 80% training, 10% validation, 10% test). Labelbox will handle the random assignment of data rows. 2. **Use existing slices:** For more control, you can assign pre-existing Slices to your splits. This is useful if you have specific, curated datasets you want to use for validation or testing. After completing these optional steps, click **Create model run**. You have now created a fully configured structure for your experiment. The next step is to populate it with your model's outputs. ## Step 2: Upload predictions to the model run This is the most critical step. Here, you will upload your model's predictions to the model run you just created. 
This allows Labelbox to visualize your model's outputs against the ground truth labels and calculate performance metrics. The most powerful and flexible way to upload predictions is by using our Python SDK. ### Conceptual overview The process involves formatting your predictions into a specific structure that Labelbox can understand and then using an SDK command to upload them. Each prediction must be linked to a specific `Data Row ID` to ensure it is matched with the correct source media (image, text, or video). Your predictions can be simple (e.g., a bounding box and a class name) or they can include optional information like confidence scores, which unlock more powerful analysis like building precision-recall curves. ### Step-by-step guides For detailed instructions and code examples for uploading predictions to your model via the Python SDK, please visit these pages: ## Step 3: Update your model run config file For your experiments to be scientific and reproducible, you must keep track of what changed between each model run. Labelbox allows you to store a configuration file (as JSON) with every run. This is the perfect place to log hyperparameters, model versions, or data preprocessing steps. Steps to edit the model run's configuration file: 1. Go to your **Model Run** page. 2. Click the Settings icon and select **Model run config**. 3. Edit the JSON file to include the hyperparameters for this model run. Example configuration: ```json theme={null} { "model_architecture": "YOLOv8-large", "training_epochs": 150, "learning_rate": 0.001, "optimizer": "Adam", "image_size": "640x640", "data_augmentation": { "horizontal_flip": true, "rotation_range": 15 } } ``` **Congratulations!** You have successfully created and configured your first Model Run. You are now ready to dive into the analysis tools to see what your model has learned. 
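To make Step 2 concrete, here is a hedged sketch of a single bounding-box prediction assembled as NDJSON. The tool name `defect`, the global key, and the field layout are illustrative assumptions; the SDK guides referenced in Step 2 document the authoritative payload.

```python
import json
import uuid

# Hedged sketch of one bounding-box prediction for a model run.
# Names and the exact payload layout are assumptions; see the SDK
# prediction import guides for the authoritative format.
prediction = {
    "uuid": str(uuid.uuid4()),
    "dataRow": {"globalKey": "solar-panel-0042"},  # links the prediction to a data row
    "name": "defect",                              # tool name in the model run's ontology
    "bbox": {"top": 120, "left": 85, "height": 64, "width": 112},
    "confidence": 0.87,  # optional; enables precision-recall curve analysis
}

# NDJSON payload: one JSON object per line, one line per prediction.
ndjson_payload = "\n".join(json.dumps(p) for p in [prediction])
```

Each object carries the `Data Row ID` (here via a global key) so Labelbox can match it with the correct source media, and the optional `confidence` score unlocks the deeper analysis described above.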
# Create a project Source: https://docs.labelbox.com/docs/create-a-project Instructions for creating and modifying a labeling project in Labelbox. This guide will walk you through the essential steps of creating a new project, as well as managing its lifecycle through duplication and deletion. ## How to create a new project 1. Navigate to the **Projects** tab in the main Labelbox application. 2. Click the **"New Project"** button in the top right corner. 3. **Choose the data modality or task type**. Note that you will only be able to attach data rows that match the data type you set here. For example, you cannot select video as the data type upon project creation and then send image data rows to this project later. 4. Give your project a clear, descriptive **Name**. You may optionally add a **Description** and project **tags**. Then, click **Save**. 5. From the project overview, click **Add data** to select the data rows to include in this project. Refer to our [Catalog docs](/docs/catalog-overview) to learn how to import data. 6. In this step, you can [filter and search](/docs/filtering-and-sorting) your data rows to curate a batch. Select the data rows you wish to send to the project, then click **Queue batch of *n*** to add the data rows to the labeling queue. 7. In the **Advanced batch settings**, you can optionally set the **Data row priority** and toggle on **Consensus** mode. Select **Submit** to send this batch to your labeling project. 8. The final step is to **Set up ontology and labeling experience**. You may either create a new ontology or reuse an existing one. 9. Once you complete these steps, the **Start labeling** button will be activated in your project overview. ## How to duplicate a project Duplicating a project is a great way to reuse a complex setup without starting from scratch. When you duplicate a project, you copy its ontology, labeling interface settings, and all other configurations. 1. 
From the **Projects** tab, find the project you wish to duplicate. Open the project. 2. Click on the project name at the top. From the dropdown menu, select **Duplicate project**. 3. A new project will be created with the same settings. Below are the settings that can get copied when you duplicate a project: | Setting | Copied to duplicate version | | --- | --- | | Project Name | | | Project Tags | | | Project Description | | | Ontology | | | Quality | | | Batches | | | Data rows | | | Workflow instructions/setup | | | Performance | | | Issues | | | Issue Categories | | | Notifications | | | Import labels | | | Export | | | Members (settings) | | | External workforces | | | Models (for Live multimodal chat projects) | | ## How to delete a project If a project is no longer needed, you can delete it to keep your workspace organized. 1. From the **Projects** tab, find the project you wish to delete. Open the project. 2. Click on the project name at the top. From the dropdown menu, select **Delete project**. 3. A confirmation dialog will appear. Type the name of the project to confirm and click **"Delete"**. Deleting a project is a permanent action and cannot be undone. ## How to create project tags As you create more projects in Labelbox, keeping them organized becomes crucial. Tags are a simple yet powerful way to categorize, filter, and manage your projects. Benefits of using project tags: * **Categorization:** Group projects by team (e.g., `team-alpha`, `team-beta`), status (`in-progress`, `completed`), or data type (`images-night`, `text-legal`). * **Easy filtering:** Quickly find all projects associated with a specific tag, saving you time from manually searching. * **Improved visibility:** Get a clear overview of your labeling initiatives by filtering your project list by tag. **To create and apply tags** 1. From the **Projects** tab, select a project. 2. 
From the project overview, hover over the **Tags** section and click the **Edit** icon. 3. You can either *select an existing tag* from the dropdown list or *type a new tag* and press **Enter** to create it. A project can have multiple tags. **Filtering by tags** Use the filter bar at the top of the **Projects** page to filter by tag. Click on the tag you want to filter by, and the view will instantly update to show only the projects with that tag. You can select multiple tags to further refine your search. **To delete a tag** To manage existing tags, select a project, click the **Edit** icon next to **Tags** in the project overview. Click **Manage tags** and click the **Delete** icon next to the tag name. Deleting a tag removes it from all associated projects, and modifying a tag updates it across all associated projects. # Configure CORS Source: https://docs.labelbox.com/docs/create-cors-headers Instructions for configuring Cross-Origin Resource Sharing (CORS) for AWS S3, GCS, and Microsoft Azure. This guide explains what a CORS policy is and how to configure it correctly in AWS, GCP, and Azure. Completing this step ensures that your data assets will render correctly within the Labelbox application. ### What is CORS and why is it necessary? * **The Problem:** Modern web browsers enforce a "Same-Origin Policy" to protect you from malicious websites. This policy prevents a web page (like the Labelbox app at `https://app.labelbox.com`) from making requests to a different domain (like your storage bucket at `your-bucket.s3.amazonaws.com`). * **The Solution:** A CORS policy is a small configuration file you add to your storage bucket. It acts as a permission slip, telling the browser, "It's okay to allow requests from `https://app.labelbox.com` to access files in this bucket." When configuring CORS for your cloud storage bucket, you will need to include both of these Labelbox origins: `https://app.labelbox.com` and `https://editor.labelbox.com`. ## AWS S3 1. 
Navigate to your S3 bucket in the AWS Console. 2. Click the bucket name. 3. Go to the **Permissions** tab. 4. Scroll to the **Cross-origin resource sharing (CORS)** section and click **Edit**. 5. Paste the provided policy into the editor and save. ```json JSON theme={null} [ { "AllowedHeaders": [ "*" ], "AllowedMethods": [ "GET" ], "AllowedOrigins": [ "https://app.labelbox.com", "https://editor.labelbox.com" ], "ExposeHeaders": [] } ] ``` 6. For more details on setting up CORS for your AWS S3 bucket, see [these AWS docs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html). ## Google Cloud Storage (GCS) The configuration must allow requests from both `https://app.labelbox.com` and `https://editor.labelbox.com` to ensure data loads correctly across the entire platform. You can set this policy using one of the two methods below. ### Option 1: Using the gcloud CLI on your local machine This method is ideal if you have the `gcloud` command-line tool installed on your computer. 1. Create a new file on your computer named `cors-config.json`. 2. Copy and paste the following JSON content into the file. Note: The `"*"` for `responseHeader` allows all necessary headers and is recommended for robust compatibility. ```json theme={null} [ { "origin": ["https://app.labelbox.com", "https://editor.labelbox.com"], "method": ["GET"], "responseHeader": ["*"], "maxAgeSeconds": 3600 } ] ``` 3. Open your terminal or command prompt and run the following command. Be sure to replace `[BUCKET_NAME]` with the actual name of your GCS bucket. ```shellscript theme={null} gcloud storage buckets update gs://[BUCKET_NAME] --cors-file=cors-config.json ``` ### Option 2: Using the Google Cloud Shell This method is convenient if you prefer to work directly within the Google Cloud Console without creating local files. 1. Open the [Google Cloud Shell](https://shell.cloud.google.com/ "https://shell.cloud.google.com/") from your GCP console. 2. 
Run the following command in the Cloud Shell terminal. This command creates the `cors-config.json` file and writes the correct configuration to it in a single step. ```shellscript theme={null} echo '[{"origin": ["https://app.labelbox.com", "https://editor.labelbox.com"], "method": ["GET"], "responseHeader": ["*"], "maxAgeSeconds": 3600}]' > cors-config.json ``` 3. Next, run the command below to apply the configuration to your bucket. Remember to replace `[BUCKET_NAME]` with your bucket's name. ```shellscript theme={null} gcloud storage buckets update gs://[BUCKET_NAME] --cors-file=cors-config.json ``` ## Microsoft Azure 1. Navigate to your Storage Account in the Azure Portal. 2. Under **Settings**, find **Resource sharing (CORS)**. 3. Select the **Blob Service** tab. 4. Fill in the fields: **Allowed origins** (`https://app.labelbox.com` and `https://editor.labelbox.com`) and **Allowed methods** (`GET`), then save. ## Troubleshooting | Error message | Troubleshooting | | --- | --- | | Unable to detect proper CORS configuration | Ensure your cloud storage bucket has CORS configured with the following origins: `https://app.labelbox.com` and `https://editor.labelbox.com`. 
For more troubleshooting help, see: - [AWS troubleshooting CORS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/cors-troubleshooting.html) - [GCS troubleshooting CORS requests](https://cloud.google.com/storage/docs/configuring-cors#troubleshooting) - [Cross-Origin Resource Sharing (CORS) support for Azure Storage](https://docs.microsoft.com/en-us/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services) | # Custom model integration Source: https://docs.labelbox.com/docs/custom-model-integration Describes how to set up and integrate a custom model so that it can be used with Foundry. If you are on an enterprise plan, you can integrate custom models with [Foundry](/docs/foundry) to use them to predict labels, enrich data, and generate responses for evaluation purposes. To upgrade to the enterprise plan, please [contact sales](https://labelbox.com/sales/). ## Host custom models Before integrating your custom model, you need to deploy it on an HTTP endpoint accessible via the Internet that accepts HTTP POST calls with a JSON payload. You can host it either on your own infrastructure or through any model hosting vendor, such as [Vertex AI](https://cloud.google.com/vertex-ai/docs/general/deployment), [Databricks](https://docs.databricks.com/en/machine-learning/model-serving/create-manage-serving-endpoints.html), [Huggingface](https://huggingface.co/inference-endpoints), [Replicate](https://replicate.com/docs/how-does-replicate-work#private-models), [OpenAI](https://platform.openai.com/docs/guides/fine-tuning/use-a-fine-tuned-model). ## Create model integrations Once you have a public HTTP endpoint for your custom model, you can create the integration: 1. On the [Models](https://app.labelbox.com/mea) page, click **Create** and select **Custom Model**. 2. Select the data type for the model. 3. Add custom model information, including: * **Name**: A unique identifier for the model. 
   * **HTTP endpoint**: The URL of the HTTP endpoint hosting your model.
   * **Secret** (optional): The authentication token for secret-secured endpoints only.
   * **Description** (optional): The descriptive context of the model.
4. Click **Create model**.

On the **Settings** tab, you can review and edit the model information. You can add a rate limit and a Readme.

To send data to your model for label prediction, click **+ Model run**. From there, you can define and [preview your model run](/docs/foundry-define-model-run), [view predictions and details](/docs/foundry-view-predictions), and [send predictions to Annotate](/docs/foundry-annotate-predictions).

### Bounding box and mask tasks not supported

Currently, this model integration flow doesn't support tasks involving bounding box and mask annotations. To integrate a custom model for these tasks, see [Create model integrations for bounding box and mask tasks](#create-model-integrations-for-bounding-box-and-mask-tasks).

## Create model integrations for bounding box and mask tasks

For a custom model predicting bounding box and mask labels, you need to create a model manifest file and [contact customer solutions](https://labelbox.atlassian.net/servicedesk/customer/portal/2/group/3/create/214) to manually establish the integration. The Labelbox solutions team can help you manage job queuing, track status, and process predictions on the Labelbox platform.

### Create manifest files

To integrate your model into the Foundry workflow, you need to provide a `model.yaml` manifest file. This file stores metadata about the model, including its name, description, inference parameters, model output ontology, API endpoint, and other details.
You need to create the `model.yaml` file in the following format:

```yaml Example YAML manifest file theme={null}
name: My custom model
inference_endpoint: my_inference_endpoint # Deploy your service to an API endpoint that can be accessed
secrets: my_secret # Your secret or API keys used to authenticate with your endpoint
requests_per_second: 0.1 # Your estimate of requests per second
description: My awesome custom model for object recognition
readme: | # Optional readme in Markdown format
  ### Intended Use
  Object recognition model on my custom classes.
  ### Limitations
  My custom model has limitations, such as ...
  ### Citation
  ...
allowed_asset_types: [image] # List of allowed asset types, one or more of [image, text, video, html, conversational]
allowed_feature_kinds: [text, radio, checklist] # List of allowed feature kinds, one or more of [text, radio, checklist, rectangle, raster-segmentation, named-entity, polygon, point, edge]
# The ontology section is only needed if your model has a predefined set of classes
# for classification or object detection. If your model is an LLM or takes any text
# input, you can remove this section.
ontology:
  media_type: IMAGE
  # This example ontology has two classification classes and two object detection classes.
  classifications:
    - instructions: label
      name: label
      type: radio
      options:
        - label: tench
          value: tench
          position: 0
        - label: goldfish
          value: goldfish
          position: 1
  tools:
    - name: person
      tool: rectangle
    - name: bicycle
      tool: rectangle
inference_params_json_schema: # Hyperparameters configured in the app and passed to your API endpoint.
  properties: # Examples follow, each with a different type and defaults.
    prompt:
      description: "Prompt to use for text generation"
      type: string
      default: ""
    confidence:
      description: Object confidence threshold for detection
      type: number
      default: 0.25
      minimum: 0.0
      maximum: 1.0
    max_new_tokens:
      description: Maximum number of tokens to generate. Each word is generally 2-3 tokens.
      type: integer
      default: 1024
      minimum: 100
      maximum: 4096
    use_image_attachments:
      description: Set to true if the model should also process data row attachments.
      type: boolean
      default: false
  required: # Use to specify hyperparameters that must have values for each model run.
    - prompt
max_tokens: 1024 # Only relevant for LLMs, to control maximum token size
```

## Endpoint requests for model tasks

Every time you use your integrated custom model to predict labels or run other tasks, it sends a JSON request to your model's endpoint. The request payload provides the data row for prediction and includes the ontology and inference parameter values you selected.

Here's an example request body:

```json Example JSON request theme={null}
{
  "prompt": [
    {
      "role": "system",
      "parts": [
        { "text": "Start each sentence with three equal signs ===" }
      ]
    },
    {
      "role": "user",
      "parts": [
        { "text": "what is in this text and image?" },
        { "text": "Hello. This is a user-provided txt file content." },
        { "image": "base64_encoded_image_string" }
      ]
    }
  ]
}
```

Here are descriptions of the fields in the request body:

* `prompt`: Contains the current conversation with the model. For single-turn queries, it's a single instance. For multi-turn queries, it includes the conversation history and the latest request. Each `prompt` has a message structure with two properties: `role` and `parts`.
  * `role`: A string indicating the individual producing the message content. Possible values include:
    * `system`: Instructions to the model.
    * `user`: User-generated message sent by a real person.
    * `assistant`: Model-generated message, used to insert responses from the model during multi-turn conversations.
  * `parts`: A list of ordered parts that make up a multi-part message content. It can contain the following segments of data:
    * `text`: Text prompt or code snippet.
    * `image`: Base64-encoded image.
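To make the request shape above concrete, here is a small, self-contained Python sketch that walks the `prompt`/`role`/`parts` structure. The function name and the abbreviated payload are hypothetical illustrations, not part of the Labelbox request contract:

```python
# Hypothetical helper illustrating the request body shape described above.
# The "prompt", "role", "parts", "text", and "image" field names come from the
# example request; the function name and payload below are illustrative only.

def collect_parts(payload: dict, role: str) -> dict:
    """Gather the text and image parts for one role from a request payload."""
    texts, images = [], []
    for message in payload.get("prompt", []):
        if message.get("role") != role:
            continue
        for part in message.get("parts", []):
            if "text" in part:
                texts.append(part["text"])
            elif "image" in part:
                images.append(part["image"])  # base64-encoded image string
    return {"texts": texts, "images": images}

# Abbreviated example request body:
request_body = {
    "prompt": [
        {"role": "system", "parts": [{"text": "Start each sentence with three equal signs ==="}]},
        {
            "role": "user",
            "parts": [
                {"text": "what is in this text and image?"},
                {"image": "base64_encoded_image_string"},
            ],
        },
    ]
}

user_input = collect_parts(request_body, "user")
print(user_input["texts"])        # ['what is in this text and image?']
print(len(user_input["images"]))  # 1
```

An endpoint handler would typically run a step like this first, then pass the collected text and decoded images to the model.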
### Response

Responses are expected to match the format of the labels predicted by the custom model, such as a string containing the raw model response or a JSON object for NER and classifications. Here's an example JSON response with keys corresponding to feature names in the ontology:

```json JSON theme={null}
// Object detection
{
  "cat": {
    // Coordinate order: left, top, width, height
    "boxes": [[0, 0, 10, 10], [40, 40, 8, 10]],
    "scores": [0.9, 0.7]
  },
  "dog": {
    "boxes": [[20, 20, 5, 5]],
    "scores": [0.8]
  }
}

// Classification
{
  "summary": "Tom and Bob are happy to work at IBM", // Free text
  "sentiment": "positive", // Radio classification
  "emotion": ["joy", "fear"] // Checklist classification
}

// Segmentation
{
  "cat": {
    // Can use pycocotools.mask.encode for RLE encoding
    // (size is [height, width]; counts is the RLE-encoded string)
    "masks": [
      { "size": [, ], "counts": "" }
    ]
  }
}

// Named entity
{
  "person": [
    {"start": 0, "end": 3, "text": "Tom"},
    {"start": 5, "end": 8, "text": "Bob"}
  ]
}
```

# Manage your project's data rows

Source: https://docs.labelbox.com/docs/data-rows-activity

The Data Rows tab enables you to easily view, filter, and manage the data rows in your project.

The **Data rows** tab in your project dashboard is your central hub for viewing, filtering, and managing all the data rows within a specific project. Whether a data row is labeled, waiting in the queue, or undergoing review, you can find it here.

## Filtering your data rows

Each data row displays key information, including how much time has been spent on labeling and review, which dataset it belongs to, and whether there are any open issues. There are several ways you can filter your data.

### Use search filters

To find specific data rows, you can use a powerful and flexible filtering system. Click on **Search your data** to get started. You can combine multiple filters using AND/OR conditions to create highly specific queries.
For example, you could search for all data rows in a specific `batch` AND that have been `reworked by` a particular labeler. This allows you to precisely target the data you need.

Here are the available filters:

| Filter | What it does |
| --- | --- |
| Dataset | Find data rows belonging to a specific dataset. |
| Metadata | Filter data rows based on their metadata fields and values. |
| Media attribute | Query data rows based on attributes of the asset itself (e.g., image height or video duration). |
| Batch | Find data rows that are part of a particular batch. |
| Data row | Search for data rows by their ID, Global Key, creation date, or last activity date. |
| Find text | Perform a text search within your data rows. |
| Natural language | Use natural language to search for data rows (e.g., "show me all the pictures of cats"). |
| Task | Find data rows that are currently in a specific workflow task. |
| Issue | Filter data rows based on any issues associated with their labels. |
| Label actions | Find data rows based on actions taken on their labels, such as who labeled them, when they were labeled, or if they were skipped. **Note:** Bulk moving a data row to a different step is not considered a review or rework, so it won't appear in filters like `Reworked at`, `Reworked by`, `Reviewed by`, or `Reviewed at`. |
| Annotation | Filter data rows based on the number of annotations on their labels. |
| Benchmark agreement | Find data rows based on benchmark agreement metrics. |
| Consensus agreement | Find data rows based on consensus agreement metrics. |
### Filter data rows by status

The status of a data row tells you where it is in the labeling and review process. The following statuses are available by default for every project. You can see a count of data rows for each status on the left panel. Clicking on a status will filter the table to show only the data rows with that status.

| Status | What it means |
| :-- | :-- |
| To label | These data rows need to be labeled. For a standard data row, this means it has no labels. For a consensus or benchmark data row, this means it has fewer than the required number of labels (e.g., it has two labels but needs three). |
| In review | These data rows are currently in a review task within the workflow. |
| In rework | These data rows have been sent back for corrections and are in the rework task. |
| Done | These data rows have all the required labels and have successfully passed all review steps in the workflow. |

## Updating your data rows

You have several options to update your data rows, either in bulk or at the individual level.

### Task browser: Update individual data rows

To view or edit the labels on a single data row, click on the data row in the table. This opens the **Task browser**. From here, you can:

* **Edit the label**: Click the **Edit** button to create or edit annotations. Press **Save** to save your changes.
* **View all labels (for consensus or benchmark)**: If you are using consensus or benchmarks, you can view all the different labels created for that data row. Use the `>` icon in the left-side panel to see all labels, view their IDs, and select a "winner" label.

Click on the options menu next to a data row to access more actions you can take at the data row level.
From this menu, you can:

* **Copy link:** Copy the link to the data row. You can send the link to other members of your organization.
* **Delete labels & requeue:** Remove all existing labels from the data row and send it back to the beginning of the labeling queue to be worked on again.
* **Move to step:** Choose to move the selected data row to **Initial review task**, **Rework**, or **Done**.
* **Add as benchmark:** Select this label as a winning benchmark label.

### Bulk update multiple data rows

If you need to update the configurations for many data rows at one time, you can use the Bulk actions feature.

Select multiple data rows to perform the following actions in bulk:

| Action | What it does |
| --- | --- |
| **Change priority** | Assign a new priority to the selected data rows to change their order in the labeling queue. |
| **Delete and requeue** | Allows you to remove all existing labels from a selected group of data rows and then send those data rows back to the beginning of the labeling queue to be worked on again. You can select to preserve the existing labels or delete them. |
| **Export data** | Export the selected data rows and their labels. |
| **Hide/unhide from labelers** | Control which data rows are visible to labelers. Hidden data rows will have an "Authorized only" tag. Admins can still view the row details, but labelers will only be able to copy the external ID and will see an "Unauthorized to view this data row" message if they try to access it. |
| **Move to step** | Choose to move the selected data rows to **Initial review task**, **Rework**, or **Done**. |
| **Assign labels as benchmarks** | Designate the labels on the selected data rows as benchmark reference labels. |

Bulk **Move to step** actions are not considered a review or rework. Therefore, these actions will not show up when applying the **Reworked at**, **Reworked by**, **Reviewed by**, or **Reviewed at** filters.

## Updating your batches

From the Data rows tab, you can also update the batches of data rows in your project.

### View and update batches

Click on **Batches** → **View batches** to view and update all the existing batches in your project. From here, you can:

* **Rename the batch**.
* **Copy batch ID**.
* **Change priority**: When you initially create the batch, you have the option to select the data row priority (a value between 1 and 5). Leaving it as-is defaults the batch priority to 5. From the **View batches** panel, you can update the priority for the batch.
* **Remove queued data rows**: This removes any data rows from the batch that have not yet been labeled.
* **Delete labels**: This deletes the labels from the data rows in the batch and gives you the option to requeue them, either with or without the old label as a template.
* **View history**: See a log of when data rows were added to or removed from the batch.

### Add more data to batch

Click on **Batches** → **Add data from Catalog** to select more data rows to add to the batch.

# Metadata

Source: https://docs.labelbox.com/docs/datarow-metadata

Instructions for adding, filtering, and modifying metadata in the app UI.

**Developer guide**: [Metadata](/reference/metadata)

***

Metadata is non-annotation information about an asset. You can use metadata to search and filter your data rows in Labelbox. The metadata schema lives at the organization level. This allows you to apply the same metadata fields across multiple datasets.
You can use reserved metadata fields and add custom metadata fields.

## Data types

All metadata must be one of the following types:

| Type | Description | Filtering |
| --- | --- | --- |
| `DateTime` | An ISO 8601 datetime field. All times must be in the UTC timezone. | Equals, greater than, less than, between |
| `Number` | Floating-point value (max: 64-bit float). | Greater than & less than |
| `String` | Free-text field. Max 4,096 characters. | Equals & prefix matching |
| `Enum` | Enum field with options. Multiple options can be imported. | Equals |
| `Option (Enum)` | Option of an enum. A max of 64 options can be created per Enum type (128 for enterprise customers, which can be further increased upon request). | Equals |
| `Embedding` | A 128-dimension float32 vector used for similarity. | |

## Reserved fields

By default, Labelbox defines several metadata fields on your data rows. You don't have to use these fields and can change the field values, but you can't delete or rename these fields. Each metadata field has a unique schema ID used to upload data to Labelbox.

| Name | Type | Description |
| --- | --- | --- |
| `tag` | `String` | The tags of the data row. |
| `split` | `Enum` | The split of the dataset that the data row belongs to, including `train`, `valid`, and `test`. |
| `captureDateTime` | `DateTime` | The timestamp when the data is captured. |
| `skipNFrames` | `Number` | (Video data only) The number of frames to skip. |
| `turnInstructions` | `String` | JSON string that contains instructions for each turn in a Multi-modal chat conversation. |

## Custom fields

To create custom metadata fields via the UI, go to the [**Schema**](https://app.labelbox.com/schema/metadata) tab. Select **Create**.
Each metadata field must have a unique name and a type. The max number of fields per organization is determined by account tier. To view your metadata schema, go to the [**Schema tab**](https://app.labelbox.com/schema/metadata) and then select the **Metadata** subtab.

### Metadata field limits

You can modify the names of custom metadata schemas by clicking the **Edit** button in the detail view of each schema, but you can't change the type once you create a field. You also can't modify the names of reserved fields. See [limits](/docs/limits) to learn the maximum number of metadata fields allowed for your account.

## Update custom metadata schema via the UI

Go to the Schema tab to update custom metadata fields via the UI. Then, find the metadata you want to modify and click the **Edit** button. Only the following can be updated:

* The name of [non-reserved](#reserved_fields) custom metadata
* The name of options for Enum metadata

## Bulk add metadata

Follow these steps to bulk add metadata to your data rows in **Catalog**. See [limits](/docs/limits) to learn the limits for bulk adding metadata to data rows in Catalog.

### Step 1: Select data rows

You will need to select a curated subset of data. For example, you may select a cluster of data from the [projector view](/docs/model-runs). Another option could be to select the top results of a [natural language search](/docs/natural-language-search). This way, you can use neural networks like CLIP as zero-shot classifiers. A third possibility is to select all assets that look similar to each other, thanks to [Labelbox similarity search](/docs/similarity). Similarity search powered by embeddings allows you to leverage any off-the-shelf neural network as a zero-shot classifier.

There are three ways to select data rows:

* Option 1: Select all filtered data rows by choosing **Select all**.
* Option 2: Use the selection icon (checkbox) to manually select individual data rows.
* Option 3: Bulk select data rows by selecting the first data row, pressing and holding `Shift`, and then selecting the last data row. All data rows between the first and last ones are selected.

### Step 2: Add metadata

After you select your data rows, select **Add metadata** from the selection menu.

### Step 3: Pick a metadata field

From the selection menu, select the metadata field you want to apply. You can search for metadata fields by typing their name if you don't see them in the dropdown. Metadata fields must exist in Labelbox before they appear in the menu.

### Step 4: Provide metadata values

Enter a metadata value. This metadata field and value will apply to all selected data rows. Choose **Save** to apply the metadata in bulk. Changes can take time to be reflected in the data row metadata.

## View metadata

You can surface data rows with the new metadata values by [searching on metadata in Labelbox](/docs/search#supported-attributes-for-search-and-filter). The newly created metadata tag will also show up in the detailed view.

## Filter metadata

Once you upload your metadata, you can easily filter and view metadata in Catalog. If you want to label this set of data rows, you can filter by that metadata field and send them to a labeling project as a [batch](/docs/batches).

### Metadata viewing access

Labelers cannot filter and view metadata in Catalog. However, labelers can view metadata information in the data row information panel. Select a data row to open its detailed view. The **Metadata** panel shows all current metadata.

## Export metadata

When you export your metadata, you can sync the newly created metadata with any outside system, such as a cloud bucket, data lake, data warehouse, or database. To learn how to export data rows containing metadata, see [Export from catalog](/docs/export-from-catalog).

## Delete metadata

You can delete metadata for a data row through the SDK. Visit [this guide](/reference/metadata) for instructions.
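As an illustration of downstream processing after export, the sketch below flattens a row's metadata into a plain dictionary for filtering in an outside system. The record shape shown here (a `metadata_fields` list with `name`/`value` keys) is a simplified assumption for illustration, not the exact Labelbox export schema; see [Export from catalog](/docs/export-from-catalog) for the real format.

```python
# Hypothetical, simplified export record; the real export schema may differ.
exported_row = {
    "data_row_id": "cl1234567890",
    "metadata_fields": [
        {"name": "split", "value": "train"},
        {"name": "captureDateTime", "value": "2024-01-15T09:30:00Z"},
    ],
}

def metadata_as_dict(row: dict) -> dict:
    """Flatten a row's metadata fields into {name: value} for easy filtering."""
    return {f["name"]: f["value"] for f in row.get("metadata_fields", [])}

meta = metadata_as_dict(exported_row)
print(meta["split"])  # train
```

A flat dict like this is convenient for loading into a data warehouse column or filtering rows by split before training.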
# Manage datasets

Source: https://docs.labelbox.com/docs/datasets-datarows

Instructions for managing datasets in the Labelbox UI.

This guide covers how to create, modify, and manage your datasets directly within the Labelbox user interface.

## Before you start

Before you can label data, you need to import it into Labelbox. Take a look at our guides for connecting your cloud data to Labelbox, then come back to this page.

## Key definitions

The following are fundamental concepts for organizing your data in Labelbox. Understanding these terms is the first step to building powerful ML models. All data in Labelbox is organized into **Datasets**, which are made up of individual **Data Rows**.

| Term | Definition |
| :-- | :-- |
| Asset | The file you want to label (e.g., an image, a video, a text file). This is your core, raw data. |
| Data row | The container for a single asset and all its related information. This includes the asset itself, its metadata, attachments, and any annotations (labels) created for it. |
| Dataset | A collection of data rows, typically from a single source or domain. For example, you might create one dataset for "Medical Images from Device X" and another for "Customer Support Transcripts." |
| Attachment | Supplementary information you can add to an asset to provide extra context for your labelers. For example, you could attach a PDF with instructions or a reference image. |
| Global key | An optional but highly recommended unique ID for each data row. Using global keys helps prevent duplicate data uploads and makes it easy to map data rows in Labelbox back to your own external databases or file systems. |

## Create a new dataset

1. Navigate to the **Catalog** tab.
2.
Click the **+ New** button to open the "Create a new dataset" dialog.
3. Give your dataset a clear, descriptive name (e.g., `q1-2026-night-driving-images`).
4. Optionally, add a description for more context.
5. Choose an import method:
   * [Connect to your cloud storage (RECOMMENDED)](/docs/connect-to-cloud-storage)
   * [Signed URLs](/docs/signed-urls)
   * [Direct upload](/docs/upload-local-files)
6. Click **Create dataset**.

Best practices:

* Naming: Dataset names can be up to 256 characters and include letters, numbers, spaces, and `_-.,()/`. Use names that clearly explain the data's source and purpose.
* Organization: Group data from a single domain or source into its own dataset. This simplifies labeling workflow setup. You can use Metadata to further organize and filter data rows within a dataset.

## Append data to an existing dataset

You can add more data to a dataset at any time.

1. Go to **Catalog** and select your dataset from the list on the left.
2. Click the **Append to dataset** button.
3. You will be prompted to choose an import method to add your new data rows.

## Delete a dataset

1. In **Catalog**, select the dataset you wish to delete.
2. Click the three-dot menu and select **Delete dataset**.
3. A confirmation dialog will appear. Type `delete` and click the **Delete dataset** button to confirm.

Deleting a dataset is a permanent, irreversible action. All associated data rows, annotations, metadata, and classifications will be lost.

# Deprecations

Source: https://docs.labelbox.com/docs/deprecations

A list of active and completed deprecations

## Active deprecations

This section lists all products and features which are currently in a deprecation period.

***

## Completed deprecations

This section lists all products and features which have been decommissioned. These products and features are no longer available.
### Public Demo organization

The public demo workspace in the Labelbox platform was created to provide sample projects for users to explore the platform's features and capabilities. It was sunsetted on April 3, 2026.

### Census integration

As of June 30, 2025, we no longer support the Census integration.

### Export v2 nonstreamable methods

In May 2025, we introduced changes to the Python SDK that broke compatibility with the following Export v2 non-streamable methods in version 3.67 and earlier:

* `catalog.export_v2()`
* `data_row.export_v2()`
* `dataset.export_v2()`
* `model_run.export_v2()`
* `project.export_v2()`
* `slice.export_v2()`

If you're using any of these methods with SDK version 3.67 or earlier, please update to a newer version or migrate to the `export()` method, which provides a more scalable and performant way to export data across the Labelbox platform. To learn more, see [Export overview](/reference/export-overview).

### Automation efficiency score

On April 16, 2025, we removed the Automation efficiency score from the Labelbox platform.

### Catalog cluster view

On February 28, 2025, we deprecated the Catalog cluster view for all customers due to a lack of usage.

### Smart select

In February 2025, we deprecated the **Smart select** button for selecting datasets on the Catalog page.

### Reporting (Enterprise Dashboard)

In February 2025, we deprecated the Reporting page (Enterprise Dashboard) for all customers. The [Monitor tab](/docs/monitor) offers all of the same metrics and filters (plus more). Unlike the Reporting page, the Monitor page is a native solution built into the Labelbox platform, so it offers a more seamless and robust user experience. To learn more, read [this guide](/docs/migration-guide-reporting-page-to-monitor).

### DICOM editor

Due to a lack of usage, we sunsetted the DICOM editor on November 25, 2024.

### YOLO models in Foundry

On November 7, 2024, Labelbox disabled YOLO models in Foundry due to a lack of usage.
Foundry still supports other object detection and segmentation model alternatives (OWL-ViT, Rekognition, GroundingDINO, GroundingDINO + SAM). If you would like to set up your own YOLO model for inferencing, refer to our [custom model integration](/docs/custom-model-integration#create-model-integrations-for-bounding-box-and-mask-tasks) docs.

### Fine-tuning

On November 7, 2024, Labelbox disabled the model fine-tuning feature. This means that image fine-tuning is no longer usable. We may re-enable the fine-tuning feature in the future.

### Export V1

In early September, we disabled Export v1 for all remaining customers. All users should use the `export()` method instead.

### Data connector libraries

All data connector libraries, including the `labelbase`, `labelspark`, `labelpandas`, `labelsnow`, and `labelbox-bigquery` libraries, have been publicly archived and are no longer maintained. To import data from remote sources such as Databricks and Snowflake, set up [Census integrations](/docs/census-integration) directly on the Labelbox platform.

### Custom editor

We sunsetted the custom editor for all customers on June 30, 2024. Custom editor projects and their associated labels are no longer accessible. Please use one of our native labeling editors instead.

### Dropdown schema

Dropdown is a deprecated classification type (the other classification types are radio, checklist, and free-form text). Labelbox implemented a UI solution that provides a dropdown view for radio and checklist classifications. Therefore, we removed the capability to add the dropdown schema to new ontologies. We recommend using radio or checklist classifications and selecting the dropdown toggle instead.

### Labels tab

Prior to the rollout of the [Data Rows tab](/docs/data-rows-activity), the Labels tab was a dashboard view in Annotate. The Labels tab was replaced by the Data Rows tab as part of the shift to our new data row-centric paradigm.
Effective August 2023, no more customers are using the Labels tab.

### Review step

Prior to the rollout of [Workflows](/docs/workflows), the review step was the mechanism for reviewing labels in Annotate. The review step was replaced by Workflows in order to offer more flexibility in customizing the review flow for labeling tasks. Effective August 2023, no more customers are using the Review step.

### Dataset-based queueing

Prior to the rollout of [batch-based queueing](/docs/batches), the only way to send data rows to a labeling project was by sending the entire dataset. This was replaced by batch-based queueing to enable customers to send a curated subset of data rows across multiple datasets to a labeling project. Effective August 2023, no more customers are using dataset-based queueing.

### On-prem

At the end of 2022, Labelbox decided to sunset our on-premise offering. All customers are encouraged to use our SaaS product instead. As of August 2023, no more customers are using the Labelbox on-prem offering.

### Superpixel

In 2023, Labelbox released the [auto-segment tool](/docs/label-data#auto-segment), which replaced the Superpixel feature. Auto-segment is embedded into the segmentation mask tool. You can use it to auto-generate segmentation masks. As of January 2023, no more customers are using the Superpixel feature.

# Connect to Discord

Source: https://docs.labelbox.com/docs/discord

Follow these instructions to set up your Alignerr account with Discord.

As part of our onboarding and communication process, all Alignerrs are required to sign up for Discord. Going forward, Discord will be our official platform for project updates, team communications, and key announcements.

# Refresh membership

If you *already have* an Alignerr Discord account, follow these steps:

1. Log onto [app.alignerr.com](http://app.alignerr.com).
2. Go to **Settings** → **Discord** → click **Refresh Discord membership**.
3.
Sign in to Discord using your existing credentials.

# Create/link account

If you *do not have* an Alignerr Discord account, follow these steps:

1. Log onto [app.alignerr.com](http://app.alignerr.com).
2. Go to **Settings** → **Discord** → select **Link Discord account**.
3. Select **Authorize** in the Alignerr Manager modal.
4. Create your Discord account using the email address you use for Alignerr projects.
5. You will receive an email from Discord. Open the email and click **Verify email**.
6. When you enter Discord, you'll see a modal that says **Create Your First Discord Server**. Exit out of that modal. Do not select anything.

# Documents

Source: https://docs.labelbox.com/docs/document-editor

Guide for labeling document (PDF) data.

The **Documents** editor lets you annotate and review PDF documents for tasks such as document review, text extraction, and data labeling for machine learning. You can navigate multi-page documents, manage annotation layers, and track progress efficiently. The editor also supports automatic optical character recognition (OCR) with the ChatGPT o1 model to detect and extract text from selected areas.

## Set up document annotation projects

To set up a document annotation project:

1. [Create a document (PDF) dataset](/docs/datasets-datarows).
2. On the [Annotate projects page](https://app.labelbox.com/projects), click the **+ New project** button.
3. Select **Documents**. Provide a **name** and an optional **description** for your project.
4. Click **Save**. The system then creates the project and redirects you to the project overview page.
5. Click **Add data**. Then select your document dataset. Click **Sample** to sample your dataset, or you can manually select data rows and click **Queue batch**.
To learn how to import documents using the SDK, see [importing document data](/reference/import-document-data).

### Data row size limit

To view the maximum size allowed for a data row, see [limits](/docs/limits).

### Image encoding

If your PDF files contain images, use JPEG encoding and RGB colorspace for color images.

## Supported annotation types

Below are the annotation types you may include in your ontology for labeling document data. Classification-type annotations can be applied globally or nested within a bounding box or entity annotation.

| Feature | Import annotation | Export annotation |
| --- | --- | --- |
| **Bounding box** | [See payload](/reference/import-document-annotations#bounding-box) | [See payload](/reference/export-document-annotations#bounding-box) |
| **Entity** | [See payload](/reference/import-document-annotations#entity) | [See payload](/reference/export-document-annotations#text-entity-named-entity) |
| **Relationship** | [See payload](/reference/import-document-annotations#relationship) | [See payload](/reference/export-document-annotations#relationship) |
| **Radio classification** | [See payload](/reference/import-document-annotations#classification---radio-single-choice) | [See payload](/reference/export-document-annotations#classification---radio) |
| **Checklist classification** | [See payload](/reference/import-document-annotations#classification---checklist-multi-choice) | [See payload](/reference/export-document-annotations#classification---checklist) |
| **Free-form text classification** | [See payload](/reference/import-document-annotations#classification---free-form-text) | [See payload](/reference/export-document-annotations#classification---free-form-text) |

All PDF documents support bounding box annotations.
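The linked guide covers the full SDK flow; as a rough offline sketch, a document data row's `row_data` points at both the PDF and its text layer. The field names follow the import guide as we understand it, and the URLs and global key below are hypothetical placeholders.

```python
import json

# Sketch of a document (PDF) data row payload. The pdf_url / text_layer_url
# field names are assumptions based on the import guide; both URLs and the
# global key are hypothetical placeholders.
data_row = {
    "row_data": {
        "pdf_url": "https://storage.example.com/contract.pdf",
        "text_layer_url": "https://storage.example.com/contract-text-layer.json",
    },
    "global_key": "contract-2024-001",
}

# With the Labelbox Python SDK, payloads like this are passed to a dataset
# (e.g., dataset.create_data_rows([...])); here we only show the shape.
payload = json.dumps(data_row, indent=2)
```

See the [importing document data](/reference/import-document-data) reference for the authoritative payload format.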
To create other annotations, PDF documents *must* have text layers before you upload them to Catalog. For best results, verify text layers *before* uploading PDFs.

### Bounding box

To create a bounding box, use your cursor to draw the shape around a character, word(s), or section in the document. To reposition the bounding box, click + hold, then use your mouse or trackpad to move the annotation on the document. You can also click + drag the corners to resize the bounding box.

### Entity

To create an entity annotation, click the desired starting character and drag to select a sequence of characters in the text. Characters are not restricted to a single class; entity annotations may overlap completely or partially. Entities may also span multiple pages. To edit an entity's class, right-click the entity and select **Change class**.

Shortcut: In the **Tools** panel, you will see a numerical hotkey next to the name of the annotation. Use the specified number hotkey (e.g., `1`, `2`, `3`) to activate the entity tool. To create another entity, press the number hotkey again to activate the tool, then create another entity. Once all entities have been created, press `E` to submit your label.

#### Token selection

We also support tokenization, so you can create and highlight entities at both the word and character level; which one applies is determined by the data in your JSON upload. Clicking a specific word highlights the entire word. This is helpful when labeling text, as it can be easy to accidentally miss certain characters or words when highlighting.

### Relationships

To create a relationship between annotations:

1. Select a relationship tool and hover over the source annotation of the relationship to reveal the annotation's anchor points.
2. Click an anchor point to create the starting point of the relationship, then move your mouse to the annotation you want to relate it to, hovering over it to reveal its anchor points.
3.
Click one of the anchor points to complete the relationship. Right-click a relationship to change its direction, make it bi-directional, or delete it altogether.

#### Relationships for annotations across pages

To create a relationship between annotations that exist on different pages, follow a slightly different workflow:

1. Select the relationship tool.
2. Go to the annotation where you want to start the relationship, right-click, and click **Select relationship start**.
3. Scroll to your destination annotation, right-click, and click **Select relationship end**.

After you have selected both the starting and ending annotations, your relationship will be established.

### Radio classification

Create a radio classification by activating the classification question and inputting the answer value. In the below example, press `8`, `k`, and `esc` to complete the radio classification. Once all classifications have been completed, press `e` to submit your label.

### Checklist classification

Create a checklist classification by activating the classification question and inputting the answer value(s). In the below example, pressing `7` and pressing `Down` + `Enter` on the answer values completes the checklist classification. Once all classifications have been completed, press `e` to submit your label.

### Free text classification

Create a free text classification by activating the classification question and inputting the answer value. In the below example, pressing `6`, typing the answer value, and pressing `Enter` completes the free text classification. Once all classifications have been completed, press `e` to submit your label.

## Enable auto-OCR

The in-editor automatic OCR support allows you to recognize and extract text from PDFs using bounding boxes. To enable it, add a bounding box feature and toggle on the Automatic OCR option.
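For readers importing these annotations programmatically, the sketch below shows a plausible shape for one NDJSON line describing a document bounding box. The page/unit/bbox fields follow the import reference as we understand it, and the tool name and global key are hypothetical, so verify against the payload links in the table above before relying on exact field names.

```python
import json
import uuid

# Rough sketch of one NDJSON line for a document bounding box import.
# Field names are assumptions based on the import reference; the tool
# name and global key are hypothetical.
bbox_annotation = {
    "uuid": str(uuid.uuid4()),
    "name": "section_header",  # must match a bounding box tool in your ontology
    "dataRow": {"globalKey": "contract-2024-001"},
    "page": 1,                 # page the annotation appears on
    "unit": "POINTS",          # PDF coordinates expressed in points
    "bbox": {"top": 72.0, "left": 72.0, "height": 24.0, "width": 180.0},
}

# Each annotation occupies exactly one line in the NDJSON import file.
ndjson_line = json.dumps(bbox_annotation)
```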
When you draw a bounding box on the document, OCR automatically detects and extracts text within the selected area. The extracted text is stored as a text subclassification.

## Custom text layers

A unique aspect of our document editor is the ability to view text layers. You can toggle the text layer on, and it will appear whenever you want to highlight an entity.

## Navigate the document

Use your mouse scroll wheel or trackpad to move forward and backward through the pages of the document. To jump to a specific page, highlight the current page number in the top navigation bar, type your desired page number, and press `Enter`.

To zoom in, press `Z` and click on the section of the page you want to zoom in on. To zoom out, press `Opt` + `Z` and click on the page, or press `Shift` + `Z` to return the page to its original zoom level.

## Document-specific hotkeys

| Function | Hotkey | Description |
| --- | --- | --- |
| Show Text Layer | `Shift` + `T` | Show or hide the text layer. |

# Similarity search with embeddings

Source: https://docs.labelbox.com/docs/embeddings

Learn how to use embeddings to perform a similarity search in Labelbox.

Embeddings are numerical representations of your data that make it possible to find items that are visually or contextually similar. This is especially useful for unstructured data like images, text, and video, where you can't use traditional search methods. In Labelbox, this is called a similarity search. By using similarity search, you can quickly find more examples of rare edge cases, identify and remove bad data, and ultimately build a higher-quality dataset for training your models.

## How to use similarity search

The easiest way to get started with similarity search is to use the *pre-computed embeddings* that Labelbox provides for common data types.

**To perform a similarity search:**

1.
**Find an anchor**: In the Catalog, hover over a data row you want to find more examples of and click the **Find similar data** icon. This initial data row will be your "anchor".
2. **Add more anchors**: You can optionally select multiple data rows to use as anchors and then select **Add selection as anchors**.
3. **Refine your results**:
   * **Add or remove anchors**: To add more examples and refine the search, select more data rows and click **Add selection as anchors**. To remove an anchor, click **Anchors (n)** to view all anchors, then click the **—** icon next to the one you want to remove.
   * **Adjust the similarity score**: You can adjust the range of the similarity score (from 0 to 1) to broaden or narrow your search. A higher score means the results will be more similar to your anchors.

| Asset type | Pre-computed embedding |
| --- | --- |
| Image | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) |
| Video | [Google Gemini Pro Vision](https://ai.google.dev/gemini-api/docs/models). Only the first two (2) minutes of content are embedded; the audio signal is not currently used. This is a paid add-on feature available upon request. |
| Text | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) |
| HTML | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) |
| Document | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) and [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) |
| Tiled imagery | [CLIP-ViT-B-32](https://huggingface.co/sentence-transformers/clip-ViT-B-32) (512 dimensions) |
| Audio | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions). Audio is transcribed to text. |
| Conversational | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) (768 dimensions) |

## How to use your own embeddings

If you have your own embeddings, you can upload them to Labelbox to use in similarity searches. View the [limits](https://docs.labelbox.com/docs/limits) page to learn the custom embedding limits per workspace and the maximum dimensions allowed per custom embedding.

**To create and upload a custom embedding:**

1. Create an [API key](https://docs.labelbox.com/reference/create-api-key).
2. Navigate to **Schema > Embeddings**.
3. Click **+ Create**, give your embedding a name, and specify the number of dimensions.
4. Once created, you can use this [Google Colab notebook](https://colab.research.google.com/drive/159lWZzY3wtGacLjwfPuiqdz7eaQ8TfXj?usp=sharing) to upload your custom embeddings.

You can view all of the embedding fields in JSON format by clicking the **\** button. To delete a custom embedding, go to the same **Schema > Embeddings** page, select the embedding, and click the gear icon.

To learn more about custom embeddings, see our [Custom embeddings developer guide](/reference/custom-embeddings).
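The 0-to-1 similarity score is easiest to reason about as a closeness measure between embedding vectors. The sketch below uses plain cosine similarity as an illustration; it is not Labelbox's internal scoring, just the standard measure commonly used with embeddings like those listed above.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions).
anchor = [0.9, 0.1, 0.3]       # embedding of the anchor data row
similar = [0.8, 0.2, 0.25]     # a visually/contextually similar data row
unrelated = [-0.5, 0.9, -0.1]  # an unrelated data row

# The similar candidate scores much closer to 1 than the unrelated one.
score_similar = cosine_similarity(anchor, similar)
score_unrelated = cosine_similarity(anchor, unrelated)
```

Raising the score threshold in the UI narrows results to candidates whose vectors sit closer to the anchor's, which is exactly what the higher cosine value captures here.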
## Advanced search techniques

To further refine your data curation workflow, you can combine a similarity search with other filters in the Catalog, such as metadata, annotations, or datasets. Once you have a set of filters that you want to reuse, you can save them as a "slice". Slices are dynamic, so any new data that matches your filter criteria will automatically be added to the slice.

# Explore your data in Catalog

Source: https://docs.labelbox.com/docs/explore-your-data-in-catalog

Welcome to Labelbox Catalog, your command center for understanding, curating, and preparing unstructured data for your machine learning workflows.

Before you can build a high-performing model, you need to deeply understand the data you're working with. Catalog is designed to move you from raw data to curated, label-ready datasets with confidence and speed. Think of Catalog as an interactive, searchable index of all your training data. It provides the tools to not just see your data, but to interact with it, ask questions of it, and organize it in powerful ways.

## What can you do with Catalog?

* **Visualize and explore your data to uncover insights:** Instead of guessing what’s in your dataset, you can directly visualize it. Spot imbalances, identify outliers, find rare edge cases, and understand the distribution of your data before it ever touches a model. Use the gallery view for a visual survey, the list view for metadata analysis, and the analytics view to see statistical breakdowns.
* **Find specific data with powerful search and filtering:** Move beyond simple filename searches. Catalog allows you to build complex queries to find the exact data you need. You can filter by a rich set of attributes including metadata, annotation class, dataset, project, and even the content of the data itself using AI-powered search methods.
* **Curate and organize datasets for any workflow:** Your raw data is just the beginning. Catalog helps you organize it for specific tasks.
You can create static **batches** of data to send to a labeling project or define dynamic **slices** that automatically track specific subsets of your data over time, like "all images flagged for review." * **Take targeted action on your data:** Finding data is only half the battle. Catalog is fully integrated with the rest of the Labelbox platform, allowing you to take immediate action on your findings. Select a group of data rows and, with a few clicks, you can add metadata, export them for analysis, or send them directly to a labeling project. By providing a single, unified interface to explore and manage all your unstructured data, Catalog empowers you to make smarter, data-driven decisions throughout the entire model development lifecycle. ## Key concepts To effectively navigate and use Catalog, it's important to understand its core components. These are the fundamental building blocks you'll encounter as you explore and manage your data. | Term | Definition | | ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | Data rows | **What it is:** The most basic unit in Labelbox, representing a single item of your data. A *data row* is a pointer to a single data asset (e.g., an image, a video, a text file, a medical image) along with all its associated information, including metadata, attachments, and annotations.
**Why it's important:** Every action in Catalog—from filtering to labeling—is performed on one or more data rows. | | Datasets | **What it is:** A *dataset* is a top-level container that holds a collection of data rows, often grouped by a specific project, data source, or collection period. You organize your data into datasets when you upload it to Labelbox.
**Why it's important:** Datasets provide the initial organization for your data and serve as the starting point for exploration in Catalog. | | Slices | **What it is:** A *slice* is a dynamic, saved query that represents a subset of your data. Think of it as a "smart folder" or a saved search that always stays up-to-date.
**Why it's important:** Slices let you continuously monitor specific subsets of your data without re-applying filters. When new data is added to the dataset that matches the slice's filters, it automatically appears in the slice. This is perfect for tracking data quality issues or monitoring for specific edge cases. | | Batches | **What it is:** A *batch* is a static, fixed group of data rows that you can send to a labeling project.
**Why it's important:** Batches are the primary mechanism for queuing up work for human labelers. Once created, a batch does not change unless you manually add or remove items, providing a stable workload for your labeling projects. | | Embeddings | **What it is:** *Embeddings* are powerful numerical representations (vectors) of your data that capture its semantic meaning. They are the engine behind Catalog's AI-powered search features. Labelbox can generate these for you, or you can provide your own.
**Why it's important:** Embeddings allow you to search for data based on meaning and similarity, not just metadata. They power features like **similarity search** ("find more images like this one") and **natural language search** ("find images of a dog playing in a park"), making it possible to find relevant data in a more intuitive and powerful way. | ## Gallery view The gallery view is the default, providing a visual grid of your data. It's designed for rapid visual scanning and exploration. * **Best for:** Getting a high-level visual sense of your dataset, spotting visual outliers, discovering trends, and making quick selections based on what the data looks like. * **How to use it:** Simply scroll through the grid to explore. You can click on any data row to open the detailed view or use `Shift + Click` to select a range of items. ## List view The list view organizes your data rows in a familiar table format, emphasizing metadata over visual appearance. * **Best for:** Analyzing and sorting your data by specific metadata attributes, comparing values across data rows, and finding data with specific metadata characteristics. * **How to use it:** 1. Switch to the list view using the view-selector icon. 2. Click on any column header to sort the data by that attribute. 3. Use the **Manage columns** button to customize which metadata fields are displayed in the table, tailoring the view to your specific task. ## Analytics view The analytics view provides interactive charts and graphs that summarize the distribution of data in your current selection. * **Best for:** Understanding the composition of your dataset, identifying class imbalances, and seeing how your data is distributed across different metadata values. * **How to use it:** 1. Switch to the analytics view using the view-selector icon. 2. The view will display histograms and charts for your data's attributes. 3. 
You can click on a bar in any chart (e.g., a specific annotation class) to automatically filter your dataset down to only the data rows with that attribute.

## Detailed view

The detailed view allows you to perform a deep dive into a single data row.

* **Best for:** Inspecting an individual data asset, viewing all of its associated metadata, examining its annotations, and accessing all related information in one place.
* **How to use it:** Simply click on any data row from the gallery, list, or cluster view to open the detailed view. Here you can see the asset itself, edit metadata, view annotation history, and more.

The display panel allows you to customize these display settings:

| Display setting | Description |
| --- | --- |
| Pin data row details | Pins key details in the gallery view. |
| Black color fill image | Replaces the data row with black pixels. This is useful for making annotations and predictions stand out. |
| Show objects | Show the objects in the preview. Supported for predictions and ground truth. |
| Show segmentation masks | Show segmentation masks in the preview. Supported for predictions and ground truth. |
| Show classifications | Show classifications in the preview. Supported for predictions and ground truth. |
| Conversational text formatting - Raw | Display conversational text data rows in a raw, unformatted version. |
| Conversational text formatting - Markdown | Display conversational text data rows in markdown. |
| Segmentation colors: Agree, Disagree | When **Show segmentation masks** is toggled ON, display agreement/disagreement in different colors. |

# Export model run data

Source: https://docs.labelbox.com/docs/export-data-for-model-training

Instructions for exporting model run data from the app UI to train a model in your desired computing environment.
There are three ways you can export data from Labelbox: export from Catalog, export from a model run, and export from a labeling project. This guide explains how to export data from a model run.

If you choose to train a model in your custom ML environment outside of Labelbox, follow the instructions below to learn how to export your data from a model run. Alternatively, you can train a model in the Labelbox UI using the model training integration (see [How to train a model in Labelbox](/docs/integration-with-model-training-service)).

## How to export via the app UI

To export annotations from a model run via the UI, go to **Model**, select a model from the **My models** section, and open the **Model runs** tab. You can use the filters to query data row label status, metadata, batch, annotations, and workflow history. Note that excluding fields from your export will make the export process faster and the export file smaller.

To export your annotations via the Labelbox UI, follow these steps:

1. Select a model run.
2. Configure your export.

### Option 1: Export all data rows

Click **All data (X number of data rows)**, then click **Export data** in the drop-down menu.

### Option 2: Export a split or a slice

1. On the left side of the Model Run page, click one of the Slices or Splits options (Training, Validation, Test, Unlabeled).
2. Click **Slice/Split name - X number of data rows**, then click **Export data** in the drop-down menu.

### Option 3: Filter data rows

1. Build your filters in the Model filters to query for data rows of interest.
2. Click **Select all** on the right side.
3. Click the **Manage selection (X number of data rows)** dropdown, then click **Export data** to export the data rows.

### Option 4: Select specific data rows

1. Hand-select data rows to export using the checkboxes next to each data row.
2.
Then click the **Manage selection (X number of data rows)** dropdown, and click **Export data** to export the data rows.
3. After you click **Export JSON**, you will see a notification banner telling you that you can track the progress of the export job in the **Notifications** center.
4. Once the job is complete, download the export file by clicking the **Download** link.

# Export data from Catalog

Source: https://docs.labelbox.com/docs/export-from-catalog

Instructions for exporting your data from Catalog via the UI.

There are three ways you can export data from Labelbox: export from Catalog, export from a model run, and export from a labeling project. This guide explains how to select and export data from Catalog. Exporting from Catalog allows you to include the project and/or model run information when you export the data rows.

## How to export via the app UI

From the **Catalog** tab, you can apply and combine filters to query data rows based on similarity, natural language search, annotations, metadata, and more. Then, you can choose to include the associated project and/or model run information. Note that excluding optional fields from your export will make the process faster and the export file smaller.

To export data rows from Catalog, follow these steps:

### Step 1: Select data rows

Navigate to **Catalog**. Select a filter if you wish to narrow down the data rows to export. Click **Select all** in the top-right corner. Open the dropdown under **Manage selection (X number of data rows)** and select **Export data**.

### Step 2: Select export fields to include

Select the optional fields that you wish to include in the export. Definitions for these fields can be found in the [Export Glossary](/reference/export-glossary).

### Step 3: Include project data

When you select **Export labels from project**, you will be prompted to select one or more projects from the dropdown.
Only projects in which one or more of the selected data rows have been labeled will appear in the dropdown. For the selected projects, all labels made in the project will be included in the NDJSON for each respective data row.

### Step 4: Include model run data

When you select **Export labels and predictions from model runs**, you will be prompted to select one or more model runs from the dropdown. Only model runs that include one or more of the selected data rows will appear in the dropdown. For the selected model runs, all labels and predictions made in the model run will be included in the NDJSON for each respective data row.

## How to specify data row source

### Option 1: Export from all datasets

1. Select **All datasets** in the top-left corner.
2. Apply a filter or combination of filters.
3. Click **Select all** in the top-right corner.
4. Click **Manage selection (X number of data rows)**, then select **Export data** in the dropdown menu and select any desired optional fields.

### Option 2: Export from one dataset

1. Select a dataset from the list of datasets in the left side menu.
2. Apply a filter or combination of filters, if desired.
3. Click **Select all** in the top-right corner.
4. Click **Manage selection (X number of data rows)**, then select **Export data** in the dropdown menu and select any desired optional fields.

### Option 3: Export from a slice

1. Select **Slices** in the toggle on the left side menu.
2. Select an existing slice.
3. Modify or complement the filters that comprise the slice, if desired.
4. Click **Select all** in the top-right corner.
5. Click **Manage selection (X number of data rows)**, then select **Export data** in the dropdown menu and select any desired optional fields.

### Option 4: Export specific data rows

1. Hand-select data rows to export using the checkboxes in the top-left corner of the thumbnail of each data row.
2.
Click **Manage selection (X number of data rows)** in the top-right corner, then select **Export data** in the dropdown menu and select any desired optional fields.

# Export labels from project

Source: https://docs.labelbox.com/docs/export-labels

Instructions for exporting your annotations from a labeling project via the UI.

You can export data from Labelbox in three ways: export from Catalog, export from a model run, and export from a labeling project. This guide explains how to select and export annotations from a labeling project.

## Step 1: Select the data rows for export

To export annotations from a labeling project via the UI, go to **Annotate** and select a project that has *at least one* label in the **Completed** column. Within the project, go to the **Data Rows** tab. You have several options for specifying the data rows to include in the export. The options for exporting from a project are listed below.

### Option 1: Export all data rows

There are two ways to export all data rows from the **Data Rows** tab:

1. Select **All (X data rows)**, then select **Export data** from the dropdown menu.
2. Click **All (X data rows)**. Then, tick the checkbox in the top left of the table header. Finally, from the **X selected** dropdown menu, select **Export data**.

### Option 2: Quick filter by label status

Follow these steps to filter and export data rows by label status.

1. Next to the "Search your data" field, select a label status (**All**, **In review**, **In rework**, **Skipped**, **Done**, **To label**). You may only select one label status at a time.
2. In the table, hand-select the data rows to include in the export (or tick the checkbox in the top left of the table header).
3. From the **X selected** menu, select **Export data**.
### Option 3: Filter by data row attributes

After selecting the label status, you can build filters in the **Data Rows** tab to query for data rows that match certain attributes (e.g., Dataset, Batch, Find text, Annotation, Metadata). To filter and export data rows, follow these steps:

1. Click **Search your data** to display the filters.
2. Add filters to narrow down your search results.
3. In the table, select the data rows to include in the export.
4. From the **X selected** menu, select **Export data**.

## Step 2: Select fields to include in the export file

After you narrow down the data rows to export, you can specify which fields to include in the export file. Follow the steps below to select additional fields for export.

1. When you select **Export data**, you will see a panel with additional date range queries and filters.
   1. Choose the fields to include or exclude in your final export, or click **Select all**. For descriptions of these export fields, visit our [export specifications](/reference/export-overview).
   2. When you have selected your fields, select **Export JSON**.
2. After you select **Export JSON**, you will see a notification banner instructing you to check the status of the export job in the **Notifications** center.
3. To access notifications, hover over the bell icon at the bottom of the left navbar, then select **Notifications**.
4. From the **Notifications** center, use the filters to find your export task.
5. Download the export file by selecting the **Download** icon.

# Export model predictions

Source: https://docs.labelbox.com/docs/export-predictions

Describes how to export Foundry predictions and the options you can choose while doing so.

Exporting predictions from Foundry allows you to use your model's output outside of the Labelbox platform. This is useful for a variety of tasks, such as:

* **Offline analysis:** Analyze model performance in a Jupyter notebook, Excel, or other data analysis tools.
* **Integration with other systems:** Feed predictions into downstream applications or custom MLOps pipelines.
* **Custom reporting:** Create detailed reports and visualizations on model accuracy and behavior.

You can start an export from three main areas in the Labelbox platform, depending on what you need:

| Location | Use case |
| --- | --- |
| Catalog | Best for when you want to search for and select specific data rows that have predictions you want to export. |
| Model | Ideal for when you need to export predictions from a particular model run. |
| Annotate | Use this when you want to export predictions for data rows that are part of a specific project. |

## Step-by-step instructions

Follow these simple steps to export your predictions:

### Step 1: Select your data

First, navigate to the **Catalog**, **Model**, or **Annotate** section and select the data rows with the predictions you want to export.

### Step 2: Start the export

From the **Manage selection** menu that appears, choose **Export data**. This will open the **Export** panel where you can customize your export.

### Step 3: Choose your export options

In the **Export** panel, you can select what information you want to include in your export file. Depending on where you export your data from (i.e., Catalog, Annotate, Model), you can choose to include these details in your export:

* Data row details
* Metadata
* Attachments
* MMC code executions
* Project details
* Performance details
* Label details
* Interpolated frames
* **Model run details**
* **Predictions**
* Embeddings
* Model type override
* Export labels from projects
* **Export labels and predictions from model runs**

### Step 4: Generate the export file

Once you've selected your options, click **Export JSON**. Labelbox will then start building your export file.
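When you run the same export through the Python SDK instead of the UI, the checkbox choices above become boolean export parameters, and the finished file is NDJSON (one JSON object per data row). The sketch below stays offline: the parameter names mirror the SDK's export params as we understand them, and the two-line NDJSON sample is invented, so verify both against the export reference for your SDK version.

```python
import json

# Export options expressed as SDK-style boolean parameters.
# Treat the exact key names as assumptions; check the export reference.
export_params = {
    "data_row_details": True,
    "metadata_fields": True,
    "attachments": False,
    "project_details": True,
    "label_details": True,
    "performance_details": False,
    "interpolated_frames": False,
    "predictions": True,
    "embeddings": False,
}
# In a real script you would pass this to something like
# project.export(params=export_params) and wait for the task to finish.

# A hypothetical two-line NDJSON result; real exports carry far more fields.
export_ndjson = "\n".join([
    json.dumps({"data_row": {"id": "dr-1", "external_id": "doc_001.pdf"}}),
    json.dumps({"data_row": {"id": "dr-2", "external_id": "doc_002.pdf"}}),
])

# NDJSON is parsed line by line, not as one JSON document.
rows = [json.loads(line) for line in export_ndjson.splitlines() if line.strip()]
row_ids = [r["data_row"]["id"] for r in rows]
```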
### Step 5: Download your results

When your export is ready, you'll see a **Download** link in the **Notifications Center**. Clicking this gives you two options for downloading your results:

| Download method | Description |
| --- | --- |
| Browser download | The quickest way to get your file. Click the **Download** button to save the results directly to your computer. **Important:** Don't close your browser until the download is complete, or the file will be incomplete. |
| Python SDK script | For large datasets, it's better to download using a Python script; the download runs in the background, so you don't need to keep your browser open. Click the **Code sample** button to view a sample Python script based on your selected options. You'll need a Labelbox API key to run it. |

# Find text

Source: https://docs.labelbox.com/docs/find-text

Search for text and uncover assets in Catalog.

When working with text data such as [text](/reference/import-text-annotations), [conversational text](/reference/export-conversational-text-annotations), [documents](/reference/import-document-annotations), or [HTML](/docs/html-editor), you can use the **find text** filter in Catalog to surface data rows that contain specific keywords.

## How the find text filter works

When you use the **find text** filter, you can specify multiple keywords in the expression. Labelbox then surfaces data rows that contain those words.

### Character/word limits

To view the minimum and maximum characters/words for the *find text* filter, visit our [limits](/docs/limits) page.

If multiple words are provided in the expression, Labelbox surfaces data rows in this order:

1. Data rows containing all words
2. Data rows containing all words except one
3. Data rows containing all words except two

... and so on until all results containing at least one of the provided words are returned.

## Supported media types

The filter is supported for the following media types:

| Media type | Limit |
| --- | --- |
| [Text](/docs/text-editor) | First 64k characters |
| [Conversational text](/docs/conversational-editor) | First 64k characters |
| [HTML](/docs/html-editor) | First 64k characters |
| [Documents](/docs/document-editor) | First 64k characters |

The limits above indicate how far into each media file the filter searches. For example, with the current limits, if the keyword you are searching for appears after the 64,000th character in the media file, the data row will not appear in the filter results.
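The ordering described above (all words first, then progressively fewer matches) can be approximated locally for intuition. This is a deliberate simplification: Labelbox's actual search applies its own tokenization and relevance logic, and the sample texts below are invented.

```python
def rank_by_matches(texts, keywords):
    """Sort texts by how many of the given keywords they contain, most matches first."""
    def matches(text):
        lowered = text.lower()
        return sum(1 for kw in keywords if kw.lower() in lowered)
    return sorted(texts, key=matches, reverse=True)

docs = [
    "the cat sat on the mat",    # matches "cat" and "mat"
    "a dog in the fog",          # matches "dog" only
    "cat and dog share a mat",   # matches all three keywords
]
print(rank_by_matches(docs, ["cat", "dog", "mat"]))
# → ['cat and dog share a mat', 'the cat sat on the mat', 'a dog in the fog']
```

Data rows matching every keyword surface first, then those missing one keyword, and so on, mirroring the fallback order described above.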
## Combine find text with other filters

You can combine the **find text** filter with other filters in Catalog. Some filters are best suited to targeting *unstructured* data, while others are best for targeting *structured* data; **find text** can be combined with either kind to narrow your results.

## Automate data curation with slices

After you populate the Catalog filters, you can save them as a [slice](/docs/slices) of data. Saving a filter as a slice means you don't need to reapply the same filters each time. Slices are also dynamic, so new incoming data rows automatically appear in the relevant slices.

# Labelbox Foundry

Source: https://docs.labelbox.com/docs/foundry

Welcome to Labelbox Foundry, your integrated environment for leveraging foundation models and custom AI to accelerate your data labeling and model development workflows. Foundry is designed to help you move from raw, unlabeled data to high-quality training data with speed and efficiency.

With Foundry, you can:

* **Generate high-quality predictions:** Use state-of-the-art foundation models to automatically generate labels for your data.
* **Compare model performance:** Run multiple models on the same data to evaluate and compare their performance, helping you choose the best model for your use case.
* **Automate labeling workflows:** Set up Foundry Apps to create repeatable, automated labeling pipelines that can be triggered from the Catalog or via API.
* **Drive active learning workflows:** Identify the most valuable data to label by finding where your model is least confident, and send those assets directly to a human labeling project.
* **Integrate custom models:** Bring your own models into the Labelbox ecosystem to take advantage of Foundry's powerful workflow tools.
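The active learning idea above (prioritize the assets where the model is least confident) reduces to a simple selection step. The sketch below is a local illustration only; the prediction records and their `confidence` scores are invented, not a Labelbox API response.

```python
# Hypothetical prediction records; "confidence" scores are invented for the sketch.
predictions = [
    {"data_row_id": "dr-1", "confidence": 0.92},
    {"data_row_id": "dr-2", "confidence": 0.41},
    {"data_row_id": "dr-3", "confidence": 0.77},
    {"data_row_id": "dr-4", "confidence": 0.33},
]

def least_confident(preds, k):
    """Return the k predictions the model is least sure about,
    i.e., the best candidates to route to a human labeling project."""
    return sorted(preds, key=lambda p: p["confidence"])[:k]

for p in least_confident(predictions, 2):
    print(p["data_row_id"], p["confidence"])
```

In practice you would send the selected data rows to an Annotate project for human review rather than printing them.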
Foundry helps you close the gap between your model's capabilities and your production needs, streamlining the path to building better AI. ## Key concepts | Concept | Description | | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Model run | The core process in Foundry. A model run is a single execution of a model on a selected batch of data rows. It generates predictions for each data row. | | Foundry app | A saved model run configuration. Foundry Apps allow you to save a specific model and its settings so you can easily and repeatedly run it on new batches of data, automating your workflows. | | Predictions | The output generated by a model run. Predictions are the labels or values that the model "predicts" for your data (e.g., bounding boxes, classifications). | | Ontology | The schema that defines the features your model can predict. You will map the model's output to your project's ontology to ensure consistency. | ## Before you begin To ensure a smooth experience with Foundry, please complete the following setup steps: 1. [**Connect your cloud data**](/docs/connect-to-cloud-storage)**:** Foundry works directly with your data stored in the cloud. Make sure you have successfully connected your AWS, GCP, or Azure cloud storage to Labelbox. 2. [**Create a project**](/docs/create-a-project)**:** You'll need a Labelbox project set up with a defined ontology. This project will be the destination for predictions that need human review. 3. **Select your data:** In the Labelbox Catalog, identify and select the data rows you wish to use for your model run. ## Add Foundry to your workspace Foundry is available for all plans except for Educational subscriptions. * **Self-Service (Free, Starter):** Go to your workspace settings to enable Foundry as an add-on. 
You will be prompted to agree to the terms and add a payment method if needed.
* **Enterprise:** Please contact your account manager to enable Foundry.

### How billing works

Foundry billing has two components: inference costs and Labelbox Units (LBUs).

| Cost type | Description |
| --- | --- |
| Inference costs | This is a direct cost for using a Labelbox-hosted model, charged in USD. It varies by model and the amount of data processed. These fees are passed on to you and charged immediately after a model run. |
| Labelbox Units (LBUs) | Your Labelbox subscription includes a set number of LBUs. Foundry consumes LBUs when you run a model and when you send predictions to a labeling project. This usage is deducted from your subscription's LBU balance. |

For example:

1. Suppose you import 1,000 images into Catalog; this consumes 17 LBUs.
2. You select 500 images and use a Foundry model run to generate predictions. Based on the model you selected and the parameters of your model run, this generates a \$2.00 inference fee.
3. When the Foundry model run is complete, 500 images and their predictions are available in Model. This consumes 100 LBUs.
4. To verify the predictions, you send them to Annotate for human review. This consumes another 500 LBUs.

Overall, you've generated \$2 in inference costs (Step 2), which are charged immediately. You've also used 600 LBUs as one-time charges (Model and Annotate), and you incur a 17-LBU charge each month your data remains in **Catalog**. The LBU consumption is charged against the terms of your subscription at the end of the current billing cycle.

Compute fees depend on the specific model used, the amount of data processed, and other factors. For details, consult the model card.
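The scenario above can be tallied directly; all figures come from the worked example in the text.

```python
# Figures from the billing example above.
catalog_lbus_per_month = 17   # 1,000 images stored in Catalog (recurring monthly)
model_run_lbus = 100          # 500 images plus predictions landing in Model
annotate_lbus = 500           # sending 500 predictions to Annotate for review
inference_cost_usd = 2.00     # charged immediately after the model run

one_time_lbus = model_run_lbus + annotate_lbus
print(f"one-time LBUs: {one_time_lbus}")         # one-time LBUs: 600
print(f"monthly LBUs:  {catalog_lbus_per_month}")
print(f"inference fee: ${inference_cost_usd:.2f} (charged immediately)")
```

The split matters for forecasting: inference fees are billed immediately, the Model and Annotate LBUs are one-time charges, and the Catalog LBUs recur monthly while the data remains stored.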
To learn more about LBUs, see [Labelbox Units (LBUs)](https://docs.labelbox.com/docs/billing#labelbox-units-lbus).

You can review the specific costs for any model by selecting it in the Model gallery and viewing the **Pricing** details on its overview tab. The total cost of any completed model run can be found in the run's details.

## Remove Foundry from your workspace

You can unsubscribe from Foundry and remove it from your Labelbox subscription at any time. To do so:

1. Select **Workspace Settings** from the Labelbox main menu to open the **Organization Settings**.
2. From the **Billing** tab, locate the **Add-ons** section and then select **Remove**.
3. A confirmation prompt appears. Select **Unsubscribe** to confirm.

Removing Foundry from your subscription doesn't affect your data. Any predictions created during the subscription remain.

# Create a Foundry app

Source: https://docs.labelbox.com/docs/foundry-apps

Foundry apps help you automate Foundry workflows. Learn how to create and manage Foundry apps.

As you move from experimenting with models to building production-grade AI systems, you need tools that provide consistency, scalability, and automation. **Foundry Apps** are the key to achieving this in Labelbox. A Foundry App is more than just a saved model; it's a *saved, reusable, and executable workflow configuration*. It captures the entire logic of a model run (the specific model, the ontology mapping, the confidence threshold, and naming conventions) and packages it into a single, on-demand tool.

## Create a Foundry app

There are two main ways to create a Foundry app in Labelbox:

* **From the Model section**: This is the most direct way to create a new app.
* **From a model run configuration**: If you've already configured a model run that you'd like to reuse, you can save it as an app.

### Option 1: Create an app from the Model tab

1. Sign in to Labelbox and select **Model** from the main menu.
2.
In the Model gallery, click the **Create** menu and select **App**.
3. Choose the foundational model you want to use for your app.
4. Give your app a name and an optional description, then click **Proceed**.
5. Define the parameters for your model run. If you have a previous model run with settings you'd like to reuse, you can use the **Load model run config** option.
6. Optionally, you can select up to ten data rows to preview how the app will work. To do this, click **Select data for preview**, choose your data rows, and then click **Import selected**.
7. When you're finished, click **Save & Create**.

### Option 2: Save a model run configuration as an app

1. Start a regular Foundry model run by selecting your data rows and a model.
2. Configure your model run settings as you normally would.
3. Before submitting the model run, click the **Save as App** button, which appears near the model's name.
4. You'll be prompted to enter a name and description for your new app.
5. Click **Proceed** to create the app.

The **Save as App** button is not displayed for models that are incompatible with Foundry apps.

## Update a Foundry app

To make changes to an existing Foundry app:

1. Sign in to Labelbox and go to the **Model** section.
2. Click the **Apps** tab and select the app you want to update.
3. Adjust the app's settings as needed.
4. When you're done, click **Save & Update**.

## Delete a Foundry app

If you no longer need a Foundry app, you can delete it:

1. Sign in to the Labelbox app and go to the **Model** section.
2. From the **Apps** tab, select the app you want to delete.
3. Go to the **Settings** tab and click the **Delete app** button.
4. A confirmation prompt will appear. Type the word "delete" and then click **Delete**.

Use care when deleting apps. Deleted apps cannot be recovered or restored.
# Create pre-labels with Foundry model-assisted labeling

Source: https://docs.labelbox.com/docs/foundry-model-assisted-labeling

Use an off-the-shelf Foundry model to pre-label your data rows in your project.

Model-assisted labeling is a feature in Labelbox that uses AI to automatically generate labels on your data. This can help you label your data faster and more accurately. Here's how it works:

* **You provide a model**: You can choose from a list of pre-trained models in Labelbox or bring your own.
* **The model predicts labels**: The model analyzes your data and suggests labels based on what it has learned.
* **You review and approve**: You can then review the suggested labels and make any necessary corrections.

This process can significantly speed up your labeling workflow, especially for large datasets.

## How to enable model-assisted labeling

Here's a step-by-step guide to using model-assisted labeling in your Labelbox project.

1. **Set up a new project**: To get started, create a new project in Labelbox. Make sure your project has an ontology and a batch of data that is compatible with Foundry. Once your project is set up, the **Model assisted labeling** button next to **Start labeling** will be enabled.
2. **Choose your model**: Click the **Model assisted labeling** button. You will be taken to a page where you can choose the model you want to use. Labelbox offers a variety of models; the models at the top of the list are the easiest to set up, but you can also use one of the other models if you prefer.
3. **Configure your model**: After you have selected a model, you will need to configure it. This involves two main steps:
   * **Align the model with your ontology**: Map the labels that the model can predict to the labels in your project's ontology.
   * **Set the confidence threshold**: This is a value between 0 and 1 that tells the model how certain it needs to be before it suggests a label.
A higher confidence threshold results in fewer, but more accurate, suggestions. If your ontology is open-ended, you can also provide your own criteria to the model, similar to prompt engineering.
4. **Preview the selected model**: Before you apply the model to your entire dataset, you can generate a preview to see how it performs on a small sample of your data. This is a great way to test your model and make sure it is performing as you expect. During the preview, you can:
   * **Adjust the confidence threshold** to find the optimal setting for your use case.
   * **Highlight annotations** to see which tool was used and the model's confidence for each annotation.
   If the model is not performing well, you can go back and choose a different model. If you are happy with the results, click **Submit**.
5. **Submit the model run**: After you submit the model, it starts running on your data. You can track its progress in the **Notifications** tab at the bottom left of the page. When the model has finished running, you will see a confirmation in the **Notifications** tab, and the results will be shown in your project.
6. **Review the pre-labels**: Once the model has finished running, the suggested annotations are superimposed on your data rows. You can then review the suggestions and either submit them as they are or remove them and create your own annotations.

# HTML

Source: https://docs.labelbox.com/docs/html-editor

A guide for using the native HTML editor to label data.

When creating a project, select **HTML** as the media type.
You can use the native HTML editor to customize the way your data rows render in the labeling interface. Use this guide to understand how the HTML editor works, how to set up an HTML labeling project, and common configurations for the HTML editor.

## Import HTML data

To learn how to prepare and format your HTML data for importing to Labelbox, see [Import HTML data](/reference/import-html-annotations). You have two options for uploading to Labelbox:

1. Store your HTML files in your cloud storage and send Labelbox a JSON file containing URL links to the HTML files (recommended)
2. Upload the HTML files directly to Labelbox

Note: To label a webpage, you must save it as an HTML file. Direct linking to web pages via a URL is not supported.

See the Custom HTML editor configurations section below to explore ways to configure your import file to render your data in the editor.

## Supported annotation types

Create (or reuse) an ontology for your labeling project. The table below shows the annotation types that are supported for labeling HTML data. Click through the links in the table to view the import and export formats for each classification type.

Note: You can use [Model-assisted labeling](/docs/model-assisted-labeling) to import classifications as prelabels to your HTML data rows.
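As a sketch of the recommended upload option (cloud-hosted HTML files referenced by URL), the snippet below assembles an import payload in Python. The URLs are placeholders, and the `row_data`/`global_key` field names follow the Python SDK's general data-row import convention; treat the exact shape as an assumption to verify against the import reference.

```python
# Placeholder URLs pointing at HTML files in your own cloud storage.
html_urls = [
    "https://storage.example.com/pages/page-001.html",
    "https://storage.example.com/pages/page-002.html",
]

# One dict per data row; this is the kind of list you would pass to a
# data-row creation call in the Python SDK (field names assumed).
data_rows = [
    {"row_data": url, "global_key": url.rsplit("/", 1)[-1]}
    for url in html_urls
]
print(data_rows[0])
```

Using a stable identifier (here, the file name) as the global key makes it easier to match annotations back to source files after export.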
| Feature | Import format | Export format | | --------------------------------- | --------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- | | **Radio classification** | [See payload](/reference/import-html-annotations#classification-radio-single-choice) | [See payload](/reference/export-html-annotations#classification---radio) | | **Checklist classification** | [See payload](/reference/import-html-annotations#classification-checklist-multi-choice) | [See payload](/reference/export-html-annotations#classification---checklist) | | **Free-form text classification** | [See payload](/reference/import-html-annotations#classification-free-form-text) | [See payload](/reference/export-html-annotations#classification---free-form-text) | Note: HTML files are rendered in Labelbox as an iframe with the sandbox attribute. This means that scripts, pop-ups, and third-party API calls are blocked. ## Custom HTML editor configurations You can leverage the HTML format to customize the way your data appears in the labeling interface. Here are some common configurations our customers set up using the HTML editor. ### Compare and rank products Add CSS to your HTML file to display objects side-by-side in a grid container. This is helpful for ranking or comparing multiple products in the labeling interface. To do this, ensure your CSS display property is set to `grid`. # Hubstaff timer Source: https://docs.labelbox.com/docs/hubstaff-timer This guide explains how to configure your Hubstaff app for labeling. Use the instructions below as a guide for navigating the Hubstaff timer for your labeling tasks. ## Step 1: Create or join your Hubstaff account Follow these steps to set up your Hubstaff account and install the desktop app. 1. You should have received an email containing an invitation to Hubstaff. 
Click the link in the email to create your Hubstaff account (you can also set up your Hubstaff account via the onboarding checklist in your Alignerr app dashboard).
2. If you already have a Hubstaff account, the Labelbox team will add you to our organization. You'll receive an email inviting you to join the Alignerr organization in Hubstaff.

## Step 2: Install the Hubstaff desktop app

1. Download and install the Hubstaff desktop app on your computer.
2. Sign in using the credentials from the invitation email.

## Step 3: Open your project in Labelbox

1. Go to your assigned project in Labelbox.
2. Review the labeling instructions.
3. In the project overview tab, click the **Start timer** button. This opens the Hubstaff browser extension (you will need to install the extension if you haven't already).
4. In the pop-up window, select **Always allow** and click **Open Hubstaff**.
5. When the Labelbox app has detected that the Hubstaff timer is enabled, you will be able to start labeling.

## Step 4: Start labeling

1. Click **Start working** to open the labeling task.

### Case 1: Labeling in an external platform

1. When you click **Start working**, you may be directed to another platform to complete your labeling task.
2. Labelbox automatically detects that you have opened the labeling task and starts the Hubstaff timer.
3. Complete the labeling tasks.
4. When you are done labeling, return to the Labelbox app and click **Stop timer**.

### Case 2: Labeling in the Labelbox platform

1. When you click **Start working**, you may be directed to the Labelbox editor to complete the labeling task.
2. Before you can begin labeling, you'll be prompted to acknowledge that two timers are running (Labelbox & Hubstaff) and that all work will be paid according to the Labelbox timer.
3. If the Hubstaff timer is not running, the submit button in the Labelbox app will be disabled. You will see a prompt asking you to confirm when the Hubstaff timer is enabled.
4.
Complete the labeling tasks.
5. When you are done labeling in the Labelbox editor, you do not need to explicitly start or stop any timer; the app starts and stops your timer automatically.
6. When you have completed the task, submit your work in the Labelbox editor.

# Images

Source: https://docs.labelbox.com/docs/image-editor

Guide for labeling image data.

The image editor lets you label images with various supported annotation types, attachments, and editor settings.

## Set up image annotation projects

To set up an image annotation project:

1. [Create an image dataset](/docs/datasets-datarows) and load the images you want to label. Alternatively, use the SDK to [import images](/reference/image).
2. On the [Annotate projects page](https://app.labelbox.com/projects), click the **+ New project** button.
3. Select **Image**. Add a **Name** and an optional **Description** for your project.
4. Click **Save**. The system then creates the project and redirects you to the project overview page.
5. Click **Add data**, then select your image dataset. Click **Sample** to sample your dataset, or manually select data rows and click **Queue batch**.

### Data row size limit

To view the maximum size allowed for a data row, see [limits](/docs/limits).

## Set up an ontology

After setting up an image annotation project, you can [add an ontology](/docs/labelbox-ontology#create-a-new-ontology) based on how you want to label the data.
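For orientation, a minimal ontology definition for an image project might look like the following sketch. The structure loosely mirrors Labelbox's normalized ontology JSON, but the tool names, field names, and option values here are assumptions for illustration, not the authoritative schema.

```python
# Illustrative ontology definition for an image project (structure and
# names are assumptions to verify against the ontology documentation).
ontology = {
    "tools": [
        {"tool": "rectangle", "name": "vehicle"},  # bounding box tool
        {"tool": "polygon", "name": "road"},
    ],
    "classifications": [
        {
            "type": "radio",
            "name": "weather",
            "options": [{"value": "sunny"}, {"value": "rainy"}],
        },
    ],
}

tool_names = [t["name"] for t in ontology["tools"]]
print(tool_names)  # ['vehicle', 'road']
```

Keeping object tools and global classifications in separate lists reflects the distinction the editor makes between drawn annotations and data-row-level questions.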
The image editor supports the following annotation types that you can include in your ontology: | Feature | Import annotation | Export annotation | | --------------------------------- | ---------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | | **Bounding box** | [See payload](/reference/import-image-annotations#bounding-box) | [See payload](/reference/export-image-annotations#bounding-box) | | **Segmentation mask** | [See payload](/reference/import-image-annotations#segmentation-mask) | [See payload](/reference/export-image-annotations#mask) | | **Polygon** | [See payload](/reference/import-image-annotations#polygon) | [See payload](/reference/export-image-annotations#polygon) | | **Polyline** | [See payload](/reference/import-image-annotations#polyline) | [See payload](/reference/export-image-annotations#polyline) | | **Point** | [See payload](/reference/import-image-annotations#point) | [See payload](/reference/export-image-annotations#point) | | **Relationship** | [See payload](/reference/import-image-annotations#relationship-with-bounding-box) | [See payload](/reference/export-image-annotations#relationship) | | **Radio classification** | [See payload](/reference/import-image-annotations#classification-radio-single-choice) | [See payload](/reference/export-image-annotations#classification---radio) | | **Checklist classification** | [See payload](/reference/import-image-annotations#classification-checklist-multi-choice) | [See payload](/reference/export-image-annotations#classification---checklist) | | **Free-form text classification** | [See payload](/reference/import-image-annotations#classification-free-form-text) | [See payload](/reference/export-image-annotations#classification---free-form-text) | ### Classification scopes Classification annotations can be applied globally or nested within an object-type annotation. 
### Audio-to-text for text classification

You can enable free-text classification to allow labelers to record audio instead of typing text manually. The system transcribes the recorded audio into text using the integrated OpenAI Whisper model.

To set up and use audio-to-text for text classification:

1. When [creating the ontology](/docs/labelbox-ontology#step-1-create), add a **Text** classification feature and enable **Record audio for transcription**.
2. In the editor, select the text classification annotation and click **Record audio**. Each recording can last up to 30 minutes and automatically stops when it reaches the limit.
3. Click **Stop recording** when you finish speaking. The system transcribes the audio into text and adds the recording as an attachment to the annotation.
4. Edit the transcribed text, remove the attached audio file, or click **Record audio** to re-record if needed.
5. Click **Submit** to save the annotation.

### Bounding box

Create a bounding box by starting at one corner and dragging your cursor to create the shape around an object in the image. You can also click and drag to reposition the bounding box on the image.

### Cuboid

Create a cuboid by starting at one corner and dragging your cursor to create a box shape around an object in the image. Once you release the cursor, the box automatically becomes a cuboid.

You can use the various levers on the cuboid tool to adjust its rotation along the x, y, and z axes. Along the center of the top bar, you will find buttons to switch to **Rotate** mode, **Move** mode, or **Scale** mode. You can also directly input the rotation into the modal that appears in the ontology pane once the cuboid is initially drawn or selected for editing. The fields available to edit are:

* **Scale:** Controls the cuboid size in each dimension, in pixels.
* **Rotate:** Controls the rotation of the cuboid along the three axes.
For more details on the rotation, please refer to the diagram below.

### Segmentation mask

The segmentation mask annotation supports different tools depending on when the ontology was created.

### Empty mask error

If you receive an error message that reads, "The object is missing selected pixels. You should relabel or delete this object", it means the mask is considered empty. By definition, segmentation masks cannot have overlapping pixels, so if a second mask is drawn over all of an existing mask's pixels, the existing mask is left empty, resulting in this warning message.

#### AutoSegment 2.0

Ontologies created after June 13, 2023 use the *AutoSegment 2.0* tool by default, powered by Meta's Segment Anything model. You can use AutoSegment 2.0 to generate mask predictions for individual objects in your image. The tool is designed for instance segmentation, meaning you can quickly label individual objects as it predicts one class at a time. Adjusting the contrast and brightness of the image **does not** affect the accuracy of the SAM model.

When selecting a segmentation tool, choose the AutoSegment option from the top navigation bar. There are two modes available:

* **Box mode**: Draw a box around an object to generate a mask on it.
* **Hover and Click mode**: Visualize masks as you move the cursor around, click to generate a mask, and refine it by adding positive and negative points.

#### AutoSegment (Legacy)

Only ontologies created before June 13, 2023 use this older version of AutoSegment. When a segmentation tool is selected, choose the AutoSegment option from the top navigation bar, and draw a box around the object you want to label. For best results, draw the boundary of the box close to the object edges. You can also adjust the brightness and contrast settings in the **Adjustments** menu in the top navigation bar to further improve results.
Creating more contrast between the object you are labeling and its background will help the model detect the object more accurately.

#### Brush tool

Ontologies created after June 13, 2023 can access the brush tool. Use the brush tool to draw a freehand mask as though you are painting the canvas. You can choose a circle- or square-shaped brush and adjust its size in pixels. Click the brush icon with the minus sign to use the eraser tool; it behaves similarly but removes mask pixels as you move the mouse.

#### Pen tool

Use the pen tool to outline the item in the image. Hold the mouse button down to draw freehand, or release it to draw straight lines between points. The pen tool is only available when creating segmentation annotations. Click the pen icon with the plus sign to use the pen tool.

#### Erase tool

You can use the erase tool to clean up the edges of a segmentation mask. Click the pen icon with the minus sign to use the erase tool.

#### Fill tool

You can use the fill tool to label backgrounds; it assigns a segmentation annotation to all pixels in the image that have not already been labeled. The fill tool is only enabled for segmentation annotations. Click the droplet icon to use the fill tool.

### Polygon

Create a polygon annotation by clicking to create each point in the shape. Click the first point to close the polygon.

### Polyline

Use the polyline tool to label lines in an image. Click on the last point to complete the shape.

### Point

Use the point tool to label precise locations on the image.

### Relationship

To create a relationship between annotations, select a relationship tool and hover over the annotation where you want the relationship to start to reveal its anchor points. Click an anchor point to create the starting point of the relationship, then move your mouse over the annotation you want to relate it to, hovering to reveal its anchor points.
Finally, click one of the anchor points to complete the relationship. Right-click a relationship to change its direction, make it bi-directional, or delete it from the asset. ## Editor settings The image editor has the following settings for object-type annotations: * **Overlay object titles**: Displays the names of annotation objects on the image. * **Show annotations**: Displays created annotation objects on the image. * **Enable annotations occlusion**: Allows annotation objects to overlap based on z-order (front objects occlude those behind them). * **Polygon snapping**: Allows points on lines, polygons, and bounding boxes to align or attach to polygon edges. ### Setting object positions When the **Enable annotations occlusion** setting is active, you can adjust the z-order of annotation objects to control how they stack visually. The **OBJECTS** panel provides two modes for managing object positions: * **Universal arrangement**: Adjust the stacking order incrementally by selecting and repositioning individual objects. * **List-based arrangement**: Use a drag-and-drop interface to reorder objects directly in a list. ### Z-order settings limit All object-type annotations have a z-order value that you can [export](/reference/export-image-annotations#sample-project-export), but you can't change the z-order value for cuboids and segmentation masks because they can't occlude other objects. # Connect AWS S3 to Labelbox via IAM Delegated Access Source: https://docs.labelbox.com/docs/import-aws-s3-data Step-by-step guide for importing your S3 bucket data to Labelbox via IAM delegated access. This guide provides a complete set of step-by-step instructions for securely connecting your Amazon S3 bucket data to Labelbox using IAM delegated access. ## Prerequisites Before you begin, ensure you have the following: * You have permissions to create IAM roles and policies in your AWS account. * You know the name of the S3 bucket you want to connect. 
* You have configured [Cross-Origin Resource Sharing (CORS)](/docs/create-cors-headers) on your S3 bucket to allow Labelbox to request resources from your cloud storage.

## Step 1: Begin integration in Labelbox

First, you'll start the integration process in the Labelbox UI to obtain the necessary credentials.

1. In Labelbox, navigate to **Settings** > **Integrations**.
2. Under **Add integrations**, select **Sync from a source**.
3. Select **AWS** as your source.
4. From the **Create AWS integration** page, copy the Labelbox AWS account ID and the External ID.
5. Leave this page open in your browser: you will return to it in a later step.

## Step 2: Create a role for Labelbox in AWS

Next, you will create a role in your AWS account that Labelbox can assume to access your S3 bucket.

### Part A: Create a permission policy

1. In your AWS account, navigate to the **IAM Management Console > Policies** page.
2. Click **Create policy** and select the **JSON** policy editor.
3. Paste the following JSON policy, which grants read-only access to a specific S3 bucket.

```json theme={null}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::CustomerBucketARN/*",
        "arn:aws:s3:::CustomerBucketARN"
      ]
    }
  ]
}
```

Remember to replace `CustomerBucketARN` with the actual ARN of your S3 bucket.

4. Add a name for the policy (for example, `LabelboxReadAccess`) and click **Create policy**.

### Part B: Create a role

1. From the **Roles** page in the IAM Management Console, click **Create role**.
2. Select **Custom trust policy** and paste the following policy. Be sure to enter the **External ID** you obtained from Labelbox in Step 1.
```json theme={null}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::340636424752:role/lb-aws-delegated-access-role"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": ""
        }
      }
    }
  ]
}
```

3. In the **Add Permissions** step, attach the permission policy you created in Part A (e.g., `LabelboxReadAccess`).
4. Add a name for the role (e.g., `LabelboxS3Access`) and click **Create role**.
5. Click on the role you just created and copy the **Role ARN** from the **Summary** tab.

## Step 3: Complete the integration setup in Labelbox

Now, you will add the Role ARN to the new integration you added in Labelbox in Step 1.

1. Go back to the **Create AWS integration** page in Labelbox.
2. In the **Provider ARN and name** section:
   * Set the **integration name**.
   * Enter the **AWS bucket name**.
   * Paste the **AWS Role ARN**.
3. Click **Save integration**.

## Step 4: Validate the integration

After you complete the setup in Labelbox, the system will automatically run a validation check on the integration. You can check the status on the **Integrations > Manage integrations** page. If the integration fails, you can click the refresh icon to view error messages and troubleshoot your setup. Here are possible error messages and our suggestions for troubleshooting your integration setup.

| Error | Troubleshooting |
| :---- | :-------------- |
| *Role cannot be assumed* | Ensure that the integration’s role ARN is correct and that the Labelbox External ID is properly configured in your AWS account. Additionally, your AWS account admin must [activate STS](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_enable-regions.html) in the `us-east-2` region using the IAM console. |
| *External ID configured insecurely* | Ensure that the Labelbox External ID is properly configured in your AWS account. |

## Step 5: Create, upload, and validate the dataset

Finally, you need to create and validate your dataset.

1. When creating your import file, use virtual-hosted-style URLs that follow this format: `https://<bucket-name>.s3.<region>.amazonaws.com/<object-key>`. To learn how to format your import file, visit these guides:
2. If you created your integration and imported your dataset using the Labelbox UI, Labelbox automatically runs validation checks to determine whether the CORS setup was configured properly. It also checks if Labelbox can successfully fetch data from your S3 bucket and properly sign the URLs.

Your dataset should now be set up with IAM delegated access. Labelbox will use the AWS role you created to generate temporary signed URLs every time it accesses data in your S3 bucket.

# Import errors & warnings

Source: https://docs.labelbox.com/docs/import-errors-and-warnings

Learn about the warnings and errors you may encounter when importing a dataset to Labelbox.

This guide explains the processing issues you may encounter when importing data and how to resolve them. When data is ingested, Labelbox processes it to generate embeddings, extract media attributes, and standardize formats. If an issue occurs, it will be flagged as an error or a warning.

## Processing states

* **Processing**: The initial state when a data row is ingested.
* **Success**: The data row was processed successfully.
* **Failure**: The data row could not be processed due to an error or warning. It will not appear in the Catalog until the issue is fixed.

## View and fix failed data rows

1. **Find failures**: In Catalog, an issue icon will appear next to datasets with processing issues.
Click this icon to go to the "Processing Issues" view.
2. **Filter issues**: You can filter the view to see data rows with specific errors or warnings.
3. **Fix the source issue**: Use the error descriptions below to diagnose the problem (e.g., fix a broken URL, correct cloud storage permissions).
4. **Re-process**: Once you've fixed the underlying issue, select the failed data rows in the "Processing Issues" view and click **Re-process**.

### Large files

Large files, such as TIFF, video, and audio files, can take a few seconds to process.

## Common errors (failures)

An error is a serious issue that prevents a data row from being used.

| Error | Description & action |
| ----- | -------------------- |
| ConversionFailed | The file could not be converted to a format required for labeling. **Action:** Ensure the file is not corrupt and meets format requirements. |
| FetchFailed | The data row URL could not be reached. **Action:** Check that the URL is correct and that the Labelbox backend has access. |
| FetchTimeout | The data row could not be fetched due to a timeout error. **Action:** Re-process the data rows. |
| Forbidden | Access to the URL was denied. **Action:** Check your cloud storage permissions or pre-signed URL validity. |
| InternalError | An internal error prevented the data row from being processed. **Action:** Re-process the data rows. |
| InvalidEspg | The data row has invalid EPSG metadata. |
| NotFound | The file does not exist at the specified URL. **Action:** Verify the URL is correct. |
| TooManyBands | The geospatial data row has too many bands. |
| TooManyRequests | The data row could not be fetched due to a rate-limiting error. |

## Warnings

A warning indicates a non-critical issue. The data row can be used, but your experience may be degraded.

| Warning | Description & action |
| ------- | -------------------- |
| InvalidCors | The file is served with invalid CORS headers, which may impact the labeling experience. **Action:** Check that the URL is being served with the correct CORS headers. |

**Processing issues and the Python SDK**

* If you export data rows from Catalog using the SDK, all data rows are exported, including those with processing issues.
* If you send data rows to a labeling project using the SDK, all data rows are sent to the labeling project, including those with processing issues.

# Import ground truth annotations into a project

Source: https://docs.labelbox.com/docs/import-ground-truth

Learn how to import your ground truth data from internal or third-party tools into Labelbox.

This guide will walk you through the process of importing your existing ground truth annotations from other internal or third-party tools into Labelbox.

## What is ground truth?

In machine learning, "ground truth" refers to the data that we consider to be the "true" or accurate labels for a given dataset. This data is often created by human annotators and is used to train and evaluate machine learning models. By importing your ground truth annotations into Labelbox, you can consolidate all of your labeled data in one place, creating a single source of truth for your machine learning projects. This is particularly useful when you are migrating from another labeling platform to Labelbox.

## How to import your ground truth annotations

You can generate Python SDK code snippets directly from your project's "Import labels" tab to help you with the import process. These snippets will serve as a great starting point for the steps below.

1. **Import your data rows**: Before you can import your annotations, you need to have the corresponding data rows (the raw data you want to label, such as images, videos, or text) in Labelbox. If you haven't already, you'll need to [create a dataset](/docs/datasets-datarows) and import your data rows into the Catalog.
2.
**Create or select an ontology**: An ontology (also known as a taxonomy) defines the set of classes and attributes that can be used to label your data. For example, an ontology for an image classification project might include classes like "cat," "dog," and "car." You'll need to either [create a new ontology](/docs/labelbox-ontology) or select an existing one that matches the annotations you want to import.
3. **Create a labeling project**: Your annotations need to be associated with a labeling project in Labelbox. If you don't already have a project set up with the correct ontology, you will need to [create a project](/docs/create-a-project).
4. **Send a batch of data rows to your project**: Now that your project and ontology are configured, you need to [send a batch of data rows](/docs/batches) to the project's labeling queue. This will be the specific set of data rows that you will be attaching your imported annotations to.
5. **Create the annotations payload**: This is where you'll prepare the annotations themselves for import. Each annotation in your payload needs to reference a specific annotation class from your ontology and a specific data row ID. Labelbox supports two formats for the annotation payload: **Python Annotation Types** and **NDJSON**. Go to the **Import labels** tab and select **Import ground truth**. There you will find a sample code snippet you can use to import your ground truth annotations.
6. **Import your annotation payload**: Once you have created your annotation payload, the final step is to submit the import job. After the job is complete, your ground truth annotations will appear in your project, and the corresponding data rows will be marked as "Done." Under **Import labels** → **Import ground truth**, copy the code snippet for **Import annotations payload**.

Check out these end-to-end developer guides to learn how to import ground truth annotations based on your data type.
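To make the annotation payload step concrete, here is a minimal, network-free sketch of an NDJSON payload built with plain Python. The classification name `weather`, the answer `sunny`, and the data row ID are hypothetical placeholders — substitute names from your own ontology, and use the sample snippet in the **Import labels** tab as the authoritative schema for your data type.

```python
import json
import uuid

# Hypothetical placeholder: replace with a real data row ID from your dataset.
data_row_id = "example-data-row-id"

# One radio-classification annotation in NDJSON-style dict form.
# "weather"/"sunny" are assumed names; they must exist in your ontology.
annotation = {
    "uuid": str(uuid.uuid4()),        # unique ID; reusing it overrides the annotation
    "dataRow": {"id": data_row_id},   # the data row this annotation attaches to
    "name": "weather",                # feature name from the ontology
    "answer": {"name": "sunny"},      # the selected radio option
}

# NDJSON is simply one JSON object per line.
ndjson_payload = "\n".join(json.dumps(a) for a in [annotation])
print(ndjson_payload)
```

This payload (one JSON object per line) is what the import job submits; the generated snippet under **Import labels** → **Import ground truth** shows the upload call itself.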
## Best practices

* Before starting a new import job, ensure that there are no existing model-assisted labeling (MAL) annotations on the data rows. Duplicate import jobs can overwrite existing labels or cause unexpected behavior.
* The activity page in Labelbox will not show any changes until the entire import job is complete.

# Import annotations as pre-labels into a project

Source: https://docs.labelbox.com/docs/import-prelabels

Learn how to import your annotations as pre-labels into Labelbox.

This guide will walk you through the process of importing your annotations from other internal or third-party tools as pre-labels into Labelbox.

## What are pre-labels?

Pre-labels are suggested annotations that are programmatically imported into a Labelbox project. Often generated by a machine learning model, these annotations serve as a starting point to accelerate the labeling process. Instead of creating every annotation from scratch, human labelers are presented with these pre-populated suggestions in the labeling editor. The primary goal of using pre-labels is to boost labeling efficiency. A labeler's task shifts from manual creation to a faster workflow of reviewing, correcting, and confirming the suggested annotations. A pre-label is only converted into a finalized, ground truth annotation after a human labeler has reviewed and submitted the asset, ensuring that the final data maintains a high standard of quality.

## How to import annotations as pre-labels

You can reference Python SDK code snippets directly from the **Import labels** tab in your project dashboard. These snippets will serve as a great starting point for the steps below.

1. **Import your data rows**: To import annotations as pre-labels, you'll need to have a set of data rows to attach the annotations to. If you do not already have a set of data rows, you'll need to [create a dataset](/docs/datasets-datarows) by importing data rows into Catalog.
2.
**Create/select an ontology**: When you import a set of annotations, you'll need to specify the ontology (also called a taxonomy) that corresponds to the set of annotations. If the project ontology already exists in Labelbox, you may select the ontology that fits your annotations. If the ontology does not exist in Labelbox yet, you'll need to [create an ontology](/docs/labelbox-ontology).
3. **Create a labeling project**: Before you can import your annotations, you'll need to make sure you have a project to connect these annotations to. You cannot simply import annotations without specifying which project they'll be associated with. Oftentimes, you will already have a project set up with the correct ontology for your set of annotations. However, if you do not already have a project, you will need to [create a project](/docs/create-a-project) and attach the ontology that fits your annotations.
4. **Send a batch of data rows to the project**: Now that you have your project and ontology configured, you'll need to send a subset of data rows (i.e., a [batch](/docs/batches)) to the project's labeling queue. You will be attaching your annotations to this batch of data rows.
5. **Create the annotations payload**: This is where you'll prepare the annotations themselves for import. Each annotation in your payload needs to reference a specific annotation class from your ontology and a specific data row ID. Labelbox supports two formats for the annotation payload: **Python Annotation Types** and **NDJSON**. Go to the **Import labels** tab and select **Import pre-labels**. There you will find a sample code snippet you can use to import your annotations as pre-labels.
6. **Import your annotation payload**: Once you have created your annotation payload, the final step is to submit the import job.
After the job is complete, your annotations will appear as pre-labels in your project, ready for labelers to review; data rows are marked as "Done" only after a labeler reviews and submits them. Under **Import labels** → **Import pre-labels**, copy the code snippet for **Import annotations payload**.

Check out these end-to-end developer guides to learn how to import pre-labels based on your data type.

## Best practices

* Make sure the annotations are in the proper format. Use the charts above to determine whether an annotation type is supported for the data type you are labeling.
* Before you begin a new import job, make sure there are no existing MAL annotations on the data row.
* To override an imported annotation on a data row, import the annotation again using the same `uuid`.

# Improve model performance

Source: https://docs.labelbox.com/docs/improve-model-performance

An overview of strategies to improve model performance using Labelbox.

Achieving high model performance is an iterative process of training, analyzing, and refining. It's not just about training a model once, but about creating a continuous loop of improvement. This involves two key strategies:

* First, a deep dive into your model's results to understand where and why it's making mistakes, which includes identifying both model weaknesses and errors in your ground truth labels.
* Second, it involves being strategic about which data you choose to label next. By focusing your labeling efforts on the data that will provide the most value, you can improve your model's performance more efficiently.

## Find and fix your model's errors

Model error analysis is the process of investigating your model's incorrect predictions to understand the root cause of its failures. Models rarely fail randomly; they often struggle with specific, predictable patterns. By identifying these patterns, you can move beyond simple accuracy scores and take targeted actions.
This might involve sourcing more data that captures these edge cases, giving your model the specific examples it needs to learn and improve.

### How to find model errors

1. **Select a model run:** Go to the **Models** tab and select the model and specific model run you want to analyze.
2. **Filter for prediction disagreements:**
   * In the filter panel on the left, click **+ Add filter**.
   * Select the `iou` (Intersection over Union) metric.
   * Set the condition to **\<** (less than) a low value, such as **0.5**. This will isolate data rows where the model's predicted annotation has poor spatial overlap with the ground truth label.
3. **Sort by lowest confidence:**
   * At the top of the data row gallery, find the **Sort by** dropdown menu.
   * Select **Confidence** and choose the **Ascending** order (arrow pointing up). This brings the predictions your model is least certain about to the front.
4. **Inspect and identify patterns:**
   * Click on the data rows at the top of the sorted list to open the detailed view.
   * Visually inspect the image and the annotations. Look for common characteristics among the errors. For example, you may notice the model consistently fails on blurry images, in low-light conditions, or with objects at unusual angles.
5. **Use the metrics and projector views for high-level insights:**
   * Click on the **Metrics** tab within the model run to see a class-by-class breakdown of precision and recall. This can quickly point you to which object classes are causing the most problems.
   * Click on the **Projector** tab to see a 2D or 3D visualization of your data's embeddings. Look for areas where the color-coded clusters for different classes overlap, as this indicates the model is "confusing" them.

### How to fix model errors

1. **Select the relevant data rows:**
   * While in the model run's gallery view, hover over the data rows that represent a failure pattern and click the checkbox in the top-left corner of each thumbnail.
2.
**Find visually similar data:**
   * Once you have a few examples selected, a menu will appear at the bottom of the screen.
   * Click the **Find similar** button (which looks like a magic wand). This will take you to **Catalog** and automatically search for other data in your dataset that is visually similar to your selection.
3. **Create a batch and send for labeling:**
   * From the "similar data" search results in Catalog, select the unlabeled data rows you want to add to your training set.
   * Click the **+ Add to batch** button at the bottom of the screen.
   * Give the batch a descriptive name (e.g., "Low-light failure cases").
   * Navigate to your labeling project, select this batch, and assign it to your labelers.

## Find and fix your labeling mistakes

The quality of your training data is the single most important factor in your model's performance. Labeling mistakes can silently degrade performance, but your model itself can be a powerful tool for finding them. When a high-confidence model prediction contradicts a ground truth label, it's a strong signal that the human labeler may have made a mistake.

### How to find labeling mistakes

1. **Filter for prediction disagreements:** Just as before, navigate to your model run and add a filter for `iou < 0.5` to find where predictions and ground truth labels do not align.
2. **Sort by highest confidence:**
   * In the **Sort by** dropdown menu, select **Confidence**.
   * This time, choose **Descending** order (arrow pointing down).
   * This surfaces data rows where the model was very confident (`> 0.9`) but its prediction still disagreed with the ground truth—a strong indicator of a potential labeling error.
3. **Inspect the data:** Click on the top results to inspect them. If the model's prediction looks correct and the ground truth label is clearly wrong, you've found a labeling mistake.
4. **Use the projector view to spot outliers:**
   * Navigate to the **Projector** tab.
* In the display options, choose to **Color by: Feature schema** (your ground truth classes).
   * Look for "islands" of a single color within a larger cluster of a different color. For example, a single blue dot (labeled 'car') in the middle of a large red cluster (labeled 'truck') is very likely a mislabeled data point. Click on the dot to view the data row directly.

### How to fix labeling mistakes

1. **Select data rows with confirmed mistakes**: In the gallery view of your model run, use the checkboxes to select all the data rows that you have confirmed contain labeling errors.
2. **Open the selection in Catalog**: In the menu at the bottom of the screen, click **Open in Catalog**.
3. **Tag for re-labeling**:
   * With the items still selected in Catalog, go to the **Metadata** panel on the right.
   * Click the **+** icon to add a new tag.
   * Create and apply a tag named `re-label` or `rework`.
4. **Filter and send to the rework step in your project**:
   * Go to your labeling project and navigate to the **Data Rows** tab.
   * Create a filter for `Metadata`, and select the `re-label` tag you just created.
   * Select all the filtered data rows.
   * Click the **Send to Rework** button. This will put the data back into the labeling queue for correction.

## Prioritize high-value data to label (active learning)

Labeling data is often the most expensive and time-consuming part of any machine learning project. Active learning is a strategy that helps you maximize your return on investment by intelligently selecting which data to label next. Instead of labeling data at random, you focus your efforts on the examples that your model finds most confusing. The intuition is simple: the model learns more from data it is uncertain about than from data it can already predict with high confidence.

### How to perform uncertainty sampling

1.
**Generate predictions on unlabeled data**: Use your currently trained model to generate predictions and confidence scores on a large pool of your unlabeled data.
2. **Upload predictions to a new model run**:
   * Go to the **Models** tab and select your model.
   * Click **+ New model run**.
   * Follow the instructions to upload your unlabeled data rows along with the corresponding model predictions and confidence scores in the specified format.
3. **Sort by lowest confidence**:
   * Once the model run is created, open it.
   * Go to the **Sort by** dropdown menu at the top of the gallery.
   * Select **Confidence** and choose **Ascending** order. This will surface the data your model is most uncertain about.
4. **Select and batch the most uncertain data**:
   * Select the top `N` data rows from the sorted list by using the checkboxes. This `N` is your budget for how much new data you want to label.
   * From the menu at the bottom, click **+ Add to batch**.
   * Give the batch a descriptive name, like "Uncertainty sampling batch 1".
5. **Assign the batch for labeling**:
   * Navigate to your labeling project.
   * Find the new batch you just created, select it, and assign it to your labelers.
6. **Retrain your model**: After this high-value data has been labeled, add it to your training dataset and retrain your model. This targeted approach should yield a more significant performance improvement than randomly sampling data.

# Labeling instructions

Source: https://docs.labelbox.com/docs/instructions-and-quizzes

Your instructions are the single most important resource for your labeling team. This is where you move beyond simple definitions and provide the context, examples, and rules needed to handle real-world complexity. Well-crafted instructions are the first line of defense against inconsistent or inaccurate annotations.

## Create labeling instructions

To add labeling instructions:

1. Go to your project overview.
2. Click on **Labeling instructions**.
3.
In the **Instructions** section, you may write the instructions directly in the text box, upload a PDF or HTML file, or add a video link.

What to include in your instructions:

* **Detailed Definitions:** Go beyond the feature name. What exactly constitutes a "partially occluded vehicle" versus a "fully visible vehicle"?
* **Visual Examples (The Good and The Bad):** Show clear examples of correctly labeled data. Just as importantly, show examples of common mistakes or edge cases that should be labeled differently or ignored.
* **Edge Case Guidance:** Your data will never be perfect. Provide rules for how to handle blurry images, rare objects, or situations where multiple interpretations are possible.
* **"When in Doubt" Rules:** Give your team a clear default action to take when they are unsure, such as flagging the asset for review.

Use the following template as a reference (or download this [PDF file](https://storage.googleapis.com/labelbox-sample-datasets/Docs/Labelbox_labeling_instructions_template_v4.pdf)).