To start labeling your data, you first need to grant our platform secure access to the files stored in your private cloud (AWS, GCP, or Azure). This guide explains the two methods for connecting your data, helping you choose the best one for your project’s security and workflow needs. The two methods are:
IAM Delegated Access: A robust, long-term connection method.
Signed URLs: A flexible method using temporary, secure links to your data.
Our Recommendation: For most use cases, especially long-term projects, we recommend IAM Delegated Access for its superior security and lower maintenance.
| Feature | IAM Delegated Access | Signed URLs |
| --- | --- | --- |
| Setup complexity | High. Requires a one-time configuration within your cloud provider’s IAM console. | Low. No initial cloud configuration is needed in Labelbox. |
| Maintenance | Low. “Set it and forget it.” Works for all data in the configured location. | High. Requires a continuously running service on your end to generate new URLs. |
| Data freshness | Real-time. New data added to your bucket is immediately available for labeling. | Delayed. New data requires new signed URLs to be generated and uploaded to Labelbox. |
| Ideal for | Long-term projects, enterprise-scale data operations, and stringent security environments. | Quick-start projects, proof-of-concepts, or when you cannot create IAM roles. |
IAM (Identity and Access Management) Delegated Access is the most secure and scalable method for connecting your data. You create a trust relationship by setting up a dedicated role within your own cloud account that Labelbox is permitted to assume. This gives Labelbox temporary, read-only credentials to access your data when your users are labeling.
You: Create an IAM role in your AWS, GCP, or Azure account that has read-only permissions to your data bucket.
You: Provide Labelbox with the unique identifier (ARN/ID) of that role.
Labelbox: When a user needs to view an image or document, Labelbox uses the provided identifier to request temporary access credentials from your cloud provider.
Your Cloud Provider: Validates the request and grants Labelbox a short-lived token to access only the specified data.
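To make the trust relationship in the steps above more concrete, here is a minimal sketch of what the two policy documents for such a role might look like on AWS. It only builds the JSON; it does not call any cloud API. The account ID, external ID, and bucket name are hypothetical placeholders, and the use of an external ID condition is an assumption about a typical cross-account setup, so use the values and instructions shown in your own integration settings rather than the ones here.

```python
import json

# Hypothetical placeholders (assumptions, not real Labelbox values).
TRUSTED_ACCOUNT_ID = "123456789012"   # account allowed to assume the role
EXTERNAL_ID = "example-external-id"   # shared identifier checked on AssumeRole
BUCKET_NAME = "my-labeling-data"      # bucket that holds the files to label


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Trust policy: states WHO may assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def build_read_only_policy(bucket: str) -> dict:
    """Permission policy: states WHAT the role may do (read objects only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(TRUSTED_ACCOUNT_ID, EXTERNAL_ID), indent=2))
    print(json.dumps(build_read_only_policy(BUCKET_NAME), indent=2))
```

Keeping the trust policy (who) separate from the permission policy (what) is what lets you grant read-only access to a single bucket without exposing anything else in the account. GCP and Azure use different mechanisms and document formats for the same idea.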
A signed URL is a web link that provides temporary access to a specific file in your storage bucket. Each URL is “signed” with a cryptographic key that validates the request, and the URL expires after a set time (e.g., 7 days). You are responsible for generating these URLs and providing them to Labelbox.
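The following is a simplified, conceptual sketch of how signing and expiry work together. It uses a plain HMAC over the URL and an expiry timestamp, which is not the actual signing scheme of AWS, GCP, or Azure; in practice you would generate signed URLs with your cloud provider's own SDK. The secret key, the query parameter names, and the example host are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical key, for illustration only


def _signature(base_url: str, expires_at: int) -> str:
    """HMAC-SHA256 over the URL and its expiry time."""
    message = f"{base_url}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, expires_in: int, now: Optional[int] = None) -> str:
    """Return base_url with an expiry timestamp and signature appended."""
    now = int(time.time()) if now is None else now
    expires_at = now + expires_in
    query = urlencode({"expires": expires_at, "signature": _signature(base_url, expires_at)})
    return f"{base_url}?{query}"


def verify_url(signed_url: str, now: Optional[int] = None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(base_url, expires_at)
    return hmac.compare_digest(signature, expected) and now < expires_at


if __name__ == "__main__":
    seven_days = 7 * 24 * 3600
    url = sign_url("https://storage.example.com/bucket/image.jpg", seven_days)
    print(url, verify_url(url))
```

Because the expiry time is part of the signed message, nobody can extend a link by editing the timestamp. It also means that once a link expires, a new one has to be generated and provided to Labelbox again.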
Labelbox provides a Python SDK to help automate data setup. You can download sample code from the app or use the online docs to learn more. To download samples from the app:
From the dataset default screen, select Use Python SDK.
From the Create data rows prompt, select the tab appropriate for your data type.
Use the Copy button to copy the code to the Clipboard or the Download button to save it locally.
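For orientation, a script of this kind might look roughly like the sketch below. It is not the sample from the app: the helper names `build_data_rows` and `upload_data_rows`, the choice of the URL path as the global key, and the example URL are assumptions made for illustration. The SDK calls reflect the Labelbox Python SDK as commonly documented, but check them against the sample you copied, since the SDK changes between versions.

```python
from typing import Dict, List
from urllib.parse import urlparse


def build_data_rows(urls: List[str]) -> List[Dict[str, str]]:
    """Turn file URLs into data row payloads.

    The global key here is the URL path without the query string, so it
    stays stable even when a signed URL is regenerated (an assumption made
    for this sketch; any unique string works).
    """
    return [
        {"row_data": url, "global_key": urlparse(url).path.lstrip("/")}
        for url in urls
    ]


def upload_data_rows(api_key: str, dataset_name: str, rows: List[Dict[str, str]]):
    """Create a dataset and attach the data rows to it.

    Requires the SDK to be installed (`pip install labelbox`); it is
    imported here so the payload helper above works without it.
    """
    import labelbox as lb

    client = lb.Client(api_key=api_key)
    dataset = client.create_dataset(name=dataset_name)
    task = dataset.create_data_rows(rows)
    task.wait_till_done()
    if task.errors:
        raise RuntimeError(f"Data row upload failed: {task.errors}")
    return dataset


if __name__ == "__main__":
    # Hypothetical URL; replace with links to files in your own bucket.
    rows = build_data_rows(["https://storage.example.com/bucket/images/cat.jpg"])
    print(rows)
    # upload_data_rows("<YOUR_API_KEY>", "my-dataset", rows)
```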
Once you have a copy of the sample script, you can customize it for your needs. More information is available in the online docs for each supported data type.
To upload local files directly to the Labelbox platform, go to Catalog, click +New, then select Choose files to upload.
Direct upload not recommended
Uploading your files to Labelbox is NOT recommended. We recommend using IAM Delegated Access or Signed URLs instead (see sections above).