Google Cloud Storage
Learn how to import your GCS bucket data to Labelbox via IAM delegated access.
When you use IAM delegated access to add your unlabeled data to Labelbox, you can keep your assets in GCS and grant Labelbox read-only access to your Google Cloud buckets.
Part 1: Create GCP Integration in Labelbox
First, create a new integration in the Labelbox UI.
- Navigate to the Integrations tab.
- Click New integration.
- Specify your GCS bucket name and click Save integration.
- Copy the Service Account Email ID and click Finish setup.
Part 2: Configure GCP bucket IAM permissions
Next, you will need to configure the settings in your GCP account for this integration.
Spaces in filenames can be problematic
For optimal performance, eliminate all spaces in filenames before uploading them to your Google Cloud bucket.
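As a minimal sketch of the cleanup suggested above, the snippet below plans renames for local files before they are uploaded. The helper names `remove_spaces` and `plan_renames`, the underscore replacement, and the sample filenames are all hypothetical choices for illustration, not part of Labelbox or Google Cloud tooling.

```python
def remove_spaces(filename: str, replacement: str = "_") -> str:
    """Return the filename with every space replaced (hypothetical helper)."""
    return filename.replace(" ", replacement)


def plan_renames(filenames):
    """Map each filename that contains a space to its space-free name."""
    return {name: remove_spaces(name) for name in filenames if " " in name}


# Sample filenames are made up for illustration.
renames = plan_renames(["cat 01.png", "dog.png", "my scan 2.pdf"])
for old_name, new_name in renames.items():
    print(f"{old_name} -> {new_name}")
```

The sketch only computes the new names; applying them (for example with `os.rename`) is left to your own upload workflow.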
- In your GCP account, navigate to your GCS bucket permissions.
- Click Add permissions.
- Paste the Service Account Email ID you copied in Part 1.
- Select the Storage Object Viewer predefined role, which grants read access to the bucket. Learn more here. Use only the current predefined roles; a common mistake is selecting a legacy role with a similar name.
- Click Save to finalize the permission settings.
Note
Only one bucket is supported per integration.
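To make the permission change concrete, here is a rough sketch of what the bucket's IAM policy looks like once the role is granted. It works on a plain dictionary in the shape of a Cloud Storage IAM policy and does not call any Google API. The role ID `roles/storage.objectViewer` corresponds to Storage Object Viewer, and legacy roles share the `roles/storage.legacy` prefix. The function names and the service account email are hypothetical placeholders.

```python
import copy

STORAGE_OBJECT_VIEWER = "roles/storage.objectViewer"
LEGACY_ROLE_PREFIX = "roles/storage.legacy"  # e.g. roles/storage.legacyObjectReader


def add_viewer_binding(policy: dict, service_account_email: str) -> dict:
    """Return a copy of the policy that grants Storage Object Viewer to the account."""
    member = f"serviceAccount:{service_account_email}"
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == STORAGE_OBJECT_VIEWER:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    bindings.append({"role": STORAGE_OBJECT_VIEWER, "members": [member]})
    return updated


def legacy_roles_for(policy: dict, service_account_email: str) -> list:
    """List legacy roles granted to the account, to catch the common mistake."""
    member = f"serviceAccount:{service_account_email}"
    return [
        b["role"]
        for b in policy.get("bindings", [])
        if b["role"].startswith(LEGACY_ROLE_PREFIX) and member in b["members"]
    ]


# Placeholder email; use the Service Account Email ID copied in Part 1.
email = "labelbox-integration@example.iam.gserviceaccount.com"
policy = add_viewer_binding({"bindings": []}, email)
```

Checking `legacy_roles_for(policy, email)` after editing permissions is one way to confirm that no similarly named legacy role was selected by accident.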
Part 3: Configure CORS
Follow these instructions to set up the appropriate CORS for the Google Cloud Storage bucket.
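For orientation, the sketch below generates a CORS configuration in the JSON shape that Cloud Storage accepts (a list of rules with `origin`, `method`, `responseHeader`, and `maxAgeSeconds`). The origins, the `GET` method, the wildcard response header, and the max age are assumptions for illustration only; take the actual values from the linked instructions.

```python
import json

# Assumed origins for illustration; confirm them against the linked instructions.
ASSUMED_ORIGINS = ["https://app.labelbox.com", "https://editor.labelbox.com"]


def build_cors_config(origins, max_age_seconds: int = 3600) -> list:
    """Build a single-rule CORS configuration that allows read (GET) requests."""
    return [
        {
            "origin": list(origins),
            "method": ["GET"],
            "responseHeader": ["*"],  # assumed; narrow this if the instructions say so
            "maxAgeSeconds": max_age_seconds,
        }
    ]


cors_json = json.dumps(build_cors_config(ASSUMED_ORIGINS), indent=2)
```

If you save `cors_json` to a file such as `cors.json`, it can typically be applied with `gsutil cors set cors.json gs://<your-bucket>`.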
Part 4: Create & upload the dataset
Create and upload a JSON file containing sample data to Labelbox. Click through the links below to learn how to format your import file.
Data type | Supported |
---|---|
Images | Import specifications |
Video | Import specifications |
Text | Import specifications |
Tiled imagery (Slippy maps) | NOT SUPPORTED |
Tiled imagery (COG, NITF, GeoTIFF) | Import specifications |
Audio | Import specifications |
Document | Import specifications |
Conversation | Import specifications |
To upload a dataset via the UI, follow the steps in Create a dataset.
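As a rough illustration of an import file, the sketch below builds a JSON array with one entry per asset, each pointing at a `gs://` URI in the bucket. The field names `row_data` and `global_key` are assumptions about the import format and should be checked against the import specifications for your data type. The helper name is hypothetical, and the bucket and key come from the example URI on this page.

```python
import json


def build_import_rows(bucket: str, keys) -> list:
    """Build one import entry per object key (field names are assumed)."""
    return [
        {
            "row_data": f"gs://{bucket}/{key}",  # gsutil URI of the asset
            "global_key": key,                   # assumed unique identifier field
        }
        for key in keys
    ]


rows = build_import_rows("gcs-lb-demo-bucket", ["test.png"])
import_json = json.dumps(rows, indent=2)

# Write the file that will be uploaded; the filename is arbitrary.
with open("import.json", "w", encoding="utf-8") as handle:
    handle.write(import_json)
```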
Part 5: Validate the integration
Finally, use the Labelbox UI to validate that the integration was set up properly.
- Ensure that the Google integration you just created is set as the default or specified during dataset creation.
- Create a project, attach the dataset, and open the dataset in the Labelbox Editor to validate that the integration is functioning properly.
If a dataset is signed by a GCP IAM integration, Labelbox will attempt to sign all data rows with this integration. The value of rowData for each Data Row will be updated as follows:
https://storage.googleapis.com/${bucket}/${key}?{queryParams}
The queryParams contain signing information.
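The mapping from a `gs://` URI to the bucket and key in that URL pattern can be sketched as follows. This only derives the unsigned base URL; the signing query parameters are generated by Labelbox and are not reproduced here. The function name is hypothetical, and the sketch assumes keys that need no URL encoding.

```python
def unsigned_https_url(gs_uri: str) -> str:
    """Convert gs://bucket/key into the https://storage.googleapis.com form."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket, _, key = gs_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"URI must include a bucket and a key: {gs_uri!r}")
    return f"https://storage.googleapis.com/{bucket}/{key}"


base_url = unsigned_https_url("gs://gcs-lb-demo-bucket/test.png")
```

Comparing this base URL with the host and path of the rowData value shown in Labelbox is a quick way to confirm that a data row points at the object you expect.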
Only gsutil URIs are supported
Please ensure that you are using gsutil URIs during data import (JSON file or Python SDK). Example gsutil URI: gs://gcs-lb-demo-bucket/test.png
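Before importing, a simple check like the one below can flag entries that do not use a gsutil URI. It is a minimal sketch: the `row_data` field name is an assumption about the import file, the helper name is hypothetical, and the pattern only checks for the `gs://bucket/key` shape rather than full bucket naming rules.

```python
import re

# Shape check only: gs://<bucket>/<key>, with a non-empty bucket and key.
GS_URI_PATTERN = re.compile(r"^gs://[^/]+/.+$")


def find_non_gs_rows(rows, field: str = "row_data") -> list:
    """Return the indexes of rows whose URI field is not a gs:// URI."""
    return [
        index
        for index, row in enumerate(rows)
        if not GS_URI_PATTERN.match(str(row.get(field, "")))
    ]


sample_rows = [
    {"row_data": "gs://gcs-lb-demo-bucket/test.png"},
    {"row_data": "https://storage.googleapis.com/gcs-lb-demo-bucket/test.png"},
]
invalid_indexes = find_non_gs_rows(sample_rows)
```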