See the IAM delegated access Integrations Guide page for instructions on setting up cloud storage integrations. IAM delegated access integrations are the recommended option for all cloud users.
List your organization's IAM integrations
from labelbox import Client
client = Client("<YOUR_API_KEY>")
organization = client.get_organization()
iam_integrations = organization.get_iam_integrations()
for integration in iam_integrations:
    print(integration)
print("Default IAM integration:", organization.get_default_iam_integration())
Set IAM integration for Datasets
Each dataset can have only one IAM integration. Use the optional iam_integration field of client.create_dataset to set it. If it is not set, the dataset uses your organization's default integration.
You can then upload data rows with the cloud storage URLs.
# Index 1 selects the second integration in the list; adjust it for your organization
iam_integration = organization.get_iam_integrations()[1]
dataset = client.create_dataset(name="IAM manual demo", iam_integration=iam_integration)
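Selecting an integration by list position, as with [1] above, depends on the order in which integrations are returned and fails when the organization has fewer than two. The following is a minimal sketch of selecting by name instead. It assumes each integration object exposes a name attribute; pick_integration and the name "my-gcs-integration" are hypothetical and not part of the SDK.

```python
from typing import Any, Iterable, Optional


def pick_integration(integrations: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first integration whose `name` matches, else None.

    Assumption: every object in `integrations` has a `name` attribute.
    Objects without one are skipped.
    """
    for integration in integrations:
        if getattr(integration, "name", None) == name:
            return integration
    return None
```

With the organization object from the listing above, this could be called as pick_integration(organization.get_iam_integrations(), "my-gcs-integration"). A None result should be handled explicitly rather than passed on to create_dataset, because iam_integration=None overrides the default integration.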
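Override the default integration
You can override the default integration when creating a dataset. Passing iam_integration=None creates the dataset without delegated access instead of falling back to the default integration.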
dataset = client.create_dataset(name="IAM manual demo", iam_integration=None)
Upload data rows with delegated access
Make sure the type of the IAM integration matches the cloud storage that hosts your data rows, then use the cloud storage URL as the row_data field when uploading data rows.
# Some examples (keep only the line that matches your integration; each assignment replaces the previous one):
datarows = [{"row_data": "https://<bucket-name>.s3.<region>.amazonaws.com/<key>"}] # Amazon S3
datarows = [{"row_data": "gs://gcs-lb-demo-bucket/test.png"}] # Google Cloud Storage
datarows = [{"row_data": "https://labelboxdatasets.blob.core.windows.net/datasets/geospatial/001.jpg"}] # Microsoft Azure Blob Storage
task1 = dataset.create_data_rows(datarows)
task1.wait_till_done()
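Because the integration type has to match the storage that hosts the files, it can help to check URLs before uploading. The sketch below guesses the provider from the three URL shapes shown above and builds the row_data payloads. The function names and the labels "AWS", "GCP", "AZURE", and "UNKNOWN" are illustrative assumptions, not SDK values, and the heuristics cover only those URL shapes.

```python
from urllib.parse import urlparse


def guess_storage_provider(url: str) -> str:
    """Guess the cloud provider from a URL's scheme and host name."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme == "gs":
        return "GCP"
    if host.endswith(".amazonaws.com"):
        return "AWS"
    if host.endswith(".blob.core.windows.net"):
        return "AZURE"
    return "UNKNOWN"


def build_data_rows(urls, expected_provider):
    """Build row_data payloads, rejecting URLs from another provider."""
    mismatched = [u for u in urls if guess_storage_provider(u) != expected_provider]
    if mismatched:
        raise ValueError(f"URLs do not match {expected_provider}: {mismatched}")
    return [{"row_data": u} for u in urls]
```

The list returned by build_data_rows has the same shape as datarows above and could be passed to dataset.create_data_rows in its place.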
Select an integration in the SDK
When creating a dataset via the SDK, the create_dataset method has an optional iam_integration parameter that specifies the desired integration. Sample code for viewing and selecting integrations, and for creating a dataset with this parameter, is shown in the Common SDK methods section below.
If no argument is provided for the optional iam_integration parameter of the create_dataset method, the default integration is used automatically.
Common SDK methods
#!pip install labelbox
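from labelbox import Client
client = Client("<YOUR_API_KEY>")
organization = client.get_organization()
# List all IAM integrations and the organization's default
print(organization.get_iam_integrations())
print(organization.get_default_iam_integration())
# Select an integration and create a dataset that uses it
iam_integration = organization.get_iam_integrations()[1]
dataset = client.create_dataset(name="IAM manual demo", iam_integration=iam_integration)
# Upload data rows by cloud storage URL
datarows = [{"row_data": "gs://gcs-lb-demo-bucket/test.png"}]
task1 = dataset.create_data_rows(datarows)
task1.wait_till_done()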
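Once wait_till_done() returns, it is worth confirming that the upload succeeded before relying on the data rows. The helper below is a minimal sketch of that check. It assumes the task object exposes status and errors attributes and that a failed task reports the status string "FAILED"; summarize_task is a hypothetical name and is not part of the SDK.

```python
def summarize_task(task):
    """Return (ok, message) for a finished upload task.

    Assumptions: `task` has `status` and `errors` attributes, and a
    failed task has status "FAILED". Missing attributes count as None.
    """
    status = getattr(task, "status", None)
    errors = getattr(task, "errors", None)
    if errors or status == "FAILED":
        return False, f"status={status}, errors={errors}"
    return True, f"status={status}"
```

For example, ok, message = summarize_task(task1) could be followed by raising an exception or logging the message when ok is False.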