IAM integration

See the IAM delegated access Integrations Guide page for instructions on setting up cloud storage integrations. IAM delegated access integrations are the recommended option for all cloud users.

List your organization's IAM integrations

import labelbox as lb

client = lb.Client("<YOUR_API_KEY>")
organization = client.get_organization()
iam_integrations = organization.get_iam_integrations()

for integration in iam_integrations:
	print(integration)

Get the default IAM integration

default_integration = organization.get_default_iam_integration()

Set IAM integration during dataset creation

Each dataset can have only one IAM integration. Use the optional iam_integration parameter of client.create_dataset.

If you don't set it, the dataset uses your organization's default IAM integration.

You can then upload data rows using their cloud storage URLs.

iam_integration = organization.get_iam_integrations()[1]  # Select a specific integration (here, the second in the list)
dataset = client.create_dataset(name="IAM manual demo", iam_integration=iam_integration)
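Selecting an integration by list position is fragile if the order changes. A safer pattern is to look it up by name. The sketch below is self-contained for illustration: the SimpleNamespace objects stand in for the IAMIntegration objects returned by organization.get_iam_integrations(), and the names and IDs are made up.

```python
from types import SimpleNamespace

# Stand-ins for organization.get_iam_integrations(); names/IDs are illustrative.
integrations = [
    SimpleNamespace(name="My GCS integration", uid="gcs-123"),
    SimpleNamespace(name="My S3 integration", uid="s3-456"),
]

def find_integration(integrations, name):
    """Return the first integration with a matching name, or None."""
    return next((i for i in integrations if i.name == name), None)

s3 = find_integration(integrations, "My S3 integration")
print(s3.uid)  # prints s3-456
```

With real SDK objects, you would pass the result to client.create_dataset(..., iam_integration=...).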

Override default integration during dataset creation

You can override the organization default when creating a dataset. Passing iam_integration=None creates the dataset with no IAM integration at all.

dataset = client.create_dataset(name="IAM manual demo", iam_integration=None)

Update dataset integration

You can change a dataset's current integration with add_iam_integration, which accepts either an IAM integration ID or an IAMIntegration object.

# Get all IAM integrations
iam_integrations = client.get_organization().get_iam_integrations()

# Get the IAM integration ID
iam_integration_id = [integration.uid for integration
   in iam_integrations
   if integration.name == "My S3 integration"][0]

# Set the IAM integration by ID
dataset.add_iam_integration(iam_integration_id)


# Get the IAM integration object
iam_integration = [integration for integration
   in iam_integrations
   if integration.name == "My S3 integration"][0]

# Set the IAM integration from an IAMIntegration object
dataset.add_iam_integration(iam_integration)

Remove/Unselect dataset integration

dataset.remove_iam_integration()

Upload data rows with delegated access

Make sure the type of the IAM integration matches your data rows' cloud storage provider, then pass the storage URL as the row_data field when uploading data rows.

# Some examples: 
datarows = [{"row_data": "https://<bucket-name>.s3.<region>.amazonaws.com/<key>"}] # Amazon S3
datarows = [{"row_data": "gs://gcs-lb-demo-bucket/test.png"}] # Google Cloud Storage
datarows = [{"row_data": "https://labelboxdatasets.blob.core.windows.net/datasets/geospatial/001.jpg"}] # Microsoft Azure Blob Storage

task1 = dataset.create_data_rows(datarows)
task1.wait_till_done()
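To catch provider mismatches before uploading, you can infer which cloud provider a row_data URL points at from its scheme and host. This is a hypothetical helper, not part of the SDK, and the provider labels it returns are illustrative.

```python
from urllib.parse import urlparse

def guess_provider(url: str) -> str:
    """Guess the cloud storage provider from a row_data URL (illustrative labels)."""
    parsed = urlparse(url)
    if parsed.scheme == "gs":
        return "GCP"  # Google Cloud Storage gs:// URLs
    host = parsed.netloc
    if host.endswith("amazonaws.com"):
        return "AWS"  # Amazon S3 virtual-hosted URLs
    if host.endswith("blob.core.windows.net"):
        return "Azure"  # Azure Blob Storage URLs
    return "unknown"

print(guess_provider("gs://gcs-lb-demo-bucket/test.png"))  # prints GCP
```

You could run this over your row_data URLs and compare the result against the integration you selected before calling create_data_rows.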