Amazon S3

Import your S3 bucket data via IAM Delegated Access.

When you use IAM delegated access to add your unlabeled data to Labelbox, you can keep your assets in AWS and configure Identity and Access Management (IAM) roles and policies to grant Labelbox read-only access to your S3 buckets.

Part 1: Open new integration in Labelbox

Start by logging into Labelbox and creating a new IAM Integration.

Log into Labelbox, go to Account > Integrations, and click New Integration. Copy the Labelbox account ID and external ID. Leave this open as you will come back to it later.

Part 2: Create a role for Labelbox in AWS

Next, create a role for Labelbox in your AWS account, specify its permissions, and select a bucket. Follow the steps below to set this up in your AWS account.

  1. Go to your AWS account and set up CORS for your bucket (CORS allows Labelbox to request resources from your cloud storage). See Create CORS headers to learn how to set up CORS for your bucket.

  2. In your AWS account, create a permission policy for your bucket. If you already have a permission policy you plan to use, skip ahead to the step that begins "From the Roles page". Otherwise, in your IAM Management Console, go to the Policies section, click Create policy, and enter your policy in the JSON tab. The sample policy below grants read-only access to the objects in a single S3 bucket.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::CustomerBucketARN/*"
        }
    ]
}

| Element | Description |
| --- | --- |
| Effect | Specifies that the elements included in the statement are allowed. |
| Action | Describes the specific action(s) that will be allowed. Setting this to s3:GetObject gives Labelbox read-only access to the bucket you specify. See IAM JSON policy elements: Action to learn more. |
| Resource | Specifies the object(s) that the statement covers. This is where you specify your Bucket ARN. To find your Bucket ARN, go to your S3 console, select the bucket from the list, go to the Properties tab, and copy the Amazon Resource Name (ARN). The * at the end of the example ARN above is a wildcard character. See IAM JSON policy elements: Resource to learn more. |
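If you maintain policies for several buckets, it can help to generate the policy document instead of editing it by hand. The following is a minimal sketch using only the Python standard library. The function name build_read_only_policy and the bucket name my-labelbox-assets are hypothetical placeholders, and the output simply mirrors the sample policy above.

```python
import json


def build_read_only_policy(bucket_name: str) -> dict:
    """Return a policy document granting s3:GetObject on every object in one bucket."""
    if not bucket_name or "/" in bucket_name:
        raise ValueError("expected a bare bucket name, not a path or ARN")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # The trailing /* is a wildcard matching every object key in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "my-labelbox-assets" is a placeholder; substitute your own bucket name.
    print(json.dumps(build_read_only_policy("my-labelbox-assets"), indent=4))
```

Paste the printed JSON into the JSON tab of the Create policy screen.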

  3. Click Next: Review to bypass the optional Add tags step. Tags are not required to set up this integration.

  4. In the Review policy step, name the policy you just created. We recommend naming it something like LabelboxReadAccess.

  5. To approve, click Create policy.

  6. From the Roles page, follow these steps:

    a. Click Create role.

    b. Select Another AWS account.

    c. Paste the Labelbox Account ID from Part 1.

    d. Check the box for Require external ID.

    e. Paste the Labelbox External ID from Part 1.

    f. Do not check the box for Require MFA.

    g. Click Next: Permissions.

  7. In the Attach permissions policies section, check the box next to the permission policy you created to attach it to your role. Alternatively, select an existing policy from the list provided (e.g., AmazonS3ReadOnlyAccess).

  8. Click Next: Tags.

  9. Click Next: Review to bypass the optional Add tags step. Tags are not required to set up this integration.

  10. Name the role you created for Labelbox. We recommend naming it something like LabelboxS3Access.

  11. When you are done reviewing, click Create role.

  12. Click on the role you just created and copy the Role ARN at the top of the Summary tab.
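Selecting Another AWS account and checking Require external ID causes AWS to attach a trust policy to the new role. As a rough sketch of what that trust policy looks like, the Python below builds the equivalent JSON. The function name and the sample IDs are hypothetical; the real values are the Labelbox account ID and external ID you copied in Part 1.

```python
import json


def build_trust_policy(labelbox_account_id: str, external_id: str) -> dict:
    """Return a trust policy letting one AWS account assume the role with an external ID."""
    # AWS account IDs are 12-digit numbers.
    if not (labelbox_account_id.isdigit() and len(labelbox_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    if not external_id:
        raise ValueError("external ID must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume this role.
                "Principal": {"AWS": f"arn:aws:iam::{labelbox_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present the matching external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Both values are placeholders, not real Labelbox identifiers.
    print(json.dumps(build_trust_policy("123456789012", "example-external-id"), indent=4))
```

You can compare this output with the Trust relationships tab of the role to confirm that the external ID condition is present.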

Part 3: Complete integration setup in Labelbox

Add the Role ARN to the new integration you opened in Labelbox in Part 1.

Go back to the Integrations tab in Labelbox and paste the AWS Role ARN in the provided field. Then, name the integration.

Part 4: Validate the integration

Next, make sure the integration was set up correctly.

If you completed Parts 1 and 3 via the Labelbox UI, Labelbox automatically runs a validation check on the integration setup for you. To see the result, go to the Integrations tab and check the Last checked column, which indicates whether the integration was successful. If the integration failed, click the refresh icon to view the error messages.

Here are the possible error messages and our suggestions for troubleshooting your integration setup.

| Error | Troubleshooting |
| --- | --- |
| Role cannot be assumed | Ensure that the integration’s role ARN is correct and that the Labelbox External ID is properly configured in your AWS account. |
| External ID configured insecurely | Ensure that the Labelbox External ID is properly configured in your AWS account. |
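A common cause of the "Role cannot be assumed" error is pasting the wrong kind of ARN, such as a bucket ARN instead of a role ARN. The sketch below is a quick local sanity check, not part of Labelbox or AWS tooling. The function name is hypothetical and the pattern is a deliberate simplification of the IAM role ARN format, so treat a passing result as "plausibly shaped", not "valid".

```python
import re

# Simplified pattern: partition, empty region, 12-digit account ID,
# then "role/" followed by an optional path and the role name.
_ROLE_ARN = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+", re.ASCII)


def looks_like_role_arn(value: str) -> bool:
    """Return True if the string has the general shape of an IAM role ARN."""
    return _ROLE_ARN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for candidate in (
        "arn:aws:iam::123456789012:role/LabelboxS3Access",
        "arn:aws:s3:::my-labelbox-assets",
    ):
        print(candidate, looks_like_role_arn(candidate))
```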

Part 5: Upload data

Delegated Access for AWS supports “virtual-hosted-style” URLs; they follow this format:

https://<bucket-name>.s3.<region>.amazonaws.com/<key>

See the Labelbox SDK recipe to learn how to use the SDK to import data for labeling.
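To illustrate the URL format above, here is a small sketch that builds virtual-hosted-style URLs from a bucket name, region, and object keys, and wraps them in a list of records ready for import. The helper names are hypothetical, the field names row_data and global_key are assumptions about the import payload that you should confirm against the SDK documentation, and percent-encoding the key is likewise an assumption about how special characters should be handled.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Build https://<bucket-name>.s3.<region>.amazonaws.com/<key>."""
    # Percent-encode the key but keep "/" so the folder structure is preserved.
    encoded_key = quote(key.lstrip("/"), safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"


def build_data_rows(bucket: str, region: str, keys: list[str]) -> list[dict]:
    """Return one record per object key (field names are assumed, see above)."""
    return [
        {"row_data": virtual_hosted_url(bucket, region, key), "global_key": key}
        for key in keys
    ]


if __name__ == "__main__":
    # Bucket, region, and keys are placeholders.
    for row in build_data_rows("my-labelbox-assets", "us-east-1", ["images/cat 1.jpg"]):
        print(row)
```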

Part 6: Validate the dataset

Last, you will need to make sure your dataset was configured correctly.

If you created your integration and imported your dataset via the Labelbox UI, Labelbox automatically runs validation checks to determine whether CORS was configured properly, whether Labelbox can fetch data from your S3 bucket, and whether Labelbox can sign the URLs.

Your dataset should now be set up with IAM Delegated Access. Labelbox will use the AWS role you created to generate temporary signed URLs every time it accesses data in your S3 bucket.

