Connecting GitLab as an AWS Identity Provider and Improving CI/CD and Security

In a previous blog post we documented our first attempt. Having learnt a bit more since then, we describe our latest setup in this updated post.

Configuring GitLab as an Identity Provider (IdP) for AWS

Our GitLab instance was previously running on a non-standard port, which prevented us from using it as an IdP for AWS. We have since moved to the standard HTTPS port and set up the IdP connection.

The process was pretty straightforward:

  • In AWS, open IAM > Identity Providers.
  • Select Add provider.
  • Select OpenID Connect as the Provider type.
  • Enter the GitLab instance root URL in Provider URL and Audience.

This didn’t work on the first attempt. I think we were presented with this error:

Identity provider was not added.
Could not connect to https://gitlab.domain.tld

Fortunately, the GitLab docs for configuring OpenID Connect in AWS document some common troubleshooting steps. A common problem arises when the GitLab instance’s certificate chain is not in the correct order.

After fixing the certificate ordering, the IdP was successfully added to AWS.
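
One way to inspect the chain order is to print the subject and issuer of each certificate in the order they appear; in a correctly ordered chain, each certificate's issuer matches the subject of the certificate that follows it. The sketch below assumes `openssl` is installed; the function name `print_chain` and the file name `chain.pem` are our own illustrative choices, not something taken from the GitLab docs.

```shell
# Print the subject and issuer of every certificate in a PEM bundle,
# in the order they appear in the file.
print_chain() {
  openssl crl2pkcs7 -nocrl -certfile "$1" \
    | openssl pkcs7 -print_certs -noout
}

# Hypothetical usage against a live server (replace GITLAB_FQDN):
#   openssl s_client -connect GITLAB_FQDN:443 -showcerts </dev/null \
#     | sed -n '/BEGIN CERTIFICATE/,/END CERTIFICATE/p' > chain.pem
#   print_chain chain.pem
```

The first entry printed should be the server certificate, followed by any intermediates.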

Configuring an IAM Role for GitLab to Assume

To make use of the connection between AWS and GitLab, we needed to create a new role. The role needs a trust relationship configured to allow GitLab to assume it.

We started with a minimal trust policy (no conditions yet, so any project on the GitLab instance can assume the role) to make sure things were working:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
            },
            "Action": "sts:AssumeRoleWithWebIdentity"
        }
    ]
}

We also attached the AWS managed permission policy AWSLambda_FullAccess to get started.

Be sure to replace the ACCOUNT_ID and GITLAB_FQDN placeholders.

Note: Try to follow a consistent naming convention for roles.

Configuring Terraform to Assume the Role

This took a bit of experimenting. Initially we tried to configure authentication in the Terraform files, but it turned out to be simpler to rely on environment variables.

So our Terraform config is just:

terraform {
  backend "http" {}
}

provider "aws" {}

The two required environment variables are:

  • AWS_ROLE_ARN - The ARN of the IAM role created earlier.
  • AWS_WEB_IDENTITY_TOKEN_FILE - The path to a file containing the identity token (JWT) generated by GitLab (via the id_tokens keyword in the job definition).

AWS_REGION is also defined globally.

The environment variables can be set in the .gitlab-ci.yml directly or as project/group/instance variables (or in a number of other ways).

We created an instance-level CI variable for AWS_WEB_IDENTITY_TOKEN_FILE:

[Figure: AWS_WEB_IDENTITY_TOKEN_FILE project CI variable settings]

We also tested setting it in the .gitlab-ci.yml with:

- echo "${OIDC_TOKEN}" > /.oidc_token
- export AWS_WEB_IDENTITY_TOKEN_FILE="/.oidc_token"

There may be a better way, but I believe the token needs to be written to a file first, since AWS_WEB_IDENTITY_TOKEN_FILE expects a path rather than the token itself.
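
A slightly more defensive variant of the same idea, as a sketch: write the token to a temporary file that only the current user can read, rather than to the filesystem root. The function name `write_oidc_token` is hypothetical; `OIDC_TOKEN` is the variable defined under id_tokens, and `mktemp` is assumed to be available in the job image.

```shell
# Write the GitLab-issued JWT to a private temp file and point the
# AWS CLI/SDK at it via AWS_WEB_IDENTITY_TOKEN_FILE.
write_oidc_token() {
  # mktemp creates the file with owner-only read/write permissions.
  token_file=$(mktemp) || return 1
  printf '%s' "${OIDC_TOKEN}" > "${token_file}"
  export AWS_WEB_IDENTITY_TOKEN_FILE="${token_file}"
}
```

In a job, the three lines of the function body can simply be inlined as script steps.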

The resulting .gitlab-ci.yml looks something like:

variables:
  AWS_REGION: eu-west-1
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME
  TF_ROOT: ""
  TF_STATE_NAME: default

terraform:
  image: "$CI_TEMPLATE_REGISTRY_HOST/gitlab-org/terraform-images/stable:latest"
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  script:
    - gitlab-terraform fmt
    - gitlab-terraform validate
    - gitlab-terraform plan
    - gitlab-terraform plan-json
    - gitlab-terraform apply
  artifacts:
    public: false
    paths:
      - ${TF_ROOT}/plan.cache
    reports:
      terraform: ${TF_ROOT}/plan.json

Be sure to replace the ACCOUNT_ID, ROLE_NAME, and GITLAB_FQDN placeholders.

Configuring the AWS CLI to Assume the Role

This works the same way as for Terraform: the AWS CLI will pick up the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables.

The resulting .gitlab-ci.yml looks something like:

variables:
  AWS_REGION: eu-west-1
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME

lambda-deploy:
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  script:
    - aws lambda update-function-code --function-name lf --zip-file fileb://lf.zip

Remember to replace the ACCOUNT_ID, ROLE_NAME, and GITLAB_FQDN placeholders.
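
Before running the real deployment, it can help to confirm that the web-identity inputs are actually present, since a missing variable otherwise surfaces as a less obvious credentials error. A minimal sketch; `check_web_identity_env` is a hypothetical helper of ours, not part of the AWS CLI.

```shell
# Fail early if the environment the AWS CLI relies on for
# AssumeRoleWithWebIdentity is incomplete.
check_web_identity_env() {
  if [ -z "${AWS_ROLE_ARN:-}" ]; then
    echo "AWS_ROLE_ARN is not set" >&2
    return 1
  fi
  if [ -z "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
    echo "AWS_WEB_IDENTITY_TOKEN_FILE is not set" >&2
    return 1
  fi
  # -s is true only if the file exists and is not empty.
  if [ ! -s "${AWS_WEB_IDENTITY_TOKEN_FILE}" ]; then
    echo "token file is missing or empty: ${AWS_WEB_IDENTITY_TOKEN_FILE}" >&2
    return 1
  fi
}
```

Following the check with `aws sts get-caller-identity` in the job script prints the ARN of the assumed role session, which confirms the whole chain works end to end.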

Limiting Resource Access by GitLab Project

The above is all great, but we need to restrict which AWS resources each project can access.

No problem: you can add conditions to the AWS role trust relationship to limit which project can assume the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "GITLAB_FQDN:sub": "project_path:GROUP/PROJECT:*"
                }
            }
        }
    ]
}
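
When a condition like this does not match, it helps to see what the token actually contains. The sketch below decodes the payload segment of a JWT so the sub claim can be compared against the pattern in the trust policy. `jwt_payload` is a hypothetical helper; it assumes a `base64` that accepts `-d`, and it only decodes the token without verifying its signature.

```shell
# Print the decoded JSON payload (second segment) of a JWT.
jwt_payload() {
  # Take the second dot-separated segment.
  payload=$(printf '%s' "$1" | cut -d '.' -f 2)
  # Convert the base64url alphabet to standard base64.
  payload=$(printf '%s' "$payload" | tr '_-' '/+')
  # Restore the padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}
```

Calling `jwt_payload "${OIDC_TOKEN}"` in a job prints the claims, including sub. Avoid printing the raw token itself in job logs.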

And you can specify the resources in the permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DeployLambda",
            "Effect": "Allow",
            "Action": [
                "lambda:UpdateFunctionCode",
                "lambda:GetFunction"
            ],
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME*"
        }
    ]
}

But we don’t want to manually create a new role for every project.

What if we add a GitLabProject tag to all resources which matches the GitLab project path?

I spent a good amount of time trying to create a dynamic role to filter resource access based on the GitLab project and AWS tag, but I am fairly confident it isn’t possible.

So let’s find a way to create the roles automatically.

Automating Per Project Role Creation

Lots of trial and error led us to the following .gitlab-ci.yml to automate creation and deletion of the project-specific roles:

include:
  - component: $CI_SERVER_FQDN/components/release/release@master
    rules:
      - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

workflow:
  name: $PIPELINE_NAME
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      variables:
        PIPELINE_NAME: Merge request
    - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
      variables:
        PIPELINE_NAME: Release
    - if: $CREATE_ROLE == "1"
      variables:
        PIPELINE_NAME: Create role for $PROJECT_PATH
    - if: $DELETE_ROLE == "1"
      variables:
        PIPELINE_NAME: Delete role for $PROJECT_PATH

.base:
  stage: deploy
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  variables:
    AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_CREATION_ROLE_NAME
    POLICY_NAME: "gitlab-temp-policy-${PROJECT_ID}"
    ROLE_NAME: "gitlab-temp-role-${PROJECT_ID}"

create-role:
  extends: .base
  rules:
    - if: $CREATE_ROLE == '1'
  script:
    - |
      aws iam create-role \
        --role-name "${ROLE_NAME}" \
        --assume-role-policy-document '{
          "Version": "2012-10-17",
          "Statement": [{
            "Effect": "Allow",
            "Principal": {
              "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
              "StringLike": {
                "GITLAB_FQDN:sub": "project_path:'${PROJECT_PATH}':*"
              }
            }
          }]
        }'      
    - |
      aws iam put-role-policy \
        --role-name "${ROLE_NAME}" \
        --policy-name "${POLICY_NAME}" \
        --policy-document '{
          "Version": "2012-10-17",
          "Statement": [{
            "Effect": "Allow",
            "Action": ["lambda:UpdateFunctionCode","lambda:GetFunction"],
            "Resource": "*",
            "Condition": {
              "StringEquals": {
                "aws:ResourceTag/GitLabProject": "'${PROJECT_PATH}'"
              }
            }
          }]
        }'

delete-role:
  extends: .base
  rules:
    - if: $DELETE_ROLE == '1'
  script:
    - |
      aws iam delete-role-policy \
        --role-name "${ROLE_NAME}" \
        --policy-name "${POLICY_NAME}"      
    - aws iam delete-role --role-name "${ROLE_NAME}"
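
The inline JSON above splices ${PROJECT_PATH} in by closing and reopening the single-quoted string, which is easy to get wrong when editing. As an alternative sketch (not what the pipeline above uses), the permission policy can be rendered by a small function with a heredoc. The name `render_permission_policy` is hypothetical.

```shell
# Print the per-project permission policy for the given GitLab project path.
render_permission_policy() {
  project_path=$1
  # Unquoted heredoc delimiter, so ${project_path} is expanded.
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["lambda:UpdateFunctionCode", "lambda:GetFunction"],
    "Resource": "*",
    "Condition": {
      "StringEquals": {
        "aws:ResourceTag/GitLabProject": "${project_path}"
      }
    }
  }]
}
EOF
}
```

The output can then be passed as `--policy-document "$(render_permission_policy "${PROJECT_PATH}")"` to `aws iam put-role-policy`.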

And the corresponding CI component:

spec:
  inputs:
    allow_failure:
      type: boolean
      default: false
    aws_account_id:
      default: "ACCOUNT_ID"
    post_stage:
      default: .post
    pre_stage:
      default: .pre

---
create-temporary-role:
  stage: $[[ inputs.pre_stage ]]
  trigger:
    project: components/aws_authenticate
    strategy: depend
  variables:
    PROJECT_ID: $CI_PROJECT_ID
    PROJECT_PATH: $CI_PROJECT_PATH
    CREATE_ROLE: 1
  inherit:
    variables: false
  allow_failure: $[[ inputs.allow_failure ]]

remove-temporary-role:
  stage: $[[ inputs.post_stage ]]
  trigger:
    project: components/aws_authenticate
    strategy: depend
  variables:
    PROJECT_ID: $CI_PROJECT_ID
    PROJECT_PATH: $CI_PROJECT_PATH
    DELETE_ROLE: 1
  inherit:
    variables: false
  allow_failure: $[[ inputs.allow_failure ]]
  when: always

Replace the ACCOUNT_ID, GITLAB_FQDN, and ROLE_CREATION_ROLE_NAME placeholders. The role named by the last of these should have the following permission policy defined in AWS:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "iam:CreateRole",
                "iam:DeleteRole",
                "iam:PutRolePolicy",
                "iam:DeleteRolePolicy"
            ],
            "Resource": "*"
        }
    ]
}

It also needs the following trust relationship, so that only the role management project can assume it:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "GITLAB_FQDN:sub": "project_path:components/aws_authenticate:*"
                }
            }
        }
    ]
}

The End Result

  • A project to manage AWS infrastructure (with strict merge request approval rules).
  • A project to create (and destroy) ephemeral AWS IAM roles (with strict merge request approval rules).
  • Many projects each deploying their own AWS functions/apps.
  • A CI component to easily tie these things together.

Here’s how a project consuming the CI component looks:

include:
  - component: $CI_SERVER_FQDN/components/aws_authenticate/aws-authenticate@master

variables:
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/gitlab-temp-role-${CI_PROJECT_ID}

deploy:
  stage: deploy
  tags:
    - docker
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  needs:
    - job: create-temporary-role
  when: always
  script:
    - aws lambda update-function-code --function-name lf --zip-file fileb://lf.zip

[Figure: Pipeline with ephemeral IAM role creation and deletion]

What’s Next?

All of the above still needs to be locked down by branch!
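
A possible starting point, sketched rather than battle-tested: GitLab's sub claim has the form project_path:GROUP/PROJECT:ref_type:TYPE:ref:REF, so the trailing wildcard in the trust policy can be replaced with an exact branch match. The function name `render_branch_trust_policy` and its four arguments are hypothetical.

```shell
# Print a trust policy that only allows jobs running on one branch
# of one project to assume the role.
render_branch_trust_policy() {
  account_id=$1
  gitlab_fqdn=$2
  project_path=$3
  branch=$4
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${account_id}:oidc-provider/${gitlab_fqdn}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "${gitlab_fqdn}:sub": "project_path:${project_path}:ref_type:branch:ref:${branch}"
      }
    }
  }]
}
EOF
}
```

For example, `render_branch_trust_policy ACCOUNT_ID GITLAB_FQDN "${PROJECT_PATH}" main` would limit the role to pipelines on the main branch. StringEquals is used here because the match is exact; StringLike would still be needed for branch patterns.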

Shout out to Patrick Rice of the GitLab Core team, with whom I bounced around a bunch of ideas!