Connecting GitLab as an AWS Identity Provider and Improving CI/CD and Security
In a previous blog post we documented our first attempt. Having learnt a bit more since then, we describe our latest setup in this updated post.
Configuring GitLab as an Identity Provider (IdP) for AWS
Our GitLab instance was previously running on a non-standard port, which prevented us from using it as an IdP for AWS. We have since moved to the standard HTTPS port and set up the IdP connection.
The process was pretty straightforward:
- In AWS, open IAM > Identity Providers.
- Select Add provider.
- Select OpenID Connect as the Provider type.
- Enter the GitLab instance root URL in Provider URL and Audience.
This didn’t work on the first attempt. I think we were presented with the error:
Identity provider was not added.
Could not connect to https://gitlab.domain.tld
Fortunately, the GitLab docs for configuring OpenID Connect in AWS document some common troubleshooting steps. One common problem is a GitLab instance certificate chain that is not in the correct order.
After fixing the certificate ordering the IdP was successfully added to AWS.
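Before retrying in AWS, it can help to fetch the OpenID Connect discovery document yourself. The sketch below is only an illustration, not what AWS does internally. It assumes Python's standard library, the placeholder host gitlab.domain.tld from the error above, and the standard /.well-known/openid-configuration path; the function names are made up for this example. Python's TLS stack may be more forgiving about chain ordering than AWS, so a success here does not prove the chain is ordered correctly, but a certificate failure is a strong hint that something is wrong.

```python
import json
import ssl
import urllib.request


def discovery_url(provider_url: str) -> str:
    """Build the OIDC discovery URL for a provider root URL."""
    return provider_url.rstrip("/") + "/.well-known/openid-configuration"


def issuer_matches(provider_url: str, discovery_doc: dict) -> bool:
    """Check that the advertised issuer equals the Provider URL entered in AWS."""
    return discovery_doc.get("issuer", "").rstrip("/") == provider_url.rstrip("/")


def fetch_discovery(provider_url: str, timeout: float = 10.0) -> dict:
    """Fetch the discovery document with default (strict) certificate verification."""
    context = ssl.create_default_context()
    url = discovery_url(provider_url)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

Calling fetch_discovery("https://gitlab.domain.tld") and passing the result to issuer_matches gives a quick yes/no answer; an ssl.SSLCertVerificationError points at the certificate setup rather than at AWS.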
Configuring an IAM Role for GitLab to Assume
To leverage the connection between AWS and GitLab, we needed to create a new role. The role needs a Trust relationship configured to allow GitLab to use it.
We started with a default trust policy to make sure things were working:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
      },
      "Action": "sts:AssumeRoleWithWebIdentity"
    }
  ]
}
And attached an AWS managed permission policy, AWSLambda_FullAccess, to get started.
Be sure to replace the ACCOUNT_ID and GITLAB_FQDN placeholders.
Note: Try to follow a consistent naming convention when naming roles.
Configuring Terraform to Assume the Role
This took a bit of experimenting. Initially we were trying to configure authentication in the Terraform files, but it seems simpler to rely on environment variables.
So our Terraform config is just:
terraform {
  backend "http" {}
}
provider "aws" {}
The two required environment variables are:
- AWS_ROLE_ARN - The ARN of the IAM role created earlier.
- AWS_WEB_IDENTITY_TOKEN_FILE - The path to a file containing the identity token (JWT) generated by GitLab (in the id_tokens job definition).
AWS_REGION is also defined globally.
The environment variables can be set in the .gitlab-ci.yml directly or as project/group/instance variables (or in a number of other ways).
We created an instance-level CI variable for AWS_WEB_IDENTITY_TOKEN_FILE.
We also tested setting it in the .gitlab-ci.yml with:
- echo "${OIDC_TOKEN}" > /.oidc_token
- export AWS_WEB_IDENTITY_TOKEN_FILE="/.oidc_token"
Perhaps there is a better way? But I think it needs to be written to a file first.
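AWS_WEB_IDENTITY_TOKEN_FILE holds a path, so the token does have to live in a file. One possible refinement, sketched here in Python as an idea rather than something we run, is to create that file with owner-only permissions in a temporary directory instead of the filesystem root. The function name write_token_file is made up for illustration; OIDC_TOKEN is the variable from the id_tokens job definition.

```python
import os
import tempfile
from typing import Optional


def write_token_file(token: str, directory: Optional[str] = None) -> str:
    """Write the OIDC token to a new private file and return its path."""
    # mkstemp creates the file readable and writable only by the current user on POSIX.
    fd, path = tempfile.mkstemp(prefix="oidc_token_", dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return path
```

A job could then call write_token_file(os.environ["OIDC_TOKEN"]) and export the returned path as AWS_WEB_IDENTITY_TOKEN_FILE before running Terraform or the AWS CLI.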
The resulting .gitlab-ci.yml looks something like:
variables:
  AWS_REGION: eu-west-1
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME
  TF_ROOT: ""
  TF_STATE_NAME: default
terraform:
  image: "$CI_TEMPLATE_REGISTRY_HOST/gitlab-org/terraform-images/stable:latest"
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  script:
    - gitlab-terraform fmt
    - gitlab-terraform validate
    - gitlab-terraform plan
    - gitlab-terraform plan-json
    - gitlab-terraform apply
  artifacts:
    public: false
    paths:
      - ${TF_ROOT}/plan.cache
    reports:
      terraform: ${TF_ROOT}/plan.json
Be sure to replace the ACCOUNT_ID, ROLE_NAME, and GITLAB_FQDN placeholders.
Configuring the AWS CLI to Assume the Role
This is essentially the same as for Terraform.
The AWS CLI will pick up the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables.
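Because both tools rely on the same two variables, a job can fail fast with a readable message when one of them is missing. The following pre-flight check is a hypothetical helper, not part of our pipeline; the function name and messages are assumptions made for this sketch.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_web_identity_settings(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the settings look usable."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    # The variable must point at an existing file, not contain the token itself.
    if token_file and not os.path.isfile(token_file):
        problems.append(f"token file {token_file} does not exist")
    return problems
```

Passing os.environ to the function and exiting non-zero when the list is not empty avoids a less obvious credentials error later in the job.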
The resulting .gitlab-ci.yml looks something like:
variables:
  AWS_REGION: eu-west-1
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME
lambda-deploy:
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  script:
    - aws lambda update-function-code --function-name lf --zip-file fileb://lf.zip
Remember to replace the ACCOUNT_ID, ROLE_NAME, and GITLAB_FQDN placeholders.
Limiting Resource Access by GitLab Project
The above is all great, but we need to restrict which AWS resources each project can access.
No problem: you can add conditions to the AWS role trust relationship to limit which project can assume the role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "GITLAB_FQDN:sub": "project_path:GROUP/PROJECT:*"
        }
      }
    }
  ]
}
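When writing the condition it helps to see what the sub claim of a real token looks like. GitLab documents the format as project_path:{group}/{project}:ref_type:{type}:ref:{branch_name}. The sketch below decodes the payload of a JWT with Python's standard library. It deliberately skips signature verification, so it is only suitable for debugging, and the function name is an assumption made for this example.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url encoded with the padding stripped; restore it.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Printing decode_jwt_claims(token)["sub"] in a test job shows the exact string the StringLike pattern has to match. Avoid printing the raw token itself, since job logs may be visible to other users.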
And you can specify the resources in the permission policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DeployLambda",
      "Effect": "Allow",
      "Action": [
        "lambda:UpdateFunctionCode",
        "lambda:GetFunction"
      ],
      "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME*"
    }
  ]
}
But we don’t want to manually create a new role for every project.
What if we add a GitLabProject tag to all resources, with a value that matches the GitLab project path?
I spent a good amount of time trying to create a single dynamic role that filters resource access based on the GitLab project and AWS tag, but I am fairly confident it isn’t possible.
So let’s find a way to create the roles automatically.
Automating Per Project Role Creation
Lots of trial and error led us to the following .gitlab-ci.yml to automate creation and deletion of the project-specific roles:
include:
  - component: $CI_SERVER_FQDN/components/release/release@master
    rules:
      - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
workflow:
  name: $PIPELINE_NAME
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      variables:
        PIPELINE_NAME: Merge request
    - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
      variables:
        PIPELINE_NAME: Release
    - if: $CREATE_ROLE == "1"
      variables:
        PIPELINE_NAME: Create role for $PROJECT_PATH
    - if: $DELETE_ROLE == "1"
      variables:
        PIPELINE_NAME: Delete role for $PROJECT_PATH
.base:
  stage: deploy
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  variables:
    AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/ROLE_CREATION_ROLE_NAME
    POLICY_NAME: "gitlab-temp-policy-${PROJECT_ID}"
    ROLE_NAME: "gitlab-temp-role-${PROJECT_ID}"
create-role:
  extends: .base
  rules:
    - if: $CREATE_ROLE == '1'
  script:
    - |
      aws iam create-role \
        --role-name "${ROLE_NAME}" \
        --assume-role-policy-document '{
          "Version": "2012-10-17",
          "Statement": [{
            "Effect": "Allow",
            "Principal": {
              "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
              "StringLike": {
                "GITLAB_FQDN:sub": "project_path:'${PROJECT_PATH}':*"
              }
            }
          }]
        }'
    - |
      aws iam put-role-policy \
        --role-name "${ROLE_NAME}" \
        --policy-name "${POLICY_NAME}" \
        --policy-document '{
          "Version": "2012-10-17",
          "Statement": [{
            "Effect": "Allow",
            "Action": ["lambda:UpdateFunctionCode","lambda:GetFunction"],
            "Resource": "*",
            "Condition": {
              "StringEquals": {
                "aws:ResourceTag/GitLabProject": "'${PROJECT_PATH}'"
              }
            }
          }]
        }'
delete-role:
  extends: .base
  rules:
    - if: $DELETE_ROLE == '1'
  script:
    - |
      aws iam delete-role-policy \
        --role-name "${ROLE_NAME}" \
        --policy-name "${POLICY_NAME}"
    - aws iam delete-role --role-name "${ROLE_NAME}"
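The shell quoting in the create-role job works, but splicing ${PROJECT_PATH} into hand-written JSON is easy to get wrong. As an alternative sketch (not what our pipeline does), the two documents could be generated with a JSON library so that escaping is handled for you. The function names below are hypothetical; the policy contents mirror the ones in the job above.

```python
import json


def trust_policy(account_id: str, gitlab_fqdn: str, project_path: str) -> str:
    """Trust policy that lets only jobs from project_path assume the role."""
    document = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{gitlab_fqdn}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {f"{gitlab_fqdn}:sub": f"project_path:{project_path}:*"}
            },
        }],
    }
    return json.dumps(document)


def permission_policy(project_path: str) -> str:
    """Permission policy limited to Lambda functions tagged with the project path."""
    document = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["lambda:UpdateFunctionCode", "lambda:GetFunction"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {"aws:ResourceTag/GitLabProject": project_path}
            },
        }],
    }
    return json.dumps(document)
```

The returned strings can be passed to the same --assume-role-policy-document and --policy-document options used in the job.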
And the corresponding CI component:
spec:
  inputs:
    allow_failure:
      type: boolean
      default: false
    aws_account_id:
      default: "ACCOUNT_ID"
    post_stage:
      default: .post
    pre_stage:
      default: .pre
---
create-temporary-role:
  stage: $[[ inputs.pre_stage ]]
  trigger:
    project: components/aws_authenticate
    strategy: depend
  variables:
    PROJECT_ID: $CI_PROJECT_ID
    PROJECT_PATH: $CI_PROJECT_PATH
    CREATE_ROLE: 1
  inherit:
    variables: false
  allow_failure: $[[ inputs.allow_failure ]]
remove-temporary-role:
  stage: $[[ inputs.post_stage ]]
  trigger:
    project: components/aws_authenticate
    strategy: depend
  variables:
    PROJECT_ID: $CI_PROJECT_ID
    PROJECT_PATH: $CI_PROJECT_PATH
    DELETE_ROLE: 1
  inherit:
    variables: false
  allow_failure: $[[ inputs.allow_failure ]]
  when: always
Replace the ACCOUNT_ID, GITLAB_FQDN, and ROLE_CREATION_ROLE_NAME placeholders.
The role named by ROLE_CREATION_ROLE_NAME should have the following permission policy defined in AWS:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:PutRolePolicy",
        "iam:DeleteRolePolicy"
      ],
      "Resource": "*"
    }
  ]
}
And a trust relationship that only allows the role management project to assume it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/GITLAB_FQDN"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "GITLAB_FQDN:sub": "project_path:components/aws_authenticate:*"
        }
      }
    }
  ]
}
The End Result
- A project to manage AWS infrastructure (with strict merge request approval rules).
- A project to create (and destroy) ephemeral AWS IAM roles (with strict merge request approval rules).
- Many projects each deploying their own AWS functions/apps.
- A CI component to easily tie these things together.
Here’s how a project consuming the CI component looks:
include:
  - component: $CI_SERVER_FQDN/components/aws_authenticate/aws-authenticate@master
variables:
  AWS_ROLE_ARN: arn:aws:iam::ACCOUNT_ID:role/gitlab-temp-role-${CI_PROJECT_ID}
deploy:
  stage: deploy
  tags:
    - docker
  id_tokens:
    OIDC_TOKEN:
      aud: https://GITLAB_FQDN
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  needs:
    - job: create-temporary-role
  when: always
  script:
    - aws lambda update-function-code --function-name lf --zip-file fileb://lf.zip
What’s Next?
All of the above still needs to be locked down by branch!
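One possible direction, which we have not implemented yet, is to tighten the sub condition. GitLab documents the claim format as project_path:{group}/{project}:ref_type:{type}:ref:{branch_name}, so a pattern can name the branch instead of ending in a wildcard. The sketch below builds such a pattern and approximates how a StringLike condition would treat it. The function names are made up, and fnmatchcase is only a rough stand-in for IAM's matching (it also gives square brackets a special meaning, which IAM does not).

```python
from fnmatch import fnmatchcase


def branch_sub_pattern(project_path: str, branch: str) -> str:
    """Build a sub pattern that only matches jobs running on one branch."""
    return f"project_path:{project_path}:ref_type:branch:ref:{branch}"


def sub_allowed(sub: str, pattern: str) -> bool:
    """Approximate a case-sensitive StringLike match with * and ? wildcards."""
    return fnmatchcase(sub, pattern)
```

With the pattern from branch_sub_pattern("GROUP/PROJECT", "main"), a token issued for a feature branch no longer matches, whereas the current project_path:GROUP/PROJECT:* pattern accepts any ref.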
Shout out to Patrick Rice of the GitLab Core team, with whom I bounced around a bunch of ideas!