Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for assuming IAM role with web identity token #2585

Open
srisudarsan opened this issue May 31, 2023 · 11 comments · May be fixed by #2997
Open

Support for assuming IAM role with web identity token #2585

srisudarsan opened this issue May 31, 2023 · 11 comments · May be fixed by #2997
Labels
enhancement New feature or request

Comments

@srisudarsan
Copy link

Terragrunt currently only supports assume role for IAM roles, With the introduction of OIDC providers, we can assume IAM role with web identity token, which is officially supported by https://github.com/aws-actions/configure-aws-credentials

With iam_role in terragrunt, can we also have support to assume IAM role with web identity token with the token or the token file passed as an additional input ?

If this can be supported, we can directly run teragrunt on github actions without a need for using configure-aws-credentials action. This helps in assuming maintaining multiple roles in multiple modules

@srisudarsan srisudarsan added the enhancement New feature or request label May 31, 2023
@mrines
Copy link

mrines commented Jun 23, 2023

Yes please

@Cwagne17
Copy link

Yes! I am having to use a workaround right now. I use OIDC for my CICD provider and this would be huge!

@syphernl
Copy link

Yes! I am having to use a workaround right now. I use OIDC for my CICD provider and this would be huge!

Would you mind sharing your workaround?

@botagar
Copy link

botagar commented Jul 27, 2023

This is a big thing for me.
We have a complex setup of accounts and being able to utilise OIDC will greatly reduce the overhead needed to maintaining cross account relationships.

@syphernl
Copy link

I'd like to share our workaround. We have this in our root terragrunt.hcl:

# Generate an AWS provider block
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
variable "oidc_web_identity_token" {
  type        = string
  description = "OIDC Web Identity Token used by AWS Provider."
  default     = null
}

provider "aws" {
  region = "${local.region}"

  assume_role_with_web_identity {
    role_arn           = "arn:aws:iam::${var.account_id}:role/${var.role_name}"
    web_identity_token = var.oidc_web_identity_token
  }
}
EOF
}

Before we trigger the pipeline we write the OIDC token we obtained to terraform.auto.tfvars which gets read by Terraform and thus fills in the oidc_web_identity_token var.

echo "oidc_web_identity_token = \"${OIDC_TOKEN}\"" > terraform.auto.tfvars

This is not native Terragrunt like iam_role is but it gets the job done. We rolled this out last week and it works perfectly for our use-case.

@Cwagne17
Copy link

Cwagne17 commented Jul 28, 2023

@botagar @syphernl I ended up actually finding this in the Terraform documentation. Dynamic Provider Credentials.

This allowed me to setup terraform cloud as a custom Identity Provider in AWS, then when I ran my terragrunt configuration with the terraform cloud (using terraform cloud API token) I didn't have to do any setup of the AWS credentials. The IAM role used for authentication was setup in the project variables.

Similarly this should be possible with a CICD provider such as GitHub actions or CircleCI.

Check out the proof of concept I worked on. Terragrunt TFC integration
You can see that in my .circleci configuration I didn't have to set the AWS credentials up. Nor did I need to in the live/terragrunt.hcl file.

@jhrr
Copy link

jhrr commented Oct 3, 2023

If this can be supported, we can directly run teragrunt on github actions without a need for using configure-aws-credentials action.

Has anyone actually even managed to get Terragrunt to assume the OIDC role along with using aws-actions/configure-aws-credentials?

I have not had any success and the same seems to be true in this issue (dating from back in 2021 no less). If you've managed to get it to work it would be great to know how you did it.

So, even having Terragrunt work with OIDC via configure-aws-credentials without any workarounds would be good for me. OIDC is now pretty much the recommended way to grant temporary auth with your provider in github actions workflows so this really ought to work cleanly with Terragrunt.

Is there any way we promote this issue in priority? I'm not a Go developer myself but if anyone has any insight in how to patch this maybe we can make an effort work on it.

Thanks for all the effort on the project, it's very much appreciated.

@botagar
Copy link

botagar commented Oct 4, 2023

I couldn't get TG to use TF remote state using OIDC.
I created a small but realistic test stack and set the provider to be TF 1.6-beta3 but TG never seemed to try access the state in the correct account. I verified that the generated backend block was all correct and was able to see that the role I has setup was assumed, but then TG dropped the ball in the last stretch by trying to access the state bucket in our Agent Hosting account (which doesn't exist). All things I confirmed through cloudwatch and cloudtrail.

Same stack without TG, i was able to use TF remote state via Assume Web Role AND deploy resources across accounts via Assume Web Role.
I used assume_role_with_web_identity in the provider block, and the same in the back-end block.
See TF 1.6 RC1 here
As for passing the token into that block, in our pipeline we developed a small utility (node app) to get the OIDC token and turned it into a shared GH action.
In the Job, we save the token to a file and have that file path set in the assume_role_with_web_identity block.

As for what we're doing right now for the rest of our stuff already deeply in TG, we're breaking out the jobs into their respective target AWS accounts and calling aws-actions/configure-aws-credentials in each Job for each account role.
We then target the appropriate directories for each account using the --terragrunt-include-dir parameter.

This is a major pain though, and defeats one of the main reasons we introduced TG into our tech stack in the first place.
The promise of being able to just shotgun all our infra out there in a single command and have the whole dependency graph taken care of for us was quite the selling point, but now it's another layer ontop of what we were doing back when we were pure TF.
If OIDC support isn't going to be even considered in TG for a while, we might need to consider going back to pure TF.
This would be very unfortunate as we just wrapped up moving everything into TG, but well, the universe loves pulling pranks.

I don't see much discussion happening here on this feature tbh, so maybe we're a tiny minority who want to use OIDC.
OIDC or Assume Web Identity aren't anywhere in the Roadmap so I don't think we should hold our breath for this feature.

I too am not a GO dev and I just haven't had the time to be able to sit down with the TG codebase to understand what's going on and how it does all it's role assumptions.

TG as a tool so far has been great. But no OIDC is a bit of a show stopper.

@jhrr
Copy link

jhrr commented Oct 4, 2023

Thanks for the reply @botagar. I can't say I disagree with any of what you say in general, it's definitely a bit of a weird oversight that OIDC doesn't work yet transparently and I can see how it might be nearly a deal breaker and very frustrating for anyone who has sunk a lot of time and effort into Terragrunt and then discovers OIDC is not a first-class citizen.

Yesterday though after a bit of a slog I did manage to get Terragrunt running with OIDC using the same kind of method as described above with the assume_role_with_web_identity block in the provider. I had to make some alterations to get it working though so I'll post a summary of the code for posterity (the source is not OSS yet but it will be eventually) in case anyone else needs OIDC in their workflow.

# terragrunt.hcl

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
# Workarounds for terragrunt's (current) inability to handle OIDC role chaining.
variable "oidc_session_name" {
  description = "The OIDC session name."
  type        = string
  default     = null
}

variable "oidc_web_identity_token" {
  description = "The OIDC session token."
  type        = string
  default     = null
}

provider "aws" {
  allowed_account_ids = ["${var.aws_account}"]
  region = "${var.aws_region}"

  dynamic "assume_role_with_web_identity" {
    for_each = var.oidc_web_identity_token == null ? {} : { oidc_enabled = true }

    content {
      role_arn                = "arn:aws:iam::${var.aws_account}:role/${var.gha_oidc_role_name}"
      session_name            = var.oidc_session_name
      web_identity_token      = var.oidc_web_identity_token
    }
  }
EOF
}

Here we use variables with default null values and a dynamic block for assume_role_with_web_identity. I had to come up with this extra check using dynamic otherwise I would end up with a validation error where Terraform complains that either web_identity_token or web_identity_token_file have to be set in the assume_role_with_web_identity block. By using dynamic we can get around this in a context-sensitive way while still having the variable set to null which is clean.

Then as described by the OP, in the pipeline what we will do is write oidc_session_name and oidc_web_identity_token to terraform.auto.tfvars and then in the CI/CD context those values will be picked up by Terraform and the assume_role_with_web_identity block will be used.

So make sure you have a something.auto.tfvars file somewhere in your working tree without those values in it that you can use for this purpose. If we are just running terraform locally then it will fallback to whatever credentials you have configured (no autovars). Obviously you'll need to configure your OIDC provider and role so that the ARN in role_arn matches whatever you inject via the pipeline.

In your Github actions (for example) we need to do the rest of the dynamic injection and call Terragrunt. (This was adapted from a larger workflow file so forgive any typos or mistakes.)

# deploy.yml

name: Provision
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
permissions:
  contents: write
  id-token: write  # OIDC permission.
env:
  ACCOUNT: 12345678
  REGION: us-east-1
  ROLE: gha-oidc-role.  # Make sure this matches your OIDC role name in terraform. 
  ROLE_SESSION: GithubActionsProvision
  TF_ROOT: ./tf  # Wherever the root of your terraform configuration is located in your repo.

jobs:
  provision:
    name: 'Provision via OIDC'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - id: set-oidc-role
        run: |
          echo "ROLE=arn:aws:iam::${{ env.ACCOUNT }}:role/${{ env.ROLE }}" >> $GITHUB_OUTPUT

      - uses: aws-actions/configure-aws-credentials@v4
        id: oidc-credentials
        with:
          aws-region: ${{ env.REGION }}
          role-to-assume: ${{ steps.set-oidc-role.outputs.ROLE }}
          role-session-name: ${{ env.ROLE_SESSION }}
          output-credentials: true

      - id: write-oidc-vars
        env:
          AUTO_VARS: terraform.auto.tfvars  # The name of your autovars file.
        run: |
          vars=${{ env.TF_ROOT }}/${{ env.AUTO_VARS }}
          token=${{ steps.oidc-credentials.outputs.aws-session-token }}
          session=${{ env.ROLE_SESSION }}
          echo "oidc_session_name = \"${session}\"" >> ${vars}
          echo "oidc_web_identity_token = \"${token}\"" >> ${vars}

      - uses: hashicorp/setup-terraform@v2
        with:
          terraform_wrapper: false

      - id: setup-terragrunt
        uses: autero1/action-terragrunt@latest
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - id: tf-plan-oidc
        env:
           STACK: staging  # Set this to whatever configuration directory you need.
        run: terragrunt run-all plan --terragrunt-working-dir ${{ env.TF_ROOT }}/${{ env.STACK }}

      - id: tf-apply-oidc
        env:
           STACK: staging  # Set this to whatever configuration directory you need.
        run: terragrunt run-all apply --terragrunt-working-dir ${{ env.TF_ROOT }}/${{ env.STACK }}

Obviously this is quite a lot of heavy lifting to have to do, but it works and Terragrunt is provisioning via OIDC in the pipeline.

I wouldn't try and provision an entirely new stack in this way. I suppose it might be possible to do so (although obviously it wouldn't work unless you provisioned the OIDC resources at least beforehand), but I build my stack out manually first module by module before activating any kind of automation over it. The workflow above is intended to be used only for updates during a CI/CD process on an existing set of resources. So, just ensure the OIDC role has the correct permissions to access the dynamodb tables and buckets with state that already exist and with something like the above you should be off to the races.

@botagar
Copy link

botagar commented Oct 4, 2023

Thanks for sharing that @jhrr !
I got to a similar place to where you got to, I stopped before playing around with dynamic blocks though.
I hadn't actually thought of using the dynamic blocks that way.

If first class OIDC support were at the very least acknowledged on the roadmap, we could probably stick with the interim solution we have now.

Maybe it's time to learn a little GO too... 😅

@partcyborg partcyborg linked a pull request Mar 13, 2024 that will close this issue
4 tasks
@partcyborg
Copy link

FYI: I just sent out a PR that implements this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants