token is not refreshed when running in k8s pod connected to serviceaccount #736

Open
coolstim opened this issue Feb 7, 2024 · 8 comments
Labels: bug (Something isn't working)

Comments

@coolstim

coolstim commented Feb 7, 2024

Mountpoint for Amazon S3 version

1.1.0

AWS Region

us-east-1

Describe the running environment

Running on EKS
mount-s3 runs in a container that is part of a pod using a serviceaccount.
The serviceaccount is annotated with eks.amazonaws.com/role-arn=arn:aws:iam::xxx:role/mounts3role
The mounts3role role has the needed permissions on an S3 bucket.
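
To confirm the environment described above from inside the container, a small check like the following may help. It is a minimal sketch, assuming the pod uses IAM roles for service accounts, where EKS injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE variables. The function name `check_irsa_env` is made up for illustration and is not part of Mountpoint.

```python
import os
from typing import Mapping

# Variables that EKS injects into pods whose serviceaccount carries the
# eks.amazonaws.com/role-arn annotation.
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_irsa_env(environ: Mapping[str, str] = os.environ) -> dict:
    """Report which IRSA variables are set and whether the token file exists.

    Hypothetical helper for debugging; not part of mount-s3.
    """
    report = {name: environ.get(name) for name in IRSA_VARS}
    token_file = report["AWS_WEB_IDENTITY_TOKEN_FILE"]
    report["token_file_exists"] = bool(token_file) and os.path.isfile(token_file)
    return report


if __name__ == "__main__":
    for key, value in check_irsa_env().items():
        print(f"{key}: {value}")
```

If either variable is missing, the credentials are probably coming from somewhere else (for example the node's instance role), which would change what "token expiration" means for this report.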

Expected behavior:
mount-s3 retains access to the s3 bucket even when the process runs longer than the token expiration time.

Actual behavior:
mount-s3 loses access to the s3 bucket when the process runs longer than the token expiration time.
It seems that the token refresh is not implemented correctly.

Mountpoint options

mount-s3 -f test /s3/bucket --allow-other --auto-unmount --read-only --region us-east-1 --prefix source-data/

What happened?

mount-s3 loses access to the s3 bucket when the process runs longer than the token expiration time.

Relevant log output

No response

@coolstim coolstim added the bug Something isn't working label Feb 7, 2024
@sauraank
Contributor

sauraank commented Feb 7, 2024

Can you please share the logs from around the time mount-s3 loses access to the S3 bucket? What error do you receive? Thanks.

@coolstim
Author

coolstim commented Feb 7, 2024

I don't have the logs anymore, but I received an HTTP 403 error.
If you really need them, I can probably simulate it again.

@sauraank
Contributor

sauraank commented Feb 7, 2024

Yes. Could you please simulate it again, and provide the logs?
Please run mount-s3 with the --debug CLI flag. You can find more details on logging at our logging page.

@dannycjones
Contributor

Hey @coolstim. Please use the --debug-crt flag as well as --debug. This will provide us with more detail on what the underlying AWS Common Runtime (CRT) client is doing and if it attempts to renew the token.

@coolstim
Author

At the moment we have abandoned mountpoint-s3 because of performance issues. We now use FSx for Lustre with a data repository association to S3, which performs a whole lot better.
If we ever get back to this, I will rerun it with the given flags and post my findings here.

@dannycjones
Contributor

Glad you were able to move forward for your use case. I do want to make sure we solve this issue for anyone else who may face it, so I will leave this open for now and we'll investigate further on our side.

> mount-s3 loses access to the s3 bucket when the process runs longer than the token expiration time.

Are you able to confirm which token you believed was expiring? Was it the web identity token, the IAM session, or unclear?
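
One way to tell the two apart might be to look at the expiry of the web identity token itself and compare it with the time of the first 403. The token EKS mounts for a serviceaccount is a JWT, so its `exp` claim can be read without any AWS calls. The following is a rough sketch only: it does not verify the signature, and the helper names are invented for illustration.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT. The signature is NOT verified."""
    payload_b64 = token.strip().split(".")[1]
    # JWT segments drop base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(claims["exp"])


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Seconds left before the token expires (negative if already expired)."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now


if __name__ == "__main__":
    import os

    # AWS_WEB_IDENTITY_TOKEN_FILE is set by EKS inside the pod.
    with open(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"], encoding="utf-8") as f:
        print(f"web identity token expires in {seconds_until_expiry(f.read()):.0f}s")
```

If access is lost well before the web identity token's `exp`, the IAM session is the more likely suspect. If access is lost around or after it, the token file re-read is worth a closer look.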

@coolstim
Author

unclear tbh

@dannycjones
Contributor

I don't have anything to share on investigating this issue right now, but I'm noting down some of the thoughts @vladem and I had last week on this issue.


  • EKS provides the container with the AWS_WEB_IDENTITY_TOKEN_FILE environment variable (e.g. described here). The CRT does appear to support it in aws_credentials_provider_sts_web_identity_options.
  • When the web identity token is provided in the file named by AWS_WEB_IDENTITY_TOKEN_FILE, we expect the CRT to periodically exchange it for AWS credentials using the AssumeRoleWithWebIdentity API.
  • The web identity token can have an expiry, and so this also needs to be refreshed from time to time - presumably by re-reading from the file.
  • What could conceivably be going wrong? (i.e. we need to double check that these are working as expected)
    • Maybe we're not fetching the token from AWS_WEB_IDENTITY_TOKEN_FILE at the correct cadence.
    • Maybe we're not fetching new AWS credentials when they expire.

Next steps

  • We should try and reproduce the issue. Maybe use an artificially short session so we can force the AWS credentials refresh quickly. How can we create the web identity token? Maybe just go and create a proper EKS cluster.
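
As a starting point for the short-session idea, the token exchange can be exercised directly from Python inside the pod. This is a sketch under assumptions: it calls STS through boto3 rather than through Mountpoint or the CRT, so it only shows what a short-lived session looks like and does not change what mount-s3 itself requests. The session name is made up, and 900 seconds is the lowest DurationSeconds that STS accepts.

```python
from pathlib import Path

MIN_DURATION_SECONDS = 900  # lowest DurationSeconds accepted by STS


def build_request(role_arn: str, token_path: str,
                  duration_seconds: int = MIN_DURATION_SECONDS) -> dict:
    """Build AssumeRoleWithWebIdentity parameters from the mounted token file."""
    if duration_seconds < MIN_DURATION_SECONDS:
        raise ValueError(f"DurationSeconds must be >= {MIN_DURATION_SECONDS}")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": "mountpoint-repro",  # hypothetical name
        "WebIdentityToken": Path(token_path).read_text(encoding="utf-8").strip(),
        "DurationSeconds": duration_seconds,
    }


def exchange(params: dict) -> dict:
    """Call STS and return the temporary credentials (needs boto3 and network access)."""
    import boto3  # third-party; imported lazily so build_request works without it

    sts = boto3.client("sts")
    return sts.assume_role_with_web_identity(**params)["Credentials"]


if __name__ == "__main__":
    import os

    params = build_request(os.environ["AWS_ROLE_ARN"],
                           os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"])
    print("session expires at", exchange(params)["Expiration"])
```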
