Utils' 'download_folder' function doesn't use sagemaker session's s3 resource #4663

Open
DarrenStack opened this issue May 8, 2024 · 0 comments

DarrenStack commented May 8, 2024

Describe the bug
Hi all!

I'm using SageMaker and overriding the S3 resource to point at LakeFS, similar to what is shown here, with different AWS credentials for the S3 endpoint. Mostly this works well, but I have run into an issue when running any processing or training job that uses the 'download_folder' function from the utils package under the hood.

I pass my 'sagemaker_session', which carries my custom S3 resource, to this function. However, the download_folder code then instantiates a new S3 resource that has neither the correct keys nor the correct endpoint configuration:

s3 = boto_session.resource("s3", region_name=boto_session.region_name)

I might be missing something, but could sagemaker_session.s3_resource be used above instead? When a Session is created, it looks like it initializes the same S3 resource by default:

self.s3_resource = self.boto_session.resource("s3", region_name=self.boto_region_name)
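
As a rough illustration of the change suggested above, the lookup could prefer the resource already attached to the session and only build a default one when none is set. This is a minimal sketch, not the SDK's actual code: `resolve_s3_resource` is a hypothetical helper name, and it assumes only that the session exposes `s3_resource` and `boto_session` as in the two lines quoted above.

```python
def resolve_s3_resource(sagemaker_session):
    """Hypothetical helper: pick the S3 resource download_folder should use.

    Prefers the resource attached to the session (which the caller may have
    overridden) and falls back to a freshly built default otherwise.
    """
    s3 = getattr(sagemaker_session, "s3_resource", None)
    if s3 is not None:
        # Honors a custom endpoint or credentials set on the session.
        return s3

    # Fallback: same construction that download_folder performs today.
    boto_session = sagemaker_session.boto_session
    return boto_session.resource("s3", region_name=boto_session.region_name)
```

With something like this in place, download_folder would call `resolve_s3_resource(sagemaker_session)` instead of always building a new resource, so an overridden `session.s3_resource` would be respected.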

To reproduce
Create a SageMaker session and overwrite the S3 resource after initializing the session.

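import boto3
import sagemaker
from sagemaker import utils

# S3 resource pointing at a non-default endpoint with its own credentials
s3_resource = boto3.resource(
    's3',
    endpoint_url=different_endpoint_url,
    aws_access_key_id=different_aws_access_key_id,
    aws_secret_access_key=different_aws_secret_access_key,
)

session = sagemaker.Session(
    boto3.Session(),
    s3_endpoint_url=different_endpoint_url,
)

# Override the S3 resource that the session created by default
session.s3_resource = s3_resource

# download_folder builds its own S3 resource and ignores the override
utils.download_folder('bucket', 'path', 'target', session)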

Expected behavior
The S3 resource that was assigned to the session should be used.
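
Until that is the case, one possible workaround is to download the prefix directly through the session's own resource. The sketch below is an assumption-laden stand-in, not a drop-in replacement for download_folder: `download_prefix` is a hypothetical name, it treats the prefix strictly as a folder (no single-file handling), and it relies only on the standard boto3 resource calls `Bucket(...)`, `objects.filter(Prefix=...)` and `download_file(...)`.

```python
import os


def download_prefix(s3_resource, bucket_name, prefix, target):
    """Hypothetical workaround: download all objects under a folder prefix
    using the S3 resource supplied by the caller."""
    bucket = s3_resource.Bucket(bucket_name)

    # Treat the prefix as a folder so "path" does not also match "path2/...".
    folder = prefix.strip("/")
    folder = folder + "/" if folder else ""

    downloaded = []
    for obj in bucket.objects.filter(Prefix=folder):
        relative_key = obj.key[len(folder):]
        # Skip the folder placeholder itself and nested "directory" keys.
        if not relative_key or relative_key.endswith("/"):
            continue

        local_path = os.path.join(target, *relative_key.split("/"))
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
        bucket.download_file(obj.key, local_path)
        downloaded.append(local_path)

    return downloaded
```

For the reproduction above, this would be called as `download_prefix(session.s3_resource, 'bucket', 'path', 'target')`, which keeps the custom endpoint and credentials in effect.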

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.217.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):
  • Framework version:
  • Python version: 3.8
  • CPU or GPU:
  • Custom Docker image (Y/N):

DarrenStack changed the title from "Utils' 'download_folder' function doesn't use sagemaker sessions'" to "Utils' 'download_folder' function doesn't use sagemaker sessions' s3 resource" on May 8, 2024
DarrenStack changed the title from "Utils' 'download_folder' function doesn't use sagemaker sessions' s3 resource" to "Utils' 'download_folder' function doesn't use sagemaker session's s3 resource" on May 8, 2024