[question] Quilt and min.io? #1941

Open
matheusmota opened this issue Nov 27, 2020 · 15 comments
Comments

@matheusmota

Hi there. Is it possible, or will it become possible, to use Quilt with on-premises S3-compatible solutions like min.io? AWS or other cloud services may be unavailable in some scenarios.

Thanks!

@akarve
Member

akarve commented Nov 27, 2020

It is possible, and we have min.io on the roadmap. We invite you to try Quilt with a min.io endpoint and file bugs for any problems you encounter, as we have yet to formalize support. In theory, min.io's S3-compatible API means it should just work; in practice it is not that simple because of assumptions in the code and/or missing features in min.io.

@matheusmota
Author

Glad to hear that. I will definitely try it and let you know the results.

Thanks

@matheusmota
Author

One suggestion: a small how-to would encourage more interested people to test it.

@akarve
Member

akarve commented Nov 27, 2020 via email

@Midnighter

@matheusmota I know your issue is only from six days ago but have you already given this a shot? Did you by any chance take some notes that you are willing to share if you started on this?

Either way, I will try to set up MinIO on a local node and address it with quilt in the coming days.

@Janus-Xu

Waiting for MinIO support. Great job!

@zerafachris

@matheusmota @Midnighter Any updates on min.io? Maybe you can share your experience? I am currently considering giving Quilt a go, but I only have min.io available.

@Midnighter

I briefly tried and was not successful. I haven't been able to give it a more serious attempt since then.

@marcodlk

marcodlk commented Nov 1, 2022

@akarve I am trying to establish Quilt as a core component of the data infrastructure at our research org. AWS is a non-starter for us, so I am slowly filling in the AWS-dependent gaps with MinIO compatibility, starting with the quilt3 Python package, initially as a standalone that does not rely on a registry server. I quickly hacked together a solution that mainly involves modifying the S3ClientProvider._build_client method to create a client with endpoint_url specified. Currently I check an environment variable for the endpoint URL: if it is set, I create the client with that endpoint URL; otherwise the client is created the same old way.

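quilt3/data_transfer.py

from botocore.client import Config

class S3ClientProvider:
    ...
    def _build_client(self, get_config):
        session = self.get_boto_session()
        # Helper not shown in this comment: per the text above, it reads the
        # endpoint URL from an environment variable and is falsy when unset.
        endpoint_url = getenv_s3_endpoint_url()
        if endpoint_url:
            # Custom endpoint (e.g. MinIO): SigV4-signed client for that URL.
            return session.client(
                's3',
                config=Config(signature_version='s3v4'),
                endpoint_url=endpoint_url,
            )
        # No endpoint configured: build the client the same old way.
        return session.client('s3', config=get_config(session))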
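
The getenv_s3_endpoint_url helper called in the snippet above is not shown in the thread. A minimal sketch of what such a helper might look like follows; the environment variable name QUILT_S3_ENDPOINT_URL and the URL validation are assumptions made for illustration, not something quilt3 defines.

```python
import os
from typing import Optional
from urllib.parse import urlparse

# Hypothetical variable name: the thread does not say which one was used.
ENDPOINT_ENV_VAR = "QUILT_S3_ENDPOINT_URL"


def getenv_s3_endpoint_url() -> Optional[str]:
    """Return the custom S3 endpoint URL from the environment, or None if unset."""
    url = os.environ.get(ENDPOINT_ENV_VAR, "").strip()
    if not url:
        return None
    parsed = urlparse(url)
    # Reject values such as "localhost:9000" that lack an http(s) scheme.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{ENDPOINT_ENV_VAR} must be an http(s) URL, got {url!r}")
    return url
```

Failing early on a malformed value is a design choice in this sketch: it surfaces a typo in the variable at client-creation time rather than as an opaque connection error later.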

As for credentials, I currently edit the CREDENTIALS_PATH file with MinIO user credentials, and it works fine.

This is just a starting implementation and far from optimal, but I'm wondering whether this standalone MinIO-compatible mode is something you're interested in supporting in the quilt3 Python package, and whether you have any ideas about what to consider in the design.

Thanks!

@akarve
Member

akarve commented May 31, 2023

@marcodlk Nice workaround, and directionally correct (sorry for the slow reply). Our plan is for the next-gen client (already in the works, and it will be open source) to abstract the providers a little so that, at first, any compatible object store (GCP, Azure, MinIO) can be interposed. That is the long-term solution, and we don't have code just yet. Want to join our Slack so we can discuss further? Thank you.

@sir-sigurd
Member

@marcodlk

With boto3>=1.28.0 you can use the AWS_ENDPOINT_URL_S3 environment variable to customize the endpoint URL.
See https://docs.aws.amazon.com/sdkref/latest/guide/feature-ss-endpoints.html.
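
As a rough illustration of this suggestion, the sketch below sets AWS_ENDPOINT_URL_S3 together with the standard boto3 credential variables before any S3 client is created. The endpoint http://localhost:9000 and the credential values are placeholders for a local MinIO server, and the helper names minio_env and list_bucket_names are hypothetical. Whether every quilt3 code path honors the variable has not been verified here.

```python
import os
from typing import Dict, List


def minio_env(endpoint_url: str, access_key: str, secret_key: str,
              region: str = "us-east-1") -> Dict[str, str]:
    """Build environment variables pointing boto3 (>= 1.28.0) at a custom S3 endpoint."""
    return {
        "AWS_ENDPOINT_URL_S3": endpoint_url,
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "AWS_DEFAULT_REGION": region,
    }


def list_bucket_names() -> List[str]:
    """List buckets using whatever endpoint and credentials the environment provides."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    s3 = boto3.client("s3")  # no endpoint_url argument: it comes from the environment
    return [bucket["Name"] for bucket in s3.list_buckets()["Buckets"]]


def main() -> None:
    # Placeholder endpoint and credentials for a local MinIO server.
    os.environ.update(minio_env("http://localhost:9000", "minioadmin", "minioadmin"))
    print(list_bucket_names())
```

Calling main() requires a reachable MinIO server and boto3>=1.28.0; the variables must be set before the client is created.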

@link89

link89 commented Aug 31, 2023

Hi @marcodlk, can you share the diff of the change you made? It looks like quilt never accesses the credentials.json file.

@marcodlk

marcodlk commented Sep 7, 2023

@link89 I no longer have access to the codebase I was working on, but looking at the code, quilt3.session._load_credentials still uses CREDENTIALS_PATH, so that's odd. Are you sure you are looking at the "credentials.json" in the Quilt app directory, as specified by BASE_PATH in the quilt3.util module? Have you tried @sir-sigurd's solution?

@akarve
Member

akarve commented Sep 8, 2023

For min.io support we hopefully don't need to touch credentials.json, as that file is for the special case where users authenticate to a Quilt stack. In the more general case, quilt3 just falls back on the boto3 credential chain (and never touches credentials.json). That path applies in more cases, especially for pure open source users.

@kevinemoore
Contributor

Here is a draft PR (#3765) that allows users to create their own S3 clients (including min.io clients) and map them to specific buckets.

We'd appreciate any feedback on the interface. This isn't necessarily the best way for Quilt to find and access min.io servers. Please let us know how you think Quilt should map min.io endpoints and bucket names.
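
To make the mapping question concrete, here is one possible shape for a bucket-to-endpoint mapping. It is only a sketch for discussion and is not the interface of the draft PR: the class BucketEndpointRegistry, its methods, and the example bucket and endpoint names are all hypothetical.

```python
from typing import Dict, Optional


class BucketEndpointRegistry:
    """Hypothetical mapping from bucket name to S3 endpoint URL."""

    def __init__(self, default_endpoint: Optional[str] = None):
        # None means "use the default AWS endpoint resolution".
        self._default = default_endpoint
        self._by_bucket: Dict[str, str] = {}

    def register(self, bucket: str, endpoint_url: str) -> None:
        """Route all requests for `bucket` to `endpoint_url`."""
        self._by_bucket[bucket] = endpoint_url

    def endpoint_for(self, bucket: str) -> Optional[str]:
        """Return the endpoint registered for `bucket`, else the default."""
        return self._by_bucket.get(bucket, self._default)

    def client_for(self, bucket: str):
        """Create a boto3 S3 client bound to the endpoint for `bucket`."""
        import boto3  # lazy import: the mapping logic itself needs only the stdlib

        return boto3.client("s3", endpoint_url=self.endpoint_for(bucket))


# Example: one bucket lives on a MinIO server, everything else stays on AWS.
registry = BucketEndpointRegistry()
registry.register("lab-data", "http://minio.internal:9000")
```

Keying the mapping on bucket name keeps s3:// URIs unchanged for users, at the cost of requiring bucket names to be unique across the configured endpoints.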
