Add timeout to Ceph GET API calls #900

Open
karthik-us opened this issue Jul 10, 2023 · 2 comments

@karthik-us

This issue tracks the changes needed in go-ceph to address the ceph-csi issue ceph/ceph-csi#3657.

Provide a way to configure a timeout for the Ceph Get API calls, so that a command does not hang when there is a problem between the Ceph cluster and the CSI driver (cluster health, slow ops, or a brief network connectivity problem).

For more information, please refer to the ceph-csi issue.

@phlogistonjohn
Collaborator

Can you be more specific about what APIs you mean? When I read "Get API calls" I think RGW (HTTP) APIs, but when I look at the linked issue it doesn't seem to be RGW specific.

The APIs that wrap C calls from Ceph do not support things like Go's contexts, so the typical methods for timing out in Go do not work. There are some timeout-related parameters in the Ceph configuration that you could apply to a rados connection. You'd probably need to experiment with them to see what works for your use case (if any).

@karthik-us
Author

karthik-us commented Jul 13, 2023

Hi @phlogistonjohn, the problem we are trying to solve is that the CSI pod hangs when something is wrong in the Ceph cluster or there are network problems. In such cases, manually restarting the pod is the only fix available at the moment. So we are trying to add timeouts to such CSI calls (mainly the get calls). If it is possible to do that directly on rados, that would be great. Otherwise, we might need to write wrappers around the get calls to handle it. Some more context on this can be found here (a bit old though).

Thanks for your input on the timeout-related parameters in the Ceph configuration. Let me check whether those can be useful here.
