Incremented connection delay are not of the stated duration #2519
Comments
Hello, and sorry for the delay. Despite the various details, I'm lacking a minimal reproducible example (https://stackoverflow.com/help/minimal-reproducible-example) to properly analyse your problem. However, I think the explanation below may be enough.

Timeouts are the maximum amount of time allowed to perform an operation, but if the connection outright fails, because Elasticsearch hasn't started listening on its port, then the connection will fail instantly. And (crucially) we don't currently sleep on retries, so you can easily get 10 successive connection attempts that all fail, with the whole attempt finishing after 100ms.

Now, in the second case, my understanding from your StackOverflow post is that you used

So my understanding is that if we added proper exponential back-off (with sleep) between connection retries, then your issue would be solved. In the meantime,

Does that sound right?
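For readers who want to see what "exponential back-off with sleep" between connection attempts could look like on the caller's side, here is a minimal sketch. It is not the client's retry logic: `retry_with_backoff`, `base`, `cap`, `max_attempts`, `retry_on`, and `sleep` are hypothetical names chosen for illustration, and `connect` stands for any callable that raises an exception when the server is not yet listening.

```python
import time


def retry_with_backoff(connect, max_attempts=10, base=1.0, cap=30.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call connect() until it succeeds, sleeping between failed attempts.

    Illustrative helper only; all parameter names are assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time (base, 2*base, 4*base, ...) up to cap.
            sleep(min(cap, base * 2 ** attempt))
```

Because the helper actually sleeps between attempts, a connection that is refused instantly no longer burns through every retry within a few milliseconds. The exception types to retry on are left as a parameter, since they depend on the client in use.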
Closing as I've not heard back from you.
Hello @pquentin thanks for getting back and apologies for not responding (was AFK).
Elasticsearch Version
8.11.0
Installed Plugins
No response
Java Version
OS Version
Problem Description
The incremented connection delays described in the logs are not of the stated duration. I expected something like 1, 2, 4, 8, 16, 30, 30, 30, 30, that is, retry logic with exponential backoff. Instead, the connection aborts in a second or so. This happens inside freshly started Docker containers (where the Elasticsearch container is not yet fully up and running), based on the official Elasticsearch and Python images, using the py-elasticsearch client v. 8.12.1.
The behavior changed when I added an alternative delay in Python, see https://stackoverflow.com/questions/78261325/why-does-flask-elasticsearch-timeout-duration-differ-between-docker-pause-and-do
To be clear, as soon as the containerized Elasticsearch is completely up and running, the Flask container is able to connect very quickly.
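The expected schedule quoted in the description (1, 2, 4, 8, 16, 30, 30, 30, 30) matches a doubling delay capped at 30. As a rough illustration, assuming a base of 1 and a cap of 30 (the function and parameter names below are hypothetical, not taken from the client), the sequence can be reproduced like this:

```python
def backoff_delays(attempts, base=1, cap=30):
    """Return the delay before each retry: base * 2**i, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


delays = backoff_delays(9)
print(delays)       # [1, 2, 4, 8, 16, 30, 30, 30, 30]
print(sum(delays))  # 151
```

If those values were seconds and each delay were actually waited out, nine retries would take about two and a half minutes, which is far from the roughly one second observed.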
Steps to Reproduce
I followed steps described in these two tutorials:
except that I did not connect to localhost but to host.docker.internal.
Logs (if relevant)
where delay duration > 0.01s:
where delay duration ~ 10s: