Connection timeout between master and worker is too high #415

Open
jjsaunier opened this issue Aug 11, 2018 · 6 comments
Comments

@jjsaunier
Contributor

jjsaunier commented Aug 11, 2018

Currently the timeout between master and worker is hardcoded to 10s. For this kind of "internal connection" that is too high: we don't want to wait 10s to find out whether a worker is available or has crashed before moving on to the next one. 300ms would be better. The faster it fails, the faster it recovers.

What do you think?
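
To make the proposal concrete, here is a minimal sketch of fail-fast failover in Python rather than PHP. It is not PHP-PM's actual code: the function name `connect_to_first_available`, the worker address list, and the 0.3s default are all assumptions chosen for illustration.

```python
import socket


def connect_to_first_available(workers, connect_timeout=0.3):
    """Return (socket, address) for the first worker that accepts a TCP connection.

    `workers` is a list of (host, port) tuples. `connect_timeout` is in seconds
    and only bounds the connection attempt, not any later request handling.
    """
    for address in workers:
        try:
            sock = socket.create_connection(address, timeout=connect_timeout)
        except OSError:
            # Refused, unreachable, or timed out: give up on this worker
            # quickly and try the next one.
            continue
        return sock, address
    raise ConnectionError("no worker accepted a connection")
```

With a short timeout, a crashed or unreachable worker costs at most `connect_timeout` seconds before the next one is tried.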

@mathieudz
Contributor

I agree. In general, php-pm should fail faster in order to be able to recover at all in some circumstances. A lot can go wrong in 10 seconds.

@andig
Contributor

andig commented Aug 11, 2018

Shouldn't it be long enough to cater for the startup time of a worker, which might be slow?

@jjsaunier
Contributor Author

This is the connection timeout only. The TCP handshake does not take long and does not depend on the application running inside the worker. And if it does, then a worker that is still starting should not be considered ready.
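
The distinction between the connection timeout and the time spent waiting on the application can be sketched as two separate deadlines. This is a Python illustration under assumptions, not PHP-PM's implementation: `open_worker_connection` and the `read_timeout` parameter (with its 10s default) are hypothetical names and values.

```python
import socket


def open_worker_connection(address, connect_timeout=0.3, read_timeout=10.0):
    """Connect with a short handshake deadline, then allow a longer read deadline."""
    # connect_timeout bounds only the TCP handshake, which completes in the
    # kernel and does not depend on how busy the application is.
    sock = socket.create_connection(address, timeout=connect_timeout)
    # After connecting, switch to a separate deadline for waiting on the
    # application's response, which can legitimately take longer.
    sock.settimeout(read_timeout)
    return sock
```

Keeping the two values separate means a short connection timeout does not penalize a worker that is reachable but slow to answer.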

@andig andig added the core label Aug 24, 2018
@acasademont
Contributor

@Prophet777 what would be the ideal timeout?

@jjsaunier
Contributor Author

I don't know if there is an ideal timeout; it depends on the usage. I think 300ms is a reasonable default, since it is only used internally between worker and master, and it would fit most use cases. Making it configurable would be ideal. Also, 300ms is enough for a TCP connection (local ~25ms RTT), so if this deadline is exceeded, something is wrong.

To be honest, I don't really know what a good default timeout would be. The mindset is "the faster it fails, the faster we try another worker", and the faster we deliver the response to the client, so 300ms sounds OK to me. Also, if a cascading failure happens, say 3 workers down in a row, it would take at least 900ms before getting a response. That sounds reasonable for a degraded app.
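
The cascade arithmetic above can be checked with a small Python sketch. The function name is hypothetical, and it assumes each failed worker consumes the full connection timeout before the next one is tried.

```python
def worst_case_failover_delay_ms(failed_workers, connect_timeout_ms):
    """Time spent on connection attempts that each run into the timeout, in ms."""
    return failed_workers * connect_timeout_ms


print(worst_case_failover_delay_ms(3, 300))     # prints 900
print(worst_case_failover_delay_ms(3, 10_000))  # prints 30000
```

With the current 10s value, the same three failures would cost 30 seconds before a healthy worker is even contacted.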

@visitek
Contributor

visitek commented Aug 12, 2021

The ideal timeout would be 10s by default, but it MUST be configurable. We need to set the value higher.

I don't agree with a hard-coded value; it does not scale.
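
One way a configurable timeout with a 10s default could look, sketched in Python. The option name `--worker-connect-timeout` is purely hypothetical and is not an existing PHP-PM flag; the default simply mirrors the currently hardcoded value mentioned in this issue.

```python
import argparse

# Seconds. Mirrors the hardcoded value described in this issue.
DEFAULT_CONNECT_TIMEOUT = 10.0


def parse_connect_timeout(argv):
    """Read the connection timeout from command-line arguments, with a default."""
    parser = argparse.ArgumentParser()
    # Hypothetical option name, used for illustration only.
    parser.add_argument(
        "--worker-connect-timeout",
        type=float,
        default=DEFAULT_CONNECT_TIMEOUT,
        help="seconds to wait for a TCP connection to a worker",
    )
    args = parser.parse_args(argv)
    if args.worker_connect_timeout <= 0:
        parser.error("--worker-connect-timeout must be positive")
    return args.worker_connect_timeout
```

Users who want fail-fast behaviour could then pass a value such as `0.3`, while everyone else keeps the existing behaviour.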

5 participants