Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to share a single redis connection across forks within a single worker #1650

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

isnotajoke
Copy link

Resque will (unless I'm missing something obvious) intentionally disconnect & reconnect its redis connection across forks. This is safe, but results in a lot of connection churn, which can become an issue in environments with a lot of tasks. The redis-rb library exposes a configuration option that allows parent and child processes to share a single redis connection as long as they don't both attempt to use it at the same time. Resque seems to guard against this, so using the option should be safe. This PR enables this by:

  • Introducing a share_redis configuration option. If truthy, Resque will refrain from explicitly cycling its redis connection after a fork.
  • Adding that option to the README
  • Adding a test.

In local testing in our environment, this change works properly (there aren't any new errors in our jobs), and reduces our connection churn dramatically.

@oehlschl
Copy link

Thanks @isnotajoke! Came across this while also looking for Resque optimizations and it seems like it could be helpful for us as well.

I'm happy to pilot this branch for our jobs but want to make sure I'm not missing something super obvious. @rafaelfranca any thoughts as an expert?

Thanks again. I'll start playing around with this locally today.

Copy link
Contributor

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this can and probably should be the default, rather than something to turn on only for high traffic deployments.

How about flipping the flag and calling it something like Resque.reconnect_redis_per_job?, following the fork_per_job? language? Defaulting to false.

Let's explicitly check that inherit_socket is true in this case, too. (Redis client will raise InheritedError, but it's not clear what to fix in Resque configuration.)

@pieterza
Copy link

pieterza commented Nov 5, 2018

+1

@pieterza
Copy link

pieterza commented Mar 25, 2019

Any update on this one @jeremy ? I'm happy to have a go at it unless you feel like responding to the changes requested @isnotajoke

@isnotajoke
Copy link
Author

(apologies for the radio silence; haven't had time to take another look at this)

The main reason I opted to not make it the default is that it's a change from previous behavior, and (without in depth knowledge of how Resque's releases are structured), I tend to prefer releasing new behavior as default off initially, changing it to on in subsequent releases if desired (and if there aren't significant bugs or other issues). Happy to defer to convention if we prefer default on here, though.

Good call on the inherit_socket check; will try to make that change this weekend.

@pieterza
Copy link

Any update on this one? :)

@isnotajoke isnotajoke force-pushed the fork-redis-disconnect branch 2 times, most recently from efb8e5d to 534368b Compare June 22, 2019 18:45
@jgao54
Copy link

jgao54 commented Jul 9, 2020

Very interested in this feature as well.

@iloveitaly
Copy link
Contributor

I'd love to have this feature merged. How can I help get it done?

README.markdown Outdated
If you have an environment with a lot of tasks or special resource constraints
around your Redis installation, you can configure Resque to share a redis
connection between pre- and post-fork processes. This removes some of the
safety provided by Resque's forking model, but can help moderate load and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isnotajoke could you expand more on:

This removes some of the safety provided by Resque's forking model

What risks are there? What things should the user look out for when using this option?

Also, do you know if this causes any issues with resque worker pools? I don't think detailing that out is a requirement for getting this merged, but if you happen to know, I think it would be great to add this in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't observed any specific risks (or things to look out for).

Decoupling task executions from one another (by forking, by sharing relatively little between parent & child processes across the fork) makes Resque very tolerant of even very destructive failures within individual tasks; the individual task execution may fail, but the worker process will remain functional. Adding more shared state across forks (as this feature does when enabled) chips away at that safety. Here, I'd be worried about tasks failing or exiting in a way that leaves the underlying Redis connection in a weird state, potentially causing issues for the worker process as it looks for work & updates queue state in redis, as well as for individual job executions. We never saw that in our local testing, but it is a risk (and something to consider if turning this on for a mature system with a lot of existing Resque tasks).

I think the explanation above is a lot for a README, but I'll try to add a little bit more detail on the risk to this change.

lib/resque.rb Outdated
# fresh Redis connection. if false, Resque workers will instantiate a
# single Redis connection and share it across forks with individual workers.
# a false setting tends to be more efficient in terms of connection overhead,
# but may not work with all client libraries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which client libraries does it not work with? Maybe we could be more explicit here and mention we only recommend using this with redis-rb?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only tested with redis-rb. Agree that calling that out ("we're confident it works for redis-rb; you should test thoroughly with other libraries") is a good move.

Copy link
Contributor

@iloveitaly iloveitaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple comments on documenting operational usage, but code-wise this PR looks great.

@isnotajoke
Copy link
Author

@iloveitaly thanks for the feedback. I'll make some updates based on your suggestions.

While we did briefly test this in our infrastructure, we ultimately addressed the underlying concern (connection churn) at an infrastructure level rather than with this patch. I still think the idea makes sense, but I'd be more confident in this PR if someone could vouch for it working in their environment. (I can help with some basic sanity checking in my environment, but can't commit to a specific date for having that done).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants