
cluster_async: Wrap internal state with Arc. #864

Merged · 4 commits merged into redis-rs:main from split-cluster on Jun 21, 2023

Conversation

nihohit
Contributor

@nihohit nihohit commented Jun 13, 2023

This change allows us to send the internal state of the client to async functions without the self reference imposing a lifetime.
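The pattern behind this change can be sketched as follows. This is a minimal illustration, not redis-rs's actual code: `InnerCore` and `ClusterClient` here are simplified stand-ins, and a std thread stands in for an async task (the lifetime issue is the same — a spawned task needs `'static` data, which a borrowed `&self` can't provide, but a cloned `Arc` can):

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical, simplified stand-in for the client's internal state;
// not redis-rs's actual `InnerCore`.
struct InnerCore {
    nodes: Vec<String>,
}

struct ClusterClient {
    core: Arc<InnerCore>,
}

impl ClusterClient {
    // Because the state is behind an `Arc`, we can hand a clone to a
    // `'static` task without borrowing `self` for the task's lifetime.
    fn spawn_work(&self) -> thread::JoinHandle<usize> {
        let core = Arc::clone(&self.core);
        thread::spawn(move || core.nodes.len())
    }
}

fn main() {
    let client = ClusterClient {
        core: Arc::new(InnerCore {
            nodes: vec!["node1:6379".into(), "node2:6379".into()],
        }),
    };
    let handle = client.spawn_work();
    assert_eq!(handle.join().unwrap(), 2);
    println!("ok");
}
```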

@nihohit
Contributor Author

nihohit commented Jun 15, 2023

@jaymell looks like we have some more flakiness - 2 test suites ran for hours. https://github.com/redis-rs/redis-rs/actions/runs/5279416076/jobs/9550056369?pr=864 https://github.com/redis-rs/redis-rs/actions/runs/5279416076/jobs/9550056536?pr=864

@jaymell
Contributor

jaymell commented Jun 16, 2023

Kicked off the jobs again. Have we seen this issue outside of this PR? I don't recall.

@nihohit
Contributor Author

nihohit commented Jun 16, 2023

@jaymell I'm not sure; I think I might've noticed this in the past.

  1. One action seems to be stuck again.
  2. The actions view doesn't show any cancelled/timed-out actions, including the two I linked to, so there might be more which simply aren't visible in the view.

@nihohit
Contributor Author

nihohit commented Jun 16, 2023

But both runs seem to have been stuck on test_async_cluster_basic_failover, so this branch might've added a deadlock there. I'll investigate.

@nihohit
Contributor Author

nihohit commented Jun 16, 2023

I believe I found & solved the deadlock.

@nihohit nihohit force-pushed the split-cluster branch 2 times, most recently from a8d1d90 to 1b3b366 Compare June 18, 2023 07:21
@jaymell
Contributor

jaymell commented Jun 19, 2023

Those deadlocks in the prior test failures have me a bit nervous. I think these latest changes look good, but I'm no longer confident I understand the risks of having an RwLock or Mutex in our implementation here.

Also, what was your reasoning for switching to RwLock?

@nihohit
Contributor Author

nihohit commented Jun 20, 2023

> I'm no longer confident I understand the risks of having an RwLock or Mutex in our implementation here.

The risk of using a Mutex / RwLock is always that a deadlock might be possible. On the other hand, the risk of not doing this is that the async cluster client won't behave correctly, or will execute fan-out commands serially instead of concurrently. IMO, since the test suite managed to find several issues, I'm feeling pretty confident about this.
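The deadlock risk discussed here typically comes from holding a lock guard across long-running work (in async code, across an `.await`). A minimal sketch of the safe scoping pattern, using hypothetical state and std threads in place of async tasks:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Hypothetical shared state, standing in for the client's internals.
    let state = Arc::new(Mutex::new(vec![1, 2, 3]));

    // Safe pattern: copy what you need out of the lock, then drop the
    // guard *before* any long-running work. Holding the guard across a
    // blocking call (or an `.await` in async code) is how deadlocks arise.
    let len = {
        let guard = state.lock().unwrap();
        guard.len()
    }; // guard dropped here

    let handle = {
        let state = Arc::clone(&state);
        thread::spawn(move || {
            // This lock succeeds because the main thread already
            // released its guard.
            state.lock().unwrap().push(4);
        })
    };
    handle.join().unwrap();

    assert_eq!(len, 3);
    assert_eq!(state.lock().unwrap().len(), 4);
    println!("ok");
}
```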

> what was your reasoning for switching to RwLock?

The ability to hold multiple read locks during try_request. Although the locks are held very briefly each time, I would still prefer not to block concurrent requests at all.
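The tradeoff described here can be sketched with std's `RwLock` (a simplified illustration, not redis-rs's actual code): many readers proceed concurrently, while a writer takes exclusive access only for the duration of an update.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Hypothetical slot map guarded by an RwLock.
    let slots = Arc::new(RwLock::new(vec!["node1:6379", "node2:6379"]));

    // Multiple threads can hold read locks at the same time, so
    // concurrent requests consulting the map don't serialize.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let slots = Arc::clone(&slots);
            thread::spawn(move || slots.read().unwrap().len())
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), 2);
    }

    // A writer (e.g. a slot-map refresh) takes exclusive access and
    // waits until all readers are gone.
    slots.write().unwrap().push("node3:6379");
    assert_eq!(slots.read().unwrap().len(), 3);
    println!("ok");
}
```

With a plain `Mutex`, each of those reads would have blocked the others even though none of them mutates anything.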

@nihohit
Contributor Author

nihohit commented Jun 20, 2023

rebased @jaymell

@jaymell
Contributor

jaymell commented Jun 21, 2023

Sorry, rebase is again required, but let's get this in!

This change allows us to send the internal state of the client to async
functions without the `self` reference imposing a lifetime.
This solves a timing issue where, after calling `refresh_slots`, there
are old `InnerCore` objects with connection maps that are emptied or
no longer relevant. Now the map is updated under a mutex, and no new
`InnerCore` instances are created.
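The fix described in the commit message can be sketched roughly like this (hypothetical, simplified types — not redis-rs's actual definitions): the connection map lives behind a lock *inside* the shared core, so a refresh mutates it in place rather than building a new core that older `Arc` holders would never see.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical, simplified stand-in for the shared core.
struct InnerCore {
    connections: RwLock<Vec<String>>,
}

// Refreshing replaces the map under the lock; every existing `Arc`
// holder sees the update, and no stale `InnerCore` is left behind.
fn refresh_slots(core: &Arc<InnerCore>, new_map: Vec<String>) {
    *core.connections.write().unwrap() = new_map;
}

fn main() {
    let core = Arc::new(InnerCore {
        connections: RwLock::new(vec!["old:6379".into()]),
    });
    let other_holder = Arc::clone(&core); // e.g. an in-flight request

    refresh_slots(&core, vec!["new1:6379".into(), "new2:6379".into()]);

    // The older handle observes the refreshed map.
    assert_eq!(other_holder.connections.read().unwrap().len(), 2);
    println!("ok");
}
```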
@nihohit
Contributor Author

nihohit commented Jun 21, 2023

@jaymell rebased

@jaymell jaymell merged commit 76b6993 into redis-rs:main Jun 21, 2023
9 checks passed
@nihohit nihohit deleted the split-cluster branch June 21, 2023 15:09
altanozlu pushed a commit to altanozlu/redis-rs that referenced this pull request Aug 16, 2023

2 participants