Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add thread pool and concurrency max threads configuration option #472

Draft
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

MattFenelon
Copy link
Contributor

@MattFenelon MattFenelon commented Mar 27, 2024

This option allows to limit the maximum number of resources that can be
sideloaded concurrently. With a properly configured connection pool,
this ensures that the activerecord's connection pool is not exhausted by
the sideloading process.

The thread pool configuration is based on ActiveRecord's
global_thread_pool_async_query_executor.

This was previously attempted but there were reports of deadlocks.

TODO

  • Fix deadlock
  • Fix rails integration
  • Rails integration test
  • Close adapter

Closes #469

See: https://github.com/rails/rails/blob/94faed4dd4c915829afc2698e055a0265d9b5773/activerecord/lib/active_record.rb#L279-L287
See: #470

…handle on what's causing thread pool hangs. refs graphiti-api#469"

This reverts commit 7941b6f.
This option allows to limit the maximum number of resources that can be
sideloaded concurrently. With a properly configured connection pool,
this ensures that the activerecord's connection pool is not exhausted by
the sideloading process.

The thread pool configuration is based on ActiveRecord's
global_thread_pool_async_query_executor.

This was previously attempted but there were reports of deadlocks. This
code differs from the original by using Concurrency::Delay assigned to a
constant instead of a regular Mutex. The Delay+constant is how
concurrent-ruby sets up their global thread pools so it's more likely to
be correct.

Closes graphiti-api#469

See: https://github.com/rails/rails/blob/94faed4dd4c915829afc2698e055a0265d9b5773/activerecord/lib/active_record.rb#L279-L287
See: graphiti-api#470
@jkeen
Copy link
Collaborator

jkeen commented Mar 28, 2024

Still getting a lockup. This sounded possibly related: ruby-concurrency/concurrent-ruby#933

And it looks like graphiti only requires concurrency-ruby ~> 1.0, when really it should be > 1.2 to include that fix? However, I tried including that and trying it out and it didn't seem to help.

But one thing i realized: one request that's causing a hang has 13 sideloads… which is a lot, admittedly. But still, if there are more sideloads than there are threads, I think that might be a trouble spot?

@MattFenelon
Copy link
Contributor Author

Still getting a lockup. This sounded possibly related: ruby-concurrency/concurrent-ruby#933

And it looks like graphiti only requires concurrency-ruby ~> 1.0, when really it should be > 1.2 to include that fix? However, I tried including that and trying it out and it didn't seem to help.

But one thing i realized: one request that's causing a hang has 13 sideloads… which is a lot, admittedly. But still, if there are more sideloads than there are threads, I think that might be a trouble spot?

Thanks for testing that, @jkeen. Good find on that bug. Upgrading the gem is definitely something to consider but as you say, it doesn't seem to have solved the issue.

I'm surprised we're having issues as the code is very similar to that used by ActiveRecord for the new async queries. So perhaps there's something in the way that Graphiti is calling the thread pool that is causing problems. I suspect that's more the case then there being a bug in concurrent-ruby's thread pool.

13 sideloads seems a lot but the default thread pool has 4 threads and a queue of 4 * 4 that (25) and if the queue is expired it's meant to use the calling thread to run the request. I'm wondering if that the :caller_runs option might be where the problem lies.

Are all of the sideloads running ActiveRecord queries?

I think I might need to add some logging to debug the issue.

@jkeen
Copy link
Collaborator

jkeen commented Mar 28, 2024

All the sideloads run active record queries, and some are pulling nested relationships off the sideloads, as in includes=items,items.comments, etc.

I think figuring out a good logging solution would be great improvement—I dug around the ruby-concurrency docs briefly and couldn't make heads or tails of how one might log when a new thread is getting created or hung up. But if we had some good logs this issue might be much easier to diagnose.

As far as the other options for fallback, I haven't tried :abort or :drop. I would assume abort would make the request throw a 500, and drop would… return incomplete data?

This include some logging for diagnostics to show what's happening when
the deadlock occurs.

```
Graphiti::Scope
  #resolve_sideloads
    when the requested sideload exists on the resource
      with concurrency
        with nested sideloads greater than Graphiti.config.concurrency_max_threads
thread 6220: employees queuing positions
thread 6220: employees waiting on [:positions]
thread 6240: running positions
thread 6240: positions queuing department
thread 6240: positions waiting on [:department]
          does not deadlock (FAILED - 1)

Failures:

  1) Graphiti::Scope#resolve_sideloads when the requested sideload exists on the resource with concurrency with nested sideloads greater than Graphiti.config.concurrency_max_threads does not deadlock
     Failure/Error: expect { instance.resolve_sideloads(results) }.not_to raise_error

       expected no Exception, got #<fatal:"No live threads left. Deadlock?\n2 threads, 2 sleeps current:0x00007f7e6f7b1780 main thread:...or.rb:339 sleep_forever>\n   rb_thread_t:0x00007f7e6f7b1780 native:0x000070000cfb4000 int:0\n   \n"> with backtrace:
         # ./lib/graphiti/scope.rb:78:in `resolve_sideloads'
         # ./spec/scope_spec.rb:145:in `block (7 levels) in <top (required)>'
         # ./spec/scope_spec.rb:145:in `block (6 levels) in <top (required)>'
     # ./spec/scope_spec.rb:145:in `block (6 levels) in <top (required)>'
```
@MattFenelon
Copy link
Contributor Author

I've added a failing test case that shows the deadlock:

e4de93e

@jkeen
Copy link
Collaborator

jkeen commented Apr 12, 2024

Excellent!! Nice work on that 🥇

…s-configuration-option

# Conflicts:
#	lib/graphiti/scope.rb
Not sure if I want to keep this but adding for now.
Rails assigns a connection to a thread so the close call has to happen
on that same thread for the thread's connection to be closed.
@resource.resolve will return either a promise or a real value
I misunderstood what @resource.resolve would call, assuming that the
`scope.to_a` referred to the Scope class. It actually refers to the
base_scope which for AR will be an AR query.
Pass the arguments to the promise so it runs with the correct values.

I've also disabled the adapter close for now as I'm not sure how to
close it without breaking other promises that might run on the same
thread in the pool.
@MattFenelon
Copy link
Contributor Author

@jkeen do you know how to get the rails tests to run locally? I'm struggling to debug the failures there.

I've tried running:

$ bundle exec appraisal install

Which fails when trying to install the sqlite3 gem

An error occurred while installing sqlite3 (1.4.4), and Bundler cannot continue.

The installed sqlite3 is the default that comes with macOS

$ sqlite3 --version
3.32.2 2021-07-12 15:00:17 bcd014c473794b09f61fbc0f4d9488365b023f16123b278dbbd49948c27calt2

I've also tried using a later version.

@jkeen
Copy link
Collaborator

jkeen commented May 21, 2024

@MattFenelon Sqlite is notorious for throwing build errors and I feel like I've spent countless hours of my life troubleshooting system-specific compilation errors with sqlite with ruby and node projects, and with the time in that space I should be an expert, and yet every time it comes along I'm no better equipped than I was the last time.

Did you try installing the gem separately first? This has also helped me in the past: https://github.com/sparklemotion/sqlite3-ruby/blob/main/INSTALLATION.md

@MattFenelon
Copy link
Contributor Author

@MattFenelon Sqlite is notorious for throwing build errors and I feel like I've spent countless hours of my life troubleshooting system-specific compilation errors with sqlite with ruby and node projects, and with the time in that space I should be an expert, and yet every time it comes along I'm no better equipped than I was the last time.

Did you try installing the gem separately first? This has also helped me in the past: https://github.com/sparklemotion/sqlite3-ruby/blob/main/INSTALLATION.md

Sounds about right 😅

I've managed to work around it by upgrading to version 2 of the sqlite3 gem as that uses native binaries. I also had to upgrade rails to the main branch version (8) but that's got me to a point where I can debug the rails tests.

To debug the rails tests. I'll remove this once the tests are passing.
I'll remove this once the PR is ready
This is a slightly different implementation due to the use of promises
for both async and sync modes. The difference is whether the thread pool
is set to synchronous or not.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ActiveRecord::ConnectionTimeoutError "all pooled connections were in use" after enabling concurrency
2 participants