prune: limit concurrency based on file descriptors #3842
base: main
Conversation
✨ Looks good!
When there's a small number of file descriptors, we can exhaust them all during a prune, causing it to fail. This is common on macOS, which has a default limit of 256 file descriptors. Use a heuristic to allocate FDs based on the limit; on Windows, which does not support getrlimit(2), use a heuristic based on the number of CPUs, as before.
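The heuristic described above can be sketched as a small Go helper. This is an illustrative reconstruction, not the PR's actual code; the function and parameter names are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// maxPruneThreads sketches the heuristic: assume each concurrent
// operation may need up to 64 file descriptors, reserve 64 for other
// purposes, and never drop below one thread. haveLimit is false where
// getrlimit(2) is unavailable (e.g. Windows), in which case we keep
// the old CPU-based heuristic.
func maxPruneThreads(fdLimit int64, haveLimit bool) int64 {
	if !haveLimit {
		return int64(runtime.NumCPU() * 2)
	}
	maxThreads := (fdLimit - 64) / 64
	if maxThreads < 1 {
		return 1
	}
	return maxThreads
}

func main() {
	// The macOS default of 256 FDs yields (256-64)/64 = 3 threads.
	fmt.Println(maxPruneThreads(256, true))
}
```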
if maxThreads < 1 {
	return 1
}
return maxThreads
Shouldn't this be min(maxThreads, runtime.NumCPU() * 2)?
No, the old heuristic was to run twice as many goroutines as CPUs. We can spawn arbitrarily many goroutines and the Go scheduler will only run as many as there are threads to run them (which is usually the number of CPUs).
The only limiting factor that's required here is file descriptors; the actual CPU usage is handled by the scheduler. We retain the old check just to handle cases where we can't get a good value.
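The pattern described here can be sketched as follows (illustrative only, not the PR's code): spawn one goroutine per item and let a semaphore sized by the FD heuristic bound how many are inside the FD-using section at once, leaving CPU scheduling to the Go runtime.

```go
package main

import (
	"fmt"
	"sync"
)

// peakConcurrency spawns one goroutine per item, gates the "FD-using"
// section with a buffered-channel semaphore of size maxThreads, and
// returns the highest number of goroutines observed inside that
// section at once.
func peakConcurrency(items, maxThreads int) int {
	sem := make(chan struct{}, maxThreads)
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for i := 0; i < items; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{} // acquire a slot before opening any files
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			// ... FD-using work (open, prune, close) would go here ...
			mu.Lock()
			inFlight--
			mu.Unlock()
			<-sem // release the slot
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// With the macOS-default heuristic value of 3, no more than 3
	// goroutines ever run the FD-using section concurrently.
	fmt.Println(peakConcurrency(100, 3) <= 3)
}
```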
	return int64(runtime.NumCPU() * 2)
}
// Assume 64 FDs for each operation and leave 64 for other purposes.
maxThreads := (int64(fds) - 64) / 64
Is it possible to set unlimited file descriptors? If yes, will this math behave properly?
It is indeed possible to set the value to RLIM_INFINITY. Assuming that's a large positive number, the math will work fine. If it's a negative number, the heuristic will limit us to 1 thread, which, while not ideal, will also work. I believe it's a large positive number in most places.
The reason this hasn't merged is that there's some evidence we're not actually limiting the number of processes to the value computed in this pull request, and I need to look more into why that is. Adjusting the threshold isn't going to be effective if it isn't an effective control on the number of processes we spawn.
Fixes #3752.