Set CPU scheduling affinity #995
base: master
/azp run libertem.libertem-data
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master     #995      +/-   ##
==========================================
+ Coverage   68.83%   69.90%   +1.07%
==========================================
  Files         260      260
  Lines       11950    12392     +442
  Branches     1640     1765     +125
==========================================
+ Hits         8226     8663     +437
- Misses       3408     3411       +3
- Partials      316      318       +2
Continue to review full report at Codecov.
Test failure is a timeout, maybe #996 should be merged before this one.
/azp run libertem.libertem-data
Azure Pipelines successfully started running 1 pipeline(s).
Could LiberTEM be seeing cores in CI on which it cannot actually run anything? For example, if a Docker container is limited to specific cores (https://thorsten-hans.com/docker-container-cpu-limits-explained#assign-containers-to-dedicated-cpus), would it still "see" the other CPUs and then try to pin workers to them, where they would never run? I'll close and reopen this PR to trigger a re-run and see whether this was a glitch.
Uhh, that could be possible. Maybe we can't blindly use the "CPU number" from 0 to N from the cluster spec for the affinity; we may have to build up a list of "eligible" core IDs and index into that list instead. We can also leave this PR open for a while until we find a solution, as it's not high priority right now.
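To make the "eligible core IDs" idea concrete, here is a minimal sketch, not the code in this PR. The helper names `eligible_cores`, `core_for_worker` and `pin_current_process` are hypothetical. It assumes Linux, where `os.sched_getaffinity` reports the set of cores the process may actually run on, and it assumes that a container restricted via a cpuset shows up in that set. On platforms without these calls it falls back to `0..N-1` and skips pinning.

```python
import os


def eligible_cores():
    """Return the sorted core IDs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux only: the affinity mask of the current process (pid 0 = self).
        return sorted(os.sched_getaffinity(0))
    # Assumption for other platforms: all cores 0..N-1 are usable.
    return list(range(os.cpu_count() or 1))


def core_for_worker(worker_index, cores):
    """Map a 0..N worker index onto an eligible core ID, wrapping around."""
    return cores[worker_index % len(cores)]


def pin_current_process(worker_index, cores=None):
    """Pin the calling process to one eligible core.

    Returns the chosen core ID, or None if pinning is unsupported here.
    """
    cores = eligible_cores() if cores is None else cores
    core = core_for_worker(worker_index, cores)
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {core})
        return core
    return None
```

With this mapping, worker 0 in a container limited to cores 2 and 3 would be pinned to core 2 rather than to an unreachable core 0. Wrapping around with the modulo means that more workers than eligible cores end up sharing cores instead of failing.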
At least with the docker method described in the article,
Discussion with @sk1p: Use cases for pinning:
An MVP could be opt-out pinning, i.e. a choice between default pinning and no pinning. The second step would be customizable pinning (proposal by @sk1p):
The third step would be a pinning setup for CPUs plus multiple CUDA devices. Since that is pretty complex, perhaps one could develop a routine that proposes a pinning scheme, for example by measuring transfer rates or latencies between specific cores and GPUs?
/azp run libertem.libertem-data
passed