Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow config to define thread count #7442

Merged
merged 6 commits into from Jan 20, 2022
Merged

Conversation

M1ke
Copy link
Contributor

@M1ke M1ke commented Jan 20, 2022

For large projects many systems will not handle multithreading correctly. For example if run in a container, the underlying machine may report 32 threads, but the container has access to 2. Or in large projects where there's a memory limit (e.g. our codebase takes 7GB) running with more than 1 thread, even if possible, consumes all available memory.

Whilst threads can be set on the command line, this isn't always possible. My specific example is the way that PhpStorm runs Psalm, but I imagine there are other tools which may run Psalm off a default config or with their own flags, where a user cannot configure what flags are set. Generally it's worth most flags being settable in a config file anyway, so that behaviour can be locked down - for example in our project for local developers, we have added a set of bash scripts to run Psalm with various flags, and all of them force threads=1.

This PR adds "threads" as a config option. If set, and if lower than the detected threads, it will be applied. If unset or higher than detected threads, existing behaviour is applied. Thus this cannot disrupt existing mechanisms of setting threads to 1.

I've been unable to run tests locally due to a weird bug with composer that's failing to unzip the symyony/filesystem plugin, so I will watch the CI build and make adjustments if required.

@orklah
Copy link
Collaborator

orklah commented Jan 20, 2022

Cool! This is a great addition!

if lower than the detected threads, it will be applied

I'm not a fan of that behaviour. I'd prefer if we could use the config instead of the detection and then override that with special cases for --debug or other hardcoded rules

@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022

I'd prefer if we could use the config instead of the detection

My concern about that is that the config can then produce unexpected results. I.e. I run on my CI infrastructure usually which has 16 cores, so I configure it to use 14 of them. However one time I try to run locally and I'm honestly not sure what happens if you try and run 14 threads on a machine that has 4 cores. So then you're having to either run in debug mode (limited to 1 thread when maybe you'd prefer to use 2) or having to edit the config file (and remembering not to commit your change)

I guess we could run it in order of:

  • Things that force to 1 (debug, CI)
  • Command line
  • Config file
  • Detection

However I think we'd still want detection to be used if the config is set, to make sure it doesn't use too many threads. Which is basically back to the implementation I've done here.

@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022

I can see that there's one failing job but not sure how to resolve; I can see the labels section on the PR sidebar but not clear how I add one. Is that something only maintainers can do?

What do you think about how we should handle the order of precedence here?

@orklah
Copy link
Collaborator

orklah commented Jan 20, 2022

As far as I know, the only risk of running on too many threads is loosing performances.

Either because there is no sense trying to run two threads on the same core or because each thread need some memory and that could sometimes use all the memory the machine has (and then it's either killed by the OS or stopped by php's memory limit)

On the other hand, we know that the detection in CI is limited so it's only a question of time before we have a user that want to fix the detection in the other way...

The last job can only fixed by us, don't worry about it ;)

@orklah orklah added the release:feature The PR will be included in 'Features' section of the release notes label Jan 20, 2022
@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022

Cool, so if I modify to this order:

* Things that force to 1 (debug, CI)
* Command line
* Config file
* Detection

Would you be happy to merge?

@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022

Actually I realise that exact order would be a breaking change because currently the command line takes precedence, so I'll do:

  • Command line
  • Things that force to 1 (debug, CI)
  • Config file
  • Detection

@AndrolGenhald
Copy link
Collaborator

AndrolGenhald commented Jan 20, 2022

Actually I realise that exact order would be a breaking change because currently the command line takes precedence, so I'll do:

  • Command line
  • Things that force to 1 (debug, CI)
  • Config file
  • Detection

Just to give my 2 cents, this is the order I would expect, and I wouldn't expect a command line specifying 50 threads to be limited to 8 because that's what Psalm detects. I can imagine a case where Psalm's runtime is bounded by some sort of latency more than by CPU time. Perhaps someone has their code on a network filesystem with unfortunately high latency, it's possible running Psalm with more threads than cores would improve runtime depending on the file access pattern.

I wouldn't expect CI to force 1 thread, but if that's the way it works now I guess I'm fine leaving it.

@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022

Right, functionality updated per the suggestions, and all green. Need anything else from me or lgtm?

@orklah orklah merged commit 2aeaade into vimeo:master Jan 20, 2022
@orklah
Copy link
Collaborator

orklah commented Jan 20, 2022

Thanks! Congrats on your first PR!

@M1ke
Copy link
Contributor Author

M1ke commented Jan 20, 2022 via email

@M1ke M1ke deleted the m1ke/config-threads branch January 26, 2022 13:06
@M1ke
Copy link
Contributor Author

M1ke commented Jan 31, 2022

@orklah is there a way to know when this makes it into a release? I can see that 4.19 covers commits before and after the date this was merged, but it doesn't seem to have been included. Or have I got my merge target wrong for this?

@orklah
Copy link
Collaborator

orklah commented Jan 31, 2022

@M1ke your PR targeted the master branch, who is our development branch for the next major version of Psalm (so Psalm 5).

We should have released a first alpha a few days ago but we're kinda late on that. The real version will be released in a few weeks

Your PR didn't contain BC break so it could theoretically have targeted 4.x...

@M1ke
Copy link
Contributor Author

M1ke commented Jan 31, 2022

@orklah ok thanks; if I were to submit the same commits as a PR onto 4.x would you be happy to merge?

@orklah
Copy link
Collaborator

orklah commented Jan 31, 2022

@M1ke I have no objection, but I also have very little experience with git :p

@weirdan what is the standard procedure in this case?

@M1ke
Copy link
Contributor Author

M1ke commented Jan 31, 2022

I can attempt a rebase - depends what's changed in the underlying code but I don't think I've touched much that seems to be modified heavily in 5.x

If I can't rebase, I'll write the same content onto 4.x and submit that as a PR. Once merged I can open a PR from 4.x to master which will resolve potential merge conflicts.

@weirdan
Copy link
Collaborator

weirdan commented Jan 31, 2022

what is the standard procedure in this case?

Rebase and another PR should do the trick, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:feature The PR will be included in 'Features' section of the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants