[Feature] Improve performance of yarn run #2575

Open
1 of 2 tasks
SimenB opened this issue Mar 7, 2021 · 12 comments
Labels
enhancement New feature or request

Comments

@SimenB

SimenB commented Mar 7, 2021

  • I'd be willing to implement this feature (contributing guide)
  • This feature is important to have in this repository; a contrib plugin wouldn't do

Describe the user story

Running yarn jest has a significant performance overhead in Yarn v2 compared to Yarn v1 (or npx). The following commands are run in a repo installed using Yarn v2 with the node-modules linker. Yarn v1 is the brew-installed version; there is no local version in the repo. Note that neither yarn nor npm was run between benchmarks; they all run against a repo installed using Yarn v2.

$ hyperfine -w 5 'yarn jest --version' 'npx jest --version' 'npm exec jest -- --version'
Benchmark #1: yarn jest --version
  Time (mean ± σ):      1.298 s ±  0.031 s    [User: 1.597 s, System: 0.149 s]
  Range (min … max):    1.270 s …  1.381 s    10 runs

Benchmark #2: npx jest --version
  Time (mean ± σ):     469.4 ms ±   9.7 ms    [User: 397.0 ms, System: 91.6 ms]
  Range (min … max):   451.8 ms … 484.1 ms    10 runs

Benchmark #3: npm exec jest -- --version
  Time (mean ± σ):     466.9 ms ±  14.8 ms    [User: 396.8 ms, System: 90.6 ms]
  Range (min … max):   446.9 ms … 490.0 ms    10 runs

Summary
  'npm exec jest -- --version' ran
    1.01 ± 0.04 times faster than 'npx jest --version'
    2.78 ± 0.11 times faster than 'yarn jest --version'

Then for yarn v1 (by doing rm .yarnrc*)

$ hyperfine -w 5 'yarn jest --version' 'npx jest --version' 'npm exec jest -- --version'
Benchmark #1: yarn jest --version
  Time (mean ± σ):     376.0 ms ±  22.0 ms    [User: 265.8 ms, System: 62.9 ms]
  Range (min … max):   352.1 ms … 423.0 ms    10 runs

Benchmark #2: npx jest --version
  Time (mean ± σ):     480.6 ms ±  52.4 ms    [User: 403.2 ms, System: 93.1 ms]
  Range (min … max):   408.4 ms … 582.2 ms    10 runs

Benchmark #3: npm exec jest -- --version
  Time (mean ± σ):     491.8 ms ±  25.8 ms    [User: 412.4 ms, System: 99.7 ms]
  Range (min … max):   451.5 ms … 535.4 ms    10 runs

Summary
  'yarn jest --version' ran
    1.28 ± 0.16 times faster than 'npx jest --version'
    1.31 ± 0.10 times faster than 'npm exec jest -- --version'

While it doesn't matter too much in the case of Jest (although perceived startup performance is noticeably worse when running tests since there's no output), it's quite annoying when running tools as git hooks.

$ hyperfine -w 5 'yarn lint-staged' 'npx lint-staged' 'npm exec lint-staged'
Benchmark #1: yarn lint-staged
  Time (mean ± σ):      1.473 s ±  0.069 s    [User: 1.780 s, System: 0.192 s]
  Range (min … max):    1.367 s …  1.562 s    10 runs

Benchmark #2: npx lint-staged
  Time (mean ± σ):     540.3 ms ±  32.6 ms    [User: 468.0 ms, System: 114.4 ms]
  Range (min … max):   493.2 ms … 579.3 ms    10 runs

Benchmark #3: npm exec lint-staged
  Time (mean ± σ):     589.5 ms ±  59.5 ms    [User: 509.0 ms, System: 124.3 ms]
  Range (min … max):   540.7 ms … 692.1 ms    10 runs

Summary
  'npx lint-staged' ran
    1.09 ± 0.13 times faster than 'npm exec lint-staged'
    2.73 ± 0.21 times faster than 'yarn lint-staged'

And with yarn v1

$ hyperfine -w 5 'yarn lint-staged' 'npx lint-staged' 'npm exec lint-staged'
Benchmark #1: yarn lint-staged
  Time (mean ± σ):     545.4 ms ±  72.8 ms    [User: 411.8 ms, System: 104.0 ms]
  Range (min … max):   465.7 ms … 695.8 ms    10 runs

Benchmark #2: npx lint-staged
  Time (mean ± σ):     583.3 ms ±  31.2 ms    [User: 503.4 ms, System: 123.1 ms]
  Range (min … max):   543.0 ms … 631.4 ms    10 runs

Benchmark #3: npm exec lint-staged
  Time (mean ± σ):     574.3 ms ±  14.1 ms    [User: 497.9 ms, System: 123.1 ms]
  Range (min … max):   556.7 ms … 596.9 ms    10 runs

Summary
  'yarn lint-staged' ran
    1.05 ± 0.14 times faster than 'npm exec lint-staged'
    1.07 ± 0.15 times faster than 'npx lint-staged'

When rebasing lots of commits, that almost 1.5s of overhead (or 1s in the case of yarn v1) adds up to a lot of time. It's gotten to the point where I pass -n if I'm making lots of commits (or rebasing), as the hooks take way too long to run.
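To put the rebase cost in numbers, here is a rough back-of-the-envelope sketch. The mean times are the lint-staged figures from the benchmarks above; the commit count of 50 is a made-up example, not a number from this thread, and `totalHookTime` is just an illustrative helper.

```typescript
// Mean wall-clock time of one `lint-staged` launch, in seconds,
// copied from the hyperfine output above.
const meanSeconds = {
  yarnV2: 1.473,
  yarnV1: 0.5454,
  npx: 0.5403,
};

// Hypothetical rebase size (assumption, not from the thread).
const commits = 50;

// Total time spent launching the hook once per commit.
function totalHookTime(perRunSeconds: number, runs: number): number {
  return perRunSeconds * runs;
}

for (const [runner, mean] of Object.entries(meanSeconds)) {
  const total = totalHookTime(mean, commits);
  console.log(`${runner}: ${total.toFixed(1)} s over ${commits} commits`);
}
```

Under that assumption the hook launcher alone costs roughly 74 s with Yarn v2 versus roughly 27 s with npx or Yarn v1.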

For reference, I've been running these benchmarking runs in https://github.com/jest-community/eslint-plugin-jest

Describe the solution you'd like

I'd like it to be faster 😀

I expect some overhead of yarn v2 simply due to the fact it needs to first spawn yarn v1, find the config, then load yarn v2. But almost 3x time spent in execution is way more than I expected.

I assume you've already tried to optimize the overhead added by yarn when running scripts, but maybe some checks for "PnP compliance" or whether the lockfile is in sync with all package.jsons can be dropped for the node linker? Possibly via some flag which we can then use when we want, although a flag saying "just do it, don't check" might not be feasible? And a flag saying "give me perf" is probably weird and few people will know about it

Describe the drawbacks of your solution

I don't think there are any drawbacks to improving performance, but I don't know enough about Yarn's innards to comment on technical drawbacks of whatever optimization is applied.

Describe alternatives you've considered

Stop using yarn as binary runner and use npx or npm exec instead, at least in commit hooks and such.

@SimenB SimenB added the enhancement New feature or request label Mar 7, 2021
@arcanis
Member

arcanis commented Mar 7, 2021

Timing is always funny 😄 We got a very similar one two days ago, and some improvements have landed: #2560

However my fix should mostly have an effect on the PnP linker, so perhaps there's something similar that needs to be done for the nm one? We should get a CPU stack sample to have a better idea.

@SimenB
Author

SimenB commented Mar 7, 2021

Hah, perfect 😀

No real difference running from sources, tho.

$ yarn set version from sources && yarn
$ yarn --version
2.4.0-git.20210306.hash-fea486ce
$ hyperfine -w 5 'yarn lint-staged' 'npx lint-staged' 'npm exec lint-staged'
Benchmark #1: yarn lint-staged
  Time (mean ± σ):      1.412 s ±  0.063 s    [User: 1.705 s, System: 0.200 s]
  Range (min … max):    1.353 s …  1.526 s    10 runs

Benchmark #2: npx lint-staged
  Time (mean ± σ):     627.2 ms ± 122.9 ms    [User: 520.5 ms, System: 133.5 ms]
  Range (min … max):   561.3 ms … 969.8 ms    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #3: npm exec lint-staged
  Time (mean ± σ):     549.7 ms ±  29.4 ms    [User: 476.2 ms, System: 120.0 ms]
  Range (min … max):   518.9 ms … 613.5 ms    10 runs

Summary
  'npm exec lint-staged' ran
    1.14 ± 0.23 times faster than 'npx lint-staged'
    2.57 ± 0.18 times faster than 'yarn lint-staged'

@SimenB
Author

SimenB commented Mar 8, 2021

Fresh run after the 2 PRs (same sequence of commands as above)

Benchmark #1: yarn lint-staged
  Time (mean ± σ):     995.2 ms ±  30.3 ms    [User: 1.054 s, System: 0.150 s]
  Range (min … max):   966.7 ms … 1070.2 ms    10 runs

Benchmark #2: npx lint-staged
  Time (mean ± σ):     525.0 ms ±  15.9 ms    [User: 458.8 ms, System: 108.9 ms]
  Range (min … max):   512.2 ms … 557.1 ms    10 runs

Benchmark #3: npm exec lint-staged
  Time (mean ± σ):     537.8 ms ±  31.8 ms    [User: 466.9 ms, System: 111.7 ms]
  Range (min … max):   508.3 ms … 605.8 ms    10 runs

Summary
  'npx lint-staged' ran
    1.02 ± 0.07 times faster than 'npm exec lint-staged'
    1.90 ± 0.08 times faster than 'yarn lint-staged'

So it has definitely improved! 👏 yarn now takes 1.9x as long as npx and npm exec, instead of the 2.6-2.8x I saw before.

@andreialecu
Contributor

andreialecu commented Mar 11, 2021

Somewhat related discussion: #2117

A big part of the startup overhead is node having to parse the entire yarn bundle and launch the WebAssembly libzip implementation.

If nodejs/node#36812 were ever implemented in node, performance in such cases should improve hugely. Yarn could then be distributed as separate .js files (within a packaged application), and only a subset of it would need to be parsed for common commands, such as yarn run.

@merceyz merceyz changed the title [Feature] Improve performance of running local binaries [Feature] Improve performance of yarn run Oct 12, 2021
@pastelsky

pastelsky commented Oct 12, 2021

We hit this too. Unfortunately, with ~25 packages, the overhead of running workspaces foreach is often 15-20 seconds just for an echo. This overhead sometimes exceeds the time needed for the command being run itself, and if you're running multiple foreachs, this quickly adds up. For large projects, this becomes prohibitively expensive.

We worked around this by switching to a different tool, https://www.npmjs.com/package/workspaces-run, which isn't great because it comes with its own syntax.

@TrySound

We run node_modules/.bin directly to work around this.
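For anyone wanting to script this workaround, a minimal sketch of the lookup could look like the following. `findLocalBin` is a hypothetical helper, not part of Yarn or npm; it assumes absolute POSIX-style paths and a node-modules install (with the PnP linker there is no node_modules/.bin folder to find). The `exists` predicate is injected so the walk-up logic does not depend on a particular filesystem API.

```typescript
// Hypothetical helper: locate a locally installed binary by checking
// node_modules/.bin in startDir and then in each parent directory.
function findLocalBin(
  startDir: string,
  binName: string,
  exists: (candidate: string) => boolean,
): string | null {
  // Split the absolute path into its non-empty segments.
  const segments = startDir.split("/").filter((s) => s.length > 0);

  // depth === segments.length is startDir itself; depth === 0 is the root.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = "/" + segments.slice(0, depth).join("/");
    const prefix = dir === "/" ? "" : dir;
    const candidate = prefix + "/node_modules/.bin/" + binName;
    if (exists(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Example with an in-memory "filesystem" (paths are made up).
const present = new Set(["/repo/node_modules/.bin/lint-staged"]);
console.log(findLocalBin("/repo/packages/a", "lint-staged", (p) => present.has(p)));
// prints "/repo/node_modules/.bin/lint-staged"
```

In a real Node script, `exists` could be `fs.existsSync`, and the returned path can be spawned directly, which skips the `yarn run` startup cost discussed in this thread.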

@wdfinch

wdfinch commented Nov 2, 2021

This is so bad for me that if I'm running a large number of tests with yarn run, it will crash most of the apps running on my computer.

@sotnikov-link

This comment was marked as duplicate.

@arcanis
Member

arcanis commented Aug 18, 2022

@sotnikov-link I marked your comment as duplicate because it was taking a bunch of vertical space for things we already know.

Generally, comparing shell performance with yarn run performance isn't pertinent: yarn run has to go through node (which takes a couple hundred ms just spawning the script), resolve yarnPath (which requires I/O and spawning an additional Node process), retrieve the project workspaces (which requires some I/O), find their dependencies, and set up a temporary environment to put in your $PATH. All this is costly, so it'll always have an overhead over direct shell commands.

Still, we really want to improve yarn run performance and bring it closer to npx, so eventually I think it'll get better, but at the moment we're not aware of any low-hanging fruit.

@sotnikov-link

@arcanis I wanted to show people the difference between yarn run and the shell using a simple script without any dependencies.

Since then, I tried turbo, and it works faster for me than yarn workspace x run y. Maybe someone will find this solution useful.

package.json

turbo.json

{
  "$schema": "https://turborepo.org/schema.json",
  "pipeline": {
    "clean": {
      "inputs": ["dist/**", ".next/**", "storybook-static/**"]
    },
    "prepare": {
      "dependsOn": ["^prepare"],
      "inputs": ["**/*.tsx"],
      "outputs": ["dist/**", ".next/**", "storybook-static/**"]
    },
    "start": {
      "dependsOn": ["prepare"]
    }
  }
}

@me4502
Contributor

me4502 commented Apr 20, 2023

A use case where I've run into this issue pretty substantially is scripts running nested scripts within a large monorepo. The root package.json has a bunch of scripts that delegate to various workspaces or other helper scripts, sometimes leading to 3-4 yarn <script> calls before reaching the actual binary. There are also a few cases where one script makes multiple script calls, e.g. yarn <script> && yarn <script>. This compounds the time taken, as each call spawns an entire new instance of node, parses yarn, sets up the environment, etc.

Would one possible partial solution/workaround to scripts running scripts be to reuse the existing yarn process if it can tell it's just running another script? Like "if script starts with yarn, and the next argument matches a script in this file, run that script". It doesn't solve all of these cases but does resolve some. I'm not familiar enough with yarn to know if this is possible, but it'd potentially cut down on some overhead.

From a short test, even a single nested call is leading to an echo "test" call taking 1.68 seconds.

time yarn test-a takes 1.68 seconds on my machine with the following script setup in a large workspace setup, despite not even crossing between workspaces.

"scripts": {
        "test-a": "yarn test-b",
        "test-b": "echo \"test\""
}

Edit: seems there's actually an issue tracking this (#3732)
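The "if script starts with yarn, and the next argument matches a script in this file" idea can be sketched roughly as below. This is only an illustration of the rule as described in the comment, not how Yarn works or a proposed patch: `inlineNestedScript` is a hypothetical function, it only handles a command that is exactly `yarn <name>` or `yarn run <name>`, and it ignores any differences in environment setup between a nested call and an inlined one.

```typescript
type Scripts = Record<string, string>;

// Hypothetical sketch: follow `yarn <name>` references within one manifest
// until reaching a command that is not a plain nested script call.
function inlineNestedScript(
  scripts: Scripts,
  name: string,
  seen: Set<string> = new Set(),
): string {
  const command = scripts[name];
  if (command === undefined) {
    throw new Error(`Unknown script: ${name}`);
  }
  if (seen.has(name)) {
    throw new Error(`Circular script reference: ${name}`);
  }
  seen.add(name);

  // Only the simplest form is matched: no extra arguments, no `&&`.
  const match = /^yarn\s+(?:run\s+)?([A-Za-z0-9:_-]+)$/.exec(command.trim());
  if (match === null) {
    return command;
  }
  const target: string | undefined = match[1];
  if (target === undefined || !Object.prototype.hasOwnProperty.call(scripts, target)) {
    // Not a script in this file (e.g. a binary), so leave it alone.
    return command;
  }
  return inlineNestedScript(scripts, target, seen);
}

// The script setup from the comment above.
const scripts: Scripts = {
  "test-a": "yarn test-b",
  "test-b": 'echo "test"',
};
console.log(inlineNestedScript(scripts, "test-a")); // prints: echo "test"
```

Commands such as `yarn <script> && yarn <script>` are returned unchanged here, which matches the comment's caveat that this would resolve only some of the cases.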

@arcanis
Member

arcanis commented Mar 27, 2024

I optimized yarn run a bit by stripping useless module evaluations; it's a tiny bit better, but not amazing either: #6188

If someone wants to help improve that further!

8 participants