Speed up ESLint #16962

Closed
mdjermanovic opened this issue Mar 5, 2023 · 11 comments
Labels

  • accepted: There is consensus among the team that this change meets the criteria for inclusion
  • archived due to age: This issue has been archived; please open a new issue for any further discussion
  • core: Relates to ESLint's core APIs and features

Comments

@mdjermanovic
Member

This is an issue to define tasks to improve ESLint performance per the recommendations from the "Speeding up the JavaScript ecosystem - eslint" blog post:

https://marvinh.dev/blog/speeding-up-javascript-ecosystem-part-3/

I was able to extract five recommendations from the blog post that relate to eslint core or eslint dependencies. Please add more if I missed something.

  1. Token store's utils.search should use a binary search algorithm.

    We could implement our own, or find a library. In fact, this used to be a binary search before Chore: Remove lodash #14287. The performance impact of switching to Array#findIndex was discussed in Chore: Remove lodash #14287 (comment), but at the time performance tests did not show significant differences. Regardless, I think we should reintroduce binary search here.

  2. Refactor the code to avoid calling the mentioned utils.search / instantiating BackwardTokenCommentCursor millions of times.

    This suggestion requires further analysis. I'm not sure if the premise that we can avoid this because "we should know exactly where we are" applies here because BackwardTokenCommentCursor is used by methods that take an arbitrary node/token, such as SourceCode#getTokensBefore.

  3. Several points on improving esquery performance.

    This has already been implemented in Optimize hot code paths estools/esquery#134. Our Single File and Multi Files performance tests show ~8% overall performance improvement. 🚀

  4. Fast path for simple selectors ("Bailing out early" section in the blog post).

    The suggestion is to handle the simplest selectors in the form of "NodeType" manually, without using esquery. We could definitely give this a try.

  5. "Rethinking selectors" section in the blog post.

    I'm not sure what the recommendation is here. Is it to drop declarative selectors in favor of JS functions that would be provided by rules?
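
For reference, a from-scratch binary search along the lines of item 1 might look like the sketch below. The token shape (`range: [start, end]`) matches ESLint ASTs, but this is only an illustration, not the token store's actual code.

```javascript
// Sketch: locate the index of the first token whose range starts at or
// after `offset`, using binary search instead of a linear scan such as
// Array.prototype.findIndex. Tokens are assumed sorted by range start,
// as in an ESLint SourceCode token list.
function search(tokens, offset) {
    let lo = 0;
    let hi = tokens.length;

    while (lo < hi) {
        const mid = (lo + hi) >>> 1; // unsigned shift avoids overflow on huge arrays

        if (tokens[mid].range[0] < offset) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    // Unlike findIndex, this returns tokens.length (not -1) when no
    // token starts at or after the offset.
    return lo;
}
```

For example, with tokens starting at offsets 0, 4, and 8, `search(tokens, 5)` returns 2, the same index a linear `findIndex` scan would produce, but in O(log n) comparisons.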

@mdjermanovic mdjermanovic added the core Relates to ESLint's core APIs and features label Mar 5, 2023
@kecrily
Member

kecrily commented Mar 6, 2023

I would like to invite the author of the article to participate in the discussion, ping @marvinhagemeister

@marvinhagemeister

Thanks for the ping. Most of the improvements relating to esquery can indeed be found in the linked PR estools/esquery#134 .

  1. Switching to a binary search is an easy win, albeit a small one.
  2. The problem with BackwardTokenCommentCursor is that it always starts the search from the beginning, if I recall correctly, similar to utils.search. Given that many files are composed of many tokens, that leads to a lot of misses.
  3. Yup, most things were included in that PR.
  4. I think adding a special path here for NodeType selectors is still worth it. Those were the most common ones I encountered.
  5. The main recommendation in that section of the blog post is to drop esquery in favor of JS functions. It's arguably a strong breaking change, so I'm not sure how feasible that is.

@nzakas
Member

nzakas commented Mar 7, 2023

Here's what I think the task list is, so we can just check these off as we go:

  • Update TokenStore utils.search() to use a binary search algorithm
  • Try to avoid BackwardTokenCommentCursor and utils.search()
  • Bail out early when matching simple node types instead of using esquery in NodeEventGenerator

The last recommendation, to drop selector syntax, would be a significant breaking change and not one we could introduce any time soon. I think a better approach is likely to investigate creating a tool that can take a rule that uses query strings and regenerate it so that it doesn't use query strings...or maybe something simpler like a tool that you can pass a bunch of query strings to and it will generate a rule scaffold for you. In any event, I think going the way of creating a tool to generate more-performant JS code instead of having selectors in the final JS would be a much more palatable choice.
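
As a rough illustration of the "bail out early" item in the task list above, a matcher factory could special-case selectors that are just a node type name. The regex and the `createMatcher`/`fallbackMatch` names are made up for this sketch; ESLint's NodeEventGenerator and esquery APIs differ.

```javascript
// Sketch: handle plain node-type selectors (e.g. "Identifier") with a
// string comparison, falling back to a full selector engine otherwise.
// A selector that is a bare ESTree node type name matches this pattern.
const SIMPLE_NODE_TYPE = /^[A-Z][A-Za-z]*$/u;

function createMatcher(selector, fallbackMatch) {
    if (SIMPLE_NODE_TYPE.test(selector)) {
        // Fast path: no selector parsing or engine dispatch needed.
        return node => node.type === selector;
    }

    // Slow path: delegate to a full engine such as esquery.
    return node => fallbackMatch(node, selector);
}
```

With such a split, the common case of a bare node-type selector reduces per-node work to a single property comparison.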

@mdjermanovic mdjermanovic added the accepted There is consensus among the team that this change meets the criteria for inclusion label Mar 8, 2023
nzakas added a commit that referenced this issue Mar 24, 2023
@nzakas
Member

nzakas commented Mar 24, 2023

I took a stab at the fast path for query selectors and didn't see any significant performance improvement, either in our standard perf test or just running ESLint on our own codebase.
#17019

@sam3k
Contributor

sam3k commented Mar 29, 2023

> Thanks for the ping. Most of the improvements relating to esquery can indeed be found in the linked PR estools/esquery#134 .
>
>   1. Switching to a binary search is an easy win, albeit a small one.
>   2. The problem with BackwardTokenCommentCursor is that it always starts the search from the beginning if I recall correctly, similar to utils.search. Given that many files are composed of many tokens, that leads to a lot of misses
>   3. Yup, most things were included in that PR
>   4. I think adding a special path here for NodeType selectors is still worth it. Those were the most common ones I encountered
>   5. Main recommendation there is to drop esquery in favor of JS functions in that section of the blog post. It's arguably a strong breaking change, so not sure how feasible that is.

These are some great performance improvements:

We microbenchmarked the effect of each of these changes, but to get a better idea of their real-life impact I tested the eslint-plugin-unicorn ESLint plugin on the codebase at my workplace. The plugin relies heavily on selectors. Linting times were as follows:

Before enabling eslint-plugin-unicorn: 14s (TypeScript codebase with type-aware linting rules enabled, and esquery optimizations didn't have any real impact here)

After enabling eslint-plugin-unicorn's recommended config without these optimizations: 23s

After enabling eslint-plugin-unicorn's recommended config with these optimizations: 18s

Cumulatively these optimizations cut down the overhead added by eslint-plugin-unicorn by more than 50%, at least in our setup. The biggest bang for the buck came from hoisting constants (2s reduction) and avoiding for-of transpilation (2s reduction).
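
The "hoisting constants" win mentioned above is the classic pattern of moving allocations out of hot per-node callbacks. A generic before/after sketch (not actual eslint-plugin-unicorn or esquery code):

```javascript
// Before: the Set is rebuilt on every call, so a rule visiting
// thousands of nodes pays the allocation cost thousands of times.
function isBannedSlow(node) {
    const banned = new Set(["eval", "arguments"]);
    return banned.has(node.name);
}

// After: the constant is hoisted to module scope and built once,
// then shared across all calls.
const BANNED = new Set(["eval", "arguments"]);

function isBannedFast(node) {
    return BANNED.has(node.name);
}
```

Both functions return the same results; only the allocation behavior differs, which is exactly the kind of change that shows up in hot paths rather than in microbenchmarks of a single call.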

@fasttime
Member

fasttime commented Apr 7, 2023

I went ahead and reimplemented utils.search with a binary search algorithm in #17066. The unit tests in tests/lib/source-code/token-store.js only use 10 tokens, which is not enough to make a noticeable difference in performance, and also not a realistic scenario for a typical use case. But in the perf tests, the overall impact of the function seems too negligible to be measured. Any ideas how this could be tested?

@mdjermanovic
Member Author

> I went ahead and reimplemented utils.search with a binary search algorithm in #17066. The unit tests in tests/lib/source-code/token-store.js only use 10 tokens, which is not enough to make a noticeable difference in performance, and also not a realistic scenario for a typical use case. But in the perf tests, the overall impact of the function seems too negligible to be measured. Any ideas how this could be tested?

That part of the blog post mentions JSDoc rules that ESLint uses when linting its own codebase (npm run lint). Maybe you could try this: set the TIMING=all env variable, then compare the times for jsdoc/* rules.

@fasttime
Member

fasttime commented Apr 7, 2023

> That part of the blog post mentions JSDoc rules that ESLint uses when linting its own codebase (npm run lint). Maybe you could try this: set TIMING=all env variable, then compare times for jsdoc/* rules.

Thanks @mdjermanovic, this method shows a clear performance improvement for jsdoc/* rules when the binary search is used. I've updated my PR to include these results.

@kkimdev

kkimdev commented Sep 7, 2023

I'm not familiar with the internals of ESLint, but I'm just wondering: can we execute different rules in parallel on different cores (or processes)? If this is feasible, it seems like easy, low-hanging fruit for speeding up ESLint.

@nzakas
Member

nzakas commented Sep 7, 2023

@kkimdev there's a long discussion about what can potentially be parallelized here:
#3565

Doing it per rule would likely slow things down, because a run sometimes involves thousands of rules and most of the time we're dealing with 4-8 cores.

@nzakas
Member

nzakas commented Sep 7, 2023

Closing this issue as the planned work has been completed.

@nzakas nzakas closed this as completed Sep 7, 2023
@eslint-github-bot eslint-github-bot bot locked and limited conversation to collaborators Mar 6, 2024
@eslint-github-bot eslint-github-bot bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Mar 6, 2024