Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Binary search in token store utils.search #17066

Merged
merged 3 commits into from Apr 11, 2023

Conversation

fasttime
Copy link
Member

@fasttime fasttime commented Apr 7, 2023

Prerequisites checklist

What is the purpose of this pull request? (put an "X" next to an item)

[ ] Documentation update
[ ] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[ ] Add autofix to a rule
[ ] Add a CLI option
[ ] Add something to the core
[x] Other, please explain:

Improve performance

What changes did you make? (Give an overview)

Changed token store's utils.search function logic to use binary search. This reduces the time complexity of the function from linear to logarithmic time, and results in a performance gain for sufficiently large arrays of tokens.

References #16962

Measuring performance

To measure the performance improvement, I considered the time spent by jsdoc/* rules while running npm run lint as suggested in #16962 (comment). On a zsh terminal, I run the command

TIMING=all npm run lint | grep '^jsdoc/' > output.txt

and used a script to calculate the sum of the time and the total relative time usage from the generated text file.

Another useful measure I discovered is the time spent by Node.js in evaluating BackwardTokenCommentCursor, which can be captured using the rudimentary built-in profiler. BackwardTokenCommentCursor makes heavy use of the utils.search function, so I used this value to validate (double-check) the results I was getting with the previous command.

My script:

#!/bin/sh
node --prof ./bin/eslint.js . --ignore-pattern "docs/**"
node --prof-process isolate-*-v8.log | grep '%  LazyCompile: \*BackwardTokenCommentCursor'
rm isolate-*-v8.log

And an example output:

   1146    3.9%    4.0%  LazyCompile: *BackwardTokenCommentCursor /Users/noone/GitHub/eslint/lib/source-code/token-store/backward-token-comment-cursor.js:31:16
Preliminary work

I run the test on three variations of the function utils.search: the original code with findIndex, my binary search implementation and the same binary search with an additional optimization that removes a conditional jump in 50% of the iterations on average, which Wikipedia calls alternative procedure.

The code and the respective timing details are reported below.
It seems that the binary search does indeed reduce the time spent by jsdoc/* rules, although by a small amount. On the other hand, the extra optimization (alternative procedure) does not seem to improve performance measurably.

With findIndex
JSDoc rule stats
Rule                                          | Time (ms) | Relative

jsdoc/check-access                            |  1144.882 |     4.6%
jsdoc/valid-types                             |   762.166 |     3.0%
jsdoc/check-types                             |   693.700 |     2.8%
jsdoc/check-tag-names                         |   554.857 |     2.2%
jsdoc/check-alignment                         |   435.005 |     1.7%
jsdoc/check-values                            |   433.129 |     1.7%
jsdoc/tag-lines                               |   421.243 |     1.7%
jsdoc/check-property-names                    |   419.610 |     1.7%
jsdoc/check-line-alignment                    |   391.637 |     1.6%
jsdoc/require-hyphen-before-param-description |   385.190 |     1.5%
jsdoc/no-multi-asterisks                      |   384.382 |     1.5%
jsdoc/require-asterisk-prefix                 |   378.696 |     1.5%
jsdoc/multiline-blocks                        |   378.303 |     1.5%
jsdoc/empty-tags                              |   364.888 |     1.5%
jsdoc/require-property-name                   |   360.920 |     1.4%
jsdoc/newline-after-description               |   357.552 |     1.4%
jsdoc/require-property                        |   356.765 |     1.4%
jsdoc/require-property-type                   |   355.029 |     1.4%
jsdoc/check-syntax                            |   352.481 |     1.4%
jsdoc/require-property-description            |   350.985 |     1.4%
jsdoc/check-param-names                       |   228.035 |     0.9%
jsdoc/require-param                           |   191.441 |     0.8%
jsdoc/require-description                     |   168.819 |     0.7%
jsdoc/require-returns-check                   |   107.597 |     0.4%
jsdoc/implements-on-classes                   |    88.855 |     0.4%
jsdoc/require-throws                          |    88.207 |     0.4%
jsdoc/require-returns                         |    87.368 |     0.3%
jsdoc/require-yields-check                    |    79.495 |     0.3%
jsdoc/require-param-description               |    69.648 |     0.3%
jsdoc/require-param-type                      |    61.160 |     0.2%
jsdoc/require-returns-description             |    61.102 |     0.2%
jsdoc/require-param-name                      |    52.961 |     0.2%
jsdoc/require-returns-type                    |    52.606 |     0.2%
jsdoc/no-bad-blocks                           |    44.459 |     0.2%
jsdoc/require-jsdoc                           |    40.598 |     0.2%

TOTAL                                         | 10703.771 |    45.6%
With binary search
Code
exports.search = function search(tokens, location) {
    for (let minIndex = 0, maxIndex = tokens.length - 1; minIndex <= maxIndex;) {
        const index = Math.trunc((minIndex + maxIndex) / 2);
        const token = tokens[index];

        if (location <= getStartLocation(token)) {
            if (index === minIndex) {
                return index;
            }
            maxIndex = index;
        } else {
            minIndex = index + 1;
        }
    }
    return tokens.length;
};
JSDoc rule stats
Rule                                          | Time (ms) | Relative

jsdoc/check-access                            |  1009.439 |     4.1%
jsdoc/valid-types                             |   733.300 |     3.0%
jsdoc/check-types                             |   628.300 |     2.6%
jsdoc/check-tag-names                         |   470.814 |     1.9%
jsdoc/check-values                            |   442.890 |     1.8%
jsdoc/check-alignment                         |   423.225 |     1.7%
jsdoc/tag-lines                               |   393.989 |     1.6%
jsdoc/check-property-names                    |   388.035 |     1.6%
jsdoc/check-line-alignment                    |   368.785 |     1.5%
jsdoc/require-hyphen-before-param-description |   359.410 |     1.5%
jsdoc/empty-tags                              |   348.269 |     1.4%
jsdoc/require-asterisk-prefix                 |   346.203 |     1.4%
jsdoc/multiline-blocks                        |   346.189 |     1.4%
jsdoc/check-syntax                            |   343.351 |     1.4%
jsdoc/newline-after-description               |   340.988 |     1.4%
jsdoc/no-multi-asterisks                      |   339.377 |     1.4%
jsdoc/require-property-type                   |   338.138 |     1.4%
jsdoc/require-property-description            |   323.686 |     1.3%
jsdoc/require-property                        |   321.812 |     1.3%
jsdoc/require-property-name                   |   320.590 |     1.3%
jsdoc/check-param-names                       |   252.819 |     1.0%
jsdoc/require-param                           |   211.383 |     0.9%
jsdoc/require-description                     |   140.157 |     0.6%
jsdoc/implements-on-classes                   |   109.975 |     0.4%
jsdoc/require-returns-check                   |    88.907 |     0.4%
jsdoc/require-returns                         |    86.985 |     0.4%
jsdoc/require-throws                          |    86.723 |     0.4%
jsdoc/require-yields-check                    |    71.648 |     0.3%
jsdoc/require-param-description               |    69.968 |     0.3%
jsdoc/require-param-type                      |    62.769 |     0.3%
jsdoc/require-returns-description             |    60.125 |     0.2%
jsdoc/require-param-name                      |    55.649 |     0.2%
jsdoc/require-returns-type                    |    55.416 |     0.2%
jsdoc/no-bad-blocks                           |    55.093 |     0.2%
jsdoc/require-jsdoc                           |    38.887 |     0.2%

TOTAL                                         | 10033.294 |    41.0%
With binary search (alternative procedure)
Code
exports.search = function search(tokens, location) {
    let minIndex = 0,
        maxIndex = tokens.length - 1;

    while (minIndex < maxIndex) {
        const index = Math.trunc((minIndex + maxIndex) / 2);
        const token = tokens[index];

        if (location <= getStartLocation(token)) {
            if (index === minIndex) {
                return index;
            }
            maxIndex = index;
        } else {
            minIndex = index + 1;
        }
    }
    if (minIndex === maxIndex) {
        const token = tokens[minIndex];

        if (location <= getStartLocation(token)) {
            return minIndex;
        }
    }
    return tokens.length;
};
JSDoc rule stats
Rule                                          | Time (ms) | Relative

jsdoc/check-access                            |  1070.535 |     4.4%
jsdoc/valid-types                             |   782.850 |     3.2%
jsdoc/check-types                             |   636.658 |     2.6%
jsdoc/check-tag-names                         |   479.428 |     2.0%
jsdoc/check-values                            |   425.001 |     1.7%
jsdoc/check-alignment                         |   405.592 |     1.7%
jsdoc/tag-lines                               |   398.764 |     1.6%
jsdoc/check-property-names                    |   391.845 |     1.6%
jsdoc/check-line-alignment                    |   374.767 |     1.5%
jsdoc/require-hyphen-before-param-description |   360.293 |     1.5%
jsdoc/no-multi-asterisks                      |   353.639 |     1.4%
jsdoc/empty-tags                              |   344.749 |     1.4%
jsdoc/require-asterisk-prefix                 |   339.404 |     1.4%
jsdoc/multiline-blocks                        |   331.412 |     1.3%
jsdoc/require-property                        |   327.072 |     1.3%
jsdoc/require-property-type                   |   320.882 |     1.3%
jsdoc/newline-after-description               |   319.103 |     1.3%
jsdoc/check-syntax                            |   319.083 |     1.3%
jsdoc/require-property-description            |   316.157 |     1.3%
jsdoc/require-property-name                   |   315.910 |     1.3%
jsdoc/check-param-names                       |   226.817 |     0.9%
jsdoc/require-param                           |   193.750 |     0.8%
jsdoc/require-description                     |   151.536 |     0.6%
jsdoc/require-throws                          |   106.919 |     0.4%
jsdoc/implements-on-classes                   |   103.705 |     0.4%
jsdoc/require-returns                         |    89.655 |     0.4%
jsdoc/require-returns-check                   |    86.759 |     0.4%
jsdoc/require-yields-check                    |    71.148 |     0.3%
jsdoc/require-param-description               |    67.077 |     0.3%
jsdoc/require-returns-type                    |    65.888 |     0.3%
jsdoc/require-param-type                      |    64.034 |     0.3%
jsdoc/require-param-name                      |    63.201 |     0.3%
jsdoc/require-returns-description             |    60.999 |     0.2%
jsdoc/no-bad-blocks                           |    47.573 |     0.2%
jsdoc/require-jsdoc                           |    38.881 |     0.2%
                                               
TOTAL                                         | 10051.086 |    41.1%
Update

After the first performance tests showed positive results I did a couple more changes to the original code. The first change consist in replacing a call to Math.trunc() with a bitwise-or operation (| 0) to round a value to an integer. This bitwise operation was part of an earlier implementation (still found in the first commit), but later we switched to Math.trunc() for clarity. Unfortunately, this method showed a negative impact on performance (as did other Math functions), so in the end we decided to switch back to a bitwise operation and to add a comment to clarify the intent.

Another change I did was inlining the only call to the function getSourceLocation. This function was previously required as a callback for Array#findIndex(), and earlier for Lodash sortedIndexBy(), but now that we control the search implementation, having this separate function seems unnecessary, as it only contains minimal logic, is not reused and does not help to clarify the code. It could also reduce performance because of the overhead of the additional invocation, although the effect would be negligible and difficult to quantify in practice.

My tests suggest that the new binary search reduces the linting time for JSDoc rules of 6~11%, and the overall execution time for npm run lint is reduced by up to 3%.

This improvement is consistent with a reduced time spent by BackwardTokenCommentCursor as captured by the Node.js profiler (lower is better):

Before the change:

   1146    3.9%    4.0%  LazyCompile: *BackwardTokenCommentCursor /Users/noone/GitHub/eslint/lib/source-code/token-store/backward-token-comment-cursor.js:31:16

After the change:

    331    1.2%    1.2%  LazyCompile: *BackwardTokenCommentCursor /Users/noone/GitHub/eslint/lib/source-code/token-store/backward-token-comment-cursor.js:31:16

Is there anything you'd like reviewers to focus on?

I'd be interested to know if others are getting similar timing stats.

@eslint-github-bot eslint-github-bot bot added the chore This change is not user-facing label Apr 7, 2023
@netlify
Copy link

netlify bot commented Apr 7, 2023

Deploy Preview for docs-eslint ready!

Name Link
🔨 Latest commit 55dbf53
🔍 Latest deploy log https://app.netlify.com/sites/docs-eslint/deploys/643404e146bf460008e6ff85
😎 Deploy Preview https://deploy-preview-17066--docs-eslint.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@fasttime fasttime mentioned this pull request Apr 7, 2023
@fasttime fasttime marked this pull request as ready for review April 11, 2023 09:17
@fasttime fasttime requested a review from a team as a code owner April 11, 2023 09:17
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Really nice work! Leaving open for @mdjermanovic to verify his feedback was addressed.

Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mdjermanovic mdjermanovic merged commit 9d1b8fc into main Apr 11, 2023
21 checks passed
@mdjermanovic mdjermanovic deleted the token-store-binary-search branch April 11, 2023 21:44
crapStone pushed a commit to Calciumdibromid/CaBr2 that referenced this pull request Apr 27, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [eslint](https://eslint.org) ([source](https://github.com/eslint/eslint)) | devDependencies | minor | [`8.37.0` -> `8.39.0`](https://renovatebot.com/diffs/npm/eslint/8.37.0/8.39.0) |

---

### Release Notes

<details>
<summary>eslint/eslint</summary>

### [`v8.39.0`](https://github.com/eslint/eslint/releases/tag/v8.39.0)

[Compare Source](eslint/eslint@v8.38.0...v8.39.0)

#### Features

-   [`3f7af9f`](eslint/eslint@3f7af9f) feat: Implement `SourceCode#markVariableAsUsed()` ([#&#8203;17086](eslint/eslint#17086)) (Nicholas C. Zakas)

#### Documentation

-   [`6987dc5`](eslint/eslint@6987dc5) docs: Fix formatting in Custom Rules docs ([#&#8203;17097](eslint/eslint#17097)) (Milos Djermanovic)
-   [`4ee92e5`](eslint/eslint@4ee92e5) docs: Update README (GitHub Actions Bot)
-   [`d8e9887`](eslint/eslint@d8e9887) docs: Custom Rules cleanup/expansion ([#&#8203;16906](eslint/eslint#16906)) (Ben Perlmutter)
-   [`1fea279`](eslint/eslint@1fea279) docs: Clarify how to add to tsc agenda ([#&#8203;17084](eslint/eslint#17084)) (Nicholas C. Zakas)
-   [`970ef1c`](eslint/eslint@970ef1c) docs: Update triage board location (Nicholas C. Zakas)
-   [`6d8bffd`](eslint/eslint@6d8bffd) docs: Update README (GitHub Actions Bot)

#### Chores

-   [`60a6f26`](eslint/eslint@60a6f26) chore: upgrade [@&#8203;eslint/js](https://github.com/eslint/js)[@&#8203;8](https://github.com/8).39.0 ([#&#8203;17102](eslint/eslint#17102)) (Milos Djermanovic)
-   [`d5ba5c0`](eslint/eslint@d5ba5c0) chore: package.json update for [@&#8203;eslint/js](https://github.com/eslint/js) release (ESLint Jenkins)
-   [`f57eff2`](eslint/eslint@f57eff2) ci: run tests on Node.js v20 ([#&#8203;17093](eslint/eslint#17093)) (Nitin Kumar)
-   [`9d1b8fc`](eslint/eslint@9d1b8fc) perf: Binary search in token store `utils.search` ([#&#8203;17066](eslint/eslint#17066)) (Francesco Trotta)
-   [`07a4435`](eslint/eslint@07a4435) chore: Add request for minimal repro to bug report ([#&#8203;17081](eslint/eslint#17081)) (Nicholas C. Zakas)
-   [`eac4943`](eslint/eslint@eac4943) refactor: remove unnecessary use of `SourceCode#getAncestors` in rules ([#&#8203;17075](eslint/eslint#17075)) (Milos Djermanovic)
-   [`0a7b60a`](eslint/eslint@0a7b60a) chore: update description of `SourceCode#getDeclaredVariables` ([#&#8203;17072](eslint/eslint#17072)) (Milos Djermanovic)
-   [`6e2df71`](eslint/eslint@6e2df71) chore: remove unnecessary references to the LICENSE file ([#&#8203;17071](eslint/eslint#17071)) (Milos Djermanovic)

### [`v8.38.0`](https://github.com/eslint/eslint/releases/tag/v8.38.0)

[Compare Source](eslint/eslint@v8.37.0...v8.38.0)

#### Features

-   [`a1d561d`](eslint/eslint@a1d561d) feat: Move getDeclaredVariables and getAncestors to SourceCode ([#&#8203;17059](eslint/eslint#17059)) (Nicholas C. Zakas)

#### Bug Fixes

-   [`1c1ece2`](eslint/eslint@1c1ece2) fix: do not report on `RegExp(...args)` in `require-unicode-regexp` ([#&#8203;17037](eslint/eslint#17037)) (Francesco Trotta)

#### Documentation

-   [`7162d34`](eslint/eslint@7162d34) docs: Mention new config system is complete ([#&#8203;17068](eslint/eslint#17068)) (Nicholas C. Zakas)
-   [`0fd6bb2`](eslint/eslint@0fd6bb2) docs: Update README (GitHub Actions Bot)
-   [`c83531c`](eslint/eslint@c83531c) docs: Update/remove external links, eg. point to `eslint-community` ([#&#8203;17061](eslint/eslint#17061)) (Pelle Wessman)
-   [`a3aa6f5`](eslint/eslint@a3aa6f5) docs: Clarify `no-div-regex` rule docs ([#&#8203;17051](eslint/eslint#17051)) (Francesco Trotta)
-   [`b0f11cf`](eslint/eslint@b0f11cf) docs: Update README (GitHub Actions Bot)
-   [`da8d52a`](eslint/eslint@da8d52a) docs: Update the second object instance for the "no-new" rule ([#&#8203;17020](eslint/eslint#17020)) (Ahmadou Waly NDIAYE)
-   [`518130a`](eslint/eslint@518130a) docs: switch language based on current path ([#&#8203;16687](eslint/eslint#16687)) (Percy Ma)
-   [`24206c4`](eslint/eslint@24206c4) docs: Update README (GitHub Actions Bot)

#### Chores

-   [`59ed060`](eslint/eslint@59ed060) chore: upgrade [@&#8203;eslint/js](https://github.com/eslint/js)[@&#8203;8](https://github.com/8).38.0 ([#&#8203;17069](eslint/eslint#17069)) (Milos Djermanovic)
-   [`88c0898`](eslint/eslint@88c0898) chore: package.json update for [@&#8203;eslint/js](https://github.com/eslint/js) release (ESLint Jenkins)
-   [`cf682d2`](eslint/eslint@cf682d2) refactor: simplify new-parens rule schema ([#&#8203;17060](eslint/eslint#17060)) (MHO)
-   [`0dde022`](eslint/eslint@0dde022) ci: bump actions/add-to-project from 0.4.1 to 0.5.0 ([#&#8203;17055](eslint/eslint#17055)) (dependabot\[bot])

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS4zNS4xIiwidXBkYXRlZEluVmVyIjoiMzUuNTUuMSJ9-->

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1852
Reviewed-by: Epsilon_02 <epsilon_02@noreply.codeberg.org>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
@eslint-github-bot eslint-github-bot bot locked and limited conversation to collaborators Oct 9, 2023
@eslint-github-bot eslint-github-bot bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Oct 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
archived due to age This issue has been archived; please open a new issue for any further discussion chore This change is not user-facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants