fs: use fast api calls for existsSync #49893

Open
wants to merge 8 commits into
base: main

littledivy
Member

@littledivy littledivy commented Sep 27, 2023

Currently, it takes the fast path when the path string is represented as a OneByteString in V8.

                                                          confidence improvement accuracy (*)   (**)  (***)
fs/bench-existsSync.js n=1000000 type='existing'                 ***      2.87 %       ±0.68% ±0.91% ±1.19%
fs/bench-existsSync.js n=1000000 type='non-existing'             ***     43.04 %       ±1.17% ±1.56% ±2.03%
fs/bench-existsSync.js n=1000000 type='non-flat-existing'        ***      2.63 %       ±0.42% ±0.56% ±0.73%
n=1000000 type='non-existing' 836103.841448331                    NA       NaN %           NA     NA     NA

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 4 comparisons, you can thus expect the following amount of false-positive results:
  0.20 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.04 false positives, when considering a   1% risk acceptance (**, ***),
  0.00 false positives, when considering a 0.1% risk acceptance (***)
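
To make the one-byte condition from the description concrete, here is a rough sketch of which paths could qualify. The helper `fitsInOneByte` is hypothetical and only checks whether every UTF-16 code unit fits in Latin-1. V8's internal string representation is not observable from JavaScript, and a string that passes this check may still be stored as two-byte internally, so treat it as an approximation.

```javascript
'use strict';
const fs = require('node:fs');

// Hypothetical helper (not part of Node.js or V8): returns true when every
// UTF-16 code unit of `str` is <= 0xFF, i.e. the string *could* be stored
// as a one-byte (Latin-1) string by V8.
function fitsInOneByte(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) return false;
  }
  return true;
}

// ASCII and Latin-1 paths fit; a path with CJK characters does not.
const candidates = ['hello.txt', 'caf\u00e9.txt', '\u65e5\u672c\u8a9e.txt'];

const report = candidates.map((p) => ({
  path: p,
  fitsInOneByte: fitsInOneByte(p),
  exists: fs.existsSync(p), // result depends on the current directory
}));

console.log(report);
```

Paths for which the helper returns false would presumably fall back to the regular (slow) callback.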

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Sep 27, 2023
@panva panva added performance Issues and PRs related to the performance of Node.js. request-ci Add this label to start a Jenkins CI on a PR. labels Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@nodejs-github-bot
Collaborator

@anonrig anonrig added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 27, 2023
@anonrig
Member

anonrig commented Sep 27, 2023

Can you resolve the conflict @littledivy?

Member

@debadree25 debadree25 left a comment

Guess we have to add a test like in test/parallel/test-url-canParse-whatwg.js ?

@anonrig anonrig added the needs-benchmark-ci PR that need a benchmark CI run. label Sep 27, 2023
@anonrig anonrig removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 27, 2023
@anonrig anonrig added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Sep 27, 2023
src/node_file.cc (outdated, resolved)
src/node_file.cc (outdated, resolved)
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@nodejs-github-bot
Collaborator

@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@nodejs-github-bot
Collaborator

Comment on lines +1120 to +1135
uv_fs_t req;
auto make = OnScopeLeave([&req]() { uv_fs_req_cleanup(&req); });
FS_SYNC_TRACE_BEGIN(access);
int err = uv_fs_access(nullptr, &req, path.out(), 0, nullptr);
FS_SYNC_TRACE_END(access);

#ifdef _WIN32
// In case of an invalid symlink, `uv_fs_access` on win32
// will **not** return an error and is therefore not enough.
// Double check with `uv_fs_stat()`.
if (err == 0) {
  FS_SYNC_TRACE_BEGIN(stat);
  err = uv_fs_stat(nullptr, &req, path.out(), nullptr);
  FS_SYNC_TRACE_END(stat);
}
#endif // _WIN32
Member

The fs operation part can be wrapped in a helper and shared with the slow callback to avoid getting out of sync.

// This test is to ensure that the v8 fast api works.
const oneBytePath = 'hello.txt';
for (let i = 0; i < 1e5; i++) {
  assert(!fs.existsSync(oneBytePath));
Member

This should be moved to test/pummel instead. In general, though, we need to avoid running tight loops in the tests, so that we don't introduce timeouts in the CI on the slower machines. Maybe it's already enough that the fast path is exercised in the benchmark.

Member

Can't we use the V8 natives API, like they do to unit test fast calls? IIRC you need to %PrepareForOptimization(fn), call the function, %OptimizeFunctionOnNextCall(fn), and call it again. That last call should be optimized.

Member

Also, while I'm not good enough with C++ to suggest how to implement it, I think we can put something into place (maybe only in debug builds?). For example, the fast version, when called, increases some counter that we can get from JavaScript for an assertion.

Member

I think we can keep an array of booleans for all fast APIs to see whether they are called, and expose it to JS land. Toggling a boolean shouldn't be very expensive.

Member

For this PR, though, doing %PrepareForOptimization() and %OptimizeFunctionOnNextCall() in the tests may be fine. But we also need to check in JS land whether the optimizing compiler is enabled at all in the test, to avoid failing on builds that turn optimizations off, which would be tricky.

huozhi pushed a commit to vercel/next.js that referenced this pull request Oct 3, 2023
Using `await fs.access` has a couple of downsides. It creates unnecessary
async contexts in places where the async scope can be removed. It also
opens the door to race conditions such as `Time-of-Check to Time-of-Use`.

It would be nice if someone can benchmark this. I'm rooting for a
performance improvement.

Some updates from Node.js land:

- There is an open pull request to add a V8 Fast API call for the
`existsSync` method - nodejs/node#49893
- `existsSync` calls on non-existing paths became 30% faster -
nodejs/node#49593

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
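
For context on the change that commit message describes, here is a rough side-by-side of the two styles of existence check. The helper names `existsViaAccess` and `existsViaSync` are hypothetical and are not taken from the next.js change; this is only a sketch of the general pattern, not the referenced commit's code.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Promise-based check: needs an async function and a try/catch, because
// `fs.promises.access` rejects when the path is not accessible.
async function existsViaAccess(target) {
  try {
    await fs.promises.access(target);
    return true;
  } catch {
    return false;
  }
}

// Synchronous check: returns a boolean directly and never throws.
function existsViaSync(target) {
  return fs.existsSync(target);
}

const presentPath = os.tmpdir();
const missingPath = path.join(
  os.tmpdir(), `missing-${process.pid}-${Date.now()}`);

console.log(existsViaSync(presentPath), existsViaSync(missingPath));
existsViaAccess(presentPath).then((found) => console.log(found));
```

Either form only reports the state of the file system at the moment of the check, so code that acts on the result afterwards still has to handle the path appearing or disappearing in between.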
@aduh95
Contributor

aduh95 commented May 11, 2024

This needs a rebase.

Labels
c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. fs Issues and PRs related to the fs subsystem / file system. needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.