fs: use fast api calls for existsSync #49893

Closed
wants to merge 8 commits into from

littledivy
Member

@littledivy littledivy commented Sep 27, 2023

Currently, it takes the fast route when the path string is represented as a OneByteString in V8.

                                                          confidence improvement accuracy (*)   (**)  (***)
fs/bench-existsSync.js n=1000000 type='existing'                 ***      2.87 %       ±0.68% ±0.91% ±1.19%
fs/bench-existsSync.js n=1000000 type='non-existing'             ***     43.04 %       ±1.17% ±1.56% ±2.03%
fs/bench-existsSync.js n=1000000 type='non-flat-existing'        ***      2.63 %       ±0.42% ±0.56% ±0.73%
n=1000000 type='non-existing' 836103.841448331                    NA       NaN %           NA     NA     NA

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 4 comparisons, you can thus expect the following amount of false-positive results:
  0.20 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.04 false positives, when considering a   1% risk acceptance (**, ***),
  0.00 false positives, when considering a 0.1% risk acceptance (***)
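
As a rough illustration of the one-byte condition mentioned in the description: V8 can only use its one-byte string representation when every UTF-16 code unit of the string fits in Latin-1 (at most 0xFF). The sketch below checks that necessary condition from JavaScript. `couldBeOneByte` is a hypothetical helper, not a Node.js or V8 API, and it cannot observe which representation V8 actually picked; a string that passes may still be stored as two-byte internally.

```javascript
'use strict';

// Hypothetical helper (not part of Node.js): returns true when every
// UTF-16 code unit of `str` fits in one byte (Latin-1). This is only a
// necessary condition for V8 to store the string as a one-byte string.
function couldBeOneByte(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) return false;
  }
  return true;
}

console.log(couldBeOneByte('hello.txt'));         // true: plain ASCII
console.log(couldBeOneByte('caf\u00e9.txt'));     // true: U+00E9 fits in Latin-1
console.log(couldBeOneByte('\u4f60\u597d.txt'));  // false: CJK needs two bytes
```

Under this assumption, a path containing any character above U+00FF would be expected to go through the regular (slow) callback.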

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Sep 27, 2023
@panva panva added performance Issues and PRs related to the performance of Node.js. request-ci Add this label to start a Jenkins CI on a PR. labels Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@anonrig anonrig added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 27, 2023
@anonrig
Member

anonrig commented Sep 27, 2023

Can you resolve the conflict @littledivy?

Member

@debadree25 debadree25 left a comment

Guess we have to add a test like in test/parallel/test-url-canParse-whatwg.js ?

@anonrig anonrig added the needs-benchmark-ci PR that need a benchmark CI run. label Sep 27, 2023
@anonrig anonrig removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 27, 2023
@anonrig anonrig added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Sep 27, 2023
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 27, 2023
Comment on lines +1120 to +1135
uv_fs_t req;
auto make = OnScopeLeave([&req]() { uv_fs_req_cleanup(&req); });
FS_SYNC_TRACE_BEGIN(access);
int err = uv_fs_access(nullptr, &req, path.out(), 0, nullptr);
FS_SYNC_TRACE_END(access);

#ifdef _WIN32
// In case of an invalid symlink, `uv_fs_access` on win32
// will **not** return an error and is therefore not enough.
// Double check with `uv_fs_stat()`.
if (err == 0) {
  FS_SYNC_TRACE_BEGIN(stat);
  err = uv_fs_stat(nullptr, &req, path.out(), nullptr);
  FS_SYNC_TRACE_END(stat);
}
#endif // _WIN32
Member

@joyeecheung joyeecheung Sep 27, 2023

The fs operation part can be wrapped in a helper and shared with the slow callback to avoid getting out of sync.
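
The structure suggested here can be modelled in JavaScript, purely as an illustration; the actual change would be in C++ in `src/node_file.cc`. In this sketch `existsImpl`, `slowExists`, and `fastExists` are hypothetical names, and the public `fs.accessSync`/`fs.statSync` calls stand in for `uv_fs_access`/`uv_fs_stat`. Both entry points delegate to one helper, so the Windows double check lives in a single place.

```javascript
'use strict';
const fs = require('node:fs');

// Hypothetical shared helper mirroring the quoted check order: access
// first, then (on Windows only) a stat to catch invalid symlinks.
function existsImpl(path) {
  try {
    fs.accessSync(path, fs.constants.F_OK);
    if (process.platform === 'win32') fs.statSync(path);
    return true;
  } catch {
    return false;
  }
}

// Stand-in for the regular (slow) callback: checks its argument first.
function slowExists(path) {
  if (typeof path !== 'string' || path.length === 0) return false;
  return existsImpl(path);
}

// Stand-in for the fast callback: assumes a string and skips the checks.
function fastExists(path) {
  return existsImpl(path);
}

console.log(slowExists(process.execPath), fastExists(process.execPath));
```

Because neither wrapper contains file system logic of its own, a later change to the check (for example, another platform quirk) would only need to touch the helper.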

// This test is to ensure that the v8 fast api works.
const oneBytePath = 'hello.txt';
for (let i = 0; i < 1e5; i++) {
  assert(!fs.existsSync(oneBytePath));
Member

This should be moved to test/pummel instead. More generally, we should avoid running tight loops in tests so that they don't time out in the CI on slower machines. Maybe it's already enough that the fast path is exercised in the benchmark.

Member

Can't we use the V8 natives API, like they do to unit test fast calls? IIRC you need to %PrepareForOptimization(fn), call the function, %OptimizeOnNextCall(fn), and call it again. That last call should be optimized.
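
A minimal sketch of that sequence, under a few assumptions: the intrinsics are spelled `%PrepareFunctionForOptimization` and `%OptimizeFunctionOnNextCall` in V8 (the names in the comment above appear to be shorthand), and natives syntax is switched on at runtime with `v8.setFlagsFromString` so the file also loads without the flag. `intrinsic`, `testFastPath`, and `missingPath` are hypothetical names. The sketch only requests optimization; it does not prove that the fast callback was taken.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const v8 = require('node:v8');

// Turn on V8 intrinsics at runtime. A Node.js core test would more
// likely declare `// Flags: --allow-natives-syntax` at the top instead.
v8.setFlagsFromString('--allow-natives-syntax');

// Wrap a V8 intrinsic in a function. Falls back to a no-op when the
// intrinsic cannot be compiled (e.g. natives syntax is unavailable).
function intrinsic(name) {
  try {
    return new Function('fn', `%${name}(fn);`);
  } catch {
    return () => {};
  }
}

const prepareForOptimization = intrinsic('PrepareFunctionForOptimization');
const optimizeOnNextCall = intrinsic('OptimizeFunctionOnNextCall');

// A path that is very unlikely to exist.
const missingPath = path.join(
  os.tmpdir(), `no-such-file-${process.pid}-${Date.now()}.txt`);

function testFastPath() {
  return fs.existsSync(missingPath);
}

prepareForOptimization(testFastPath);
const beforeOptimization = testFastPath();  // collects type feedback
optimizeOnNextCall(testFastPath);
const afterOptimization = testFastPath();   // should run optimized code

console.log(beforeOptimization, afterOptimization);
```

This replaces the 1e5-iteration loop with two calls, which is the main appeal for slow CI machines.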

Member

Also, while I'm not good enough with C++ to suggest how to implement it, I think we can put something into place (maybe only in debug builds?). For example, the fast version, when called, increases some counter that we can get from JavaScript for an assertion.

Member

I think we can keep an array of booleans for all fast APIs to see if they are called, and expose them to JS land. Toggling a boolean shouldn't be very expensive.
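
The bookkeeping idea can be sketched in plain JavaScript. Everything here is hypothetical: `FAST_API_IDS`, `markFastCall`, and `wasFastCalled` are illustrative names, and in Node.js the flags would be set from the C++ fast callbacks and exposed through an internal binding rather than written in JS.

```javascript
'use strict';

// Hypothetical registry: one slot per fast API. In Node.js the slot would
// be flipped inside the C++ fast callback, not from JavaScript.
const FAST_API_IDS = { existsSync: 0, canParse: 1 };
const fastApiCalled = new Uint8Array(Object.keys(FAST_API_IDS).length);

// Record that the fast callback for `name` ran (a single byte store).
function markFastCall(name) {
  fastApiCalled[FAST_API_IDS[name]] = 1;
}

// Query used by a test to assert that the fast path was exercised.
function wasFastCalled(name) {
  return fastApiCalled[FAST_API_IDS[name]] === 1;
}

// Simulate the existsSync fast callback being hit once.
markFastCall('existsSync');
console.log(wasFastCalled('existsSync'));  // true
console.log(wasFastCalled('canParse'));    // false
```

A test could then assert on `wasFastCalled('existsSync')` after forcing optimization, instead of inferring the fast path from timing.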

Member

For this PR, though, doing %PrepareForOptimization() and %OptimizeOnNextCall() in the tests may be fine. But we also need to check in JS land whether the optimizing compiler is enabled at all in the test, to avoid failing on builds that turn optimizations off, which would be tricky.

huozhi pushed a commit to vercel/next.js that referenced this pull request Oct 3, 2023
Using `await fs.access` has a couple of downsides. It creates unnecessary
async contexts where async scope can be removed. Also, it creates the
possibility of race conditions such as `Time-of-Check to Time-of-Use`.

It would be nice if someone can benchmark this. I'm rooting for a
performance improvement.

Some updates from Node.js land:

- There is an open pull request to add V8 Fast API to `existsSync`
method - nodejs/node#49893
- Non-existing `existsSync` executions became 30% faster -
nodejs/node#49593

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
@aduh95
Contributor

aduh95 commented May 11, 2024

This needs a rebase.

@RedYetiDev
Member

👋 Hey, this has approvals and a decent number of reviews, but currently has conflicts with the main branch. @aduh95 pointed out a few months ago that this needs a rebase, but no action has been taken since then.

I've marked this PR as stalled for these reasons, LMK if this wasn't the right thing to do, and feel free to undo it :-).

@RedYetiDev RedYetiDev added the stalled Issues and PRs that are stalled. label Aug 9, 2024
Contributor

github-actions bot commented Aug 9, 2024

This issue/PR was marked as stalled, it will be automatically closed in 30 days. If it should remain open, please leave a comment explaining why it should remain open.

@RedYetiDev RedYetiDev closed this Oct 15, 2024
@littledivy
Member Author

Sorry, I haven't had a chance to rebase. Please feel free to take over and get this landed.

@RedYetiDev
Member

You can always re-open. The stalled bot isn't working, so I was just doing some maintenance.

Labels
c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. fs Issues and PRs related to the fs subsystem / file system. needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js. stalled Issues and PRs that are stalled.