Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP fs: implement fs.rmdir recursive #24252

Closed
wants to merge 1 commit into from

Conversation

shisama
Copy link
Contributor

@shisama shisama commented Nov 8, 2018

Added recursive option into fs.rmdir and fs.rmdirSync to
delete a forder with sub folders or files.

Fix #22686

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. labels Nov 8, 2018
Added recursive option into fs.rmdir and fs.rmdirSync to
delete a forder with sub folders or files.
@richardlau
Copy link
Member

cc @boneskull

@Trott Trott added semver-minor PRs that contain new features and should be released in the next minor version. notable-change PRs with changes that should be highlighted in changelogs. labels Nov 9, 2018
@sam-github
Copy link
Contributor

Recursive rmdir can fail in ways very different from the previous function, please document what happens when it fails.

I think this function is not well named, and is less useful, and more fragile, than https://github.com/isaacs/rimraf.

  1. It recursively removes directories AND files in those directories. Unless the top-level argument is a file... in which case I guess it fails? This is an annoying fallout of modifying rmdir rather than having a new function. rimraf doesn't have this problem.
  2. It's more fragile than rimraf, which has embedded retry logic for a number of ephemeral failure conditions. This function will fail in difficult to reproduce ways on large file sets, particularly on Windows.

If node core integrates features that are well served by a community module (a contentious issue in itself), it should aim to be clearly better than the community module, or at least equivalently useful.

@boneskull
Copy link
Contributor

There's a considerable duplication of effort here... see https://github.com/boneskull/node/tree/rimraf

@boneskull
Copy link
Contributor

I've also found via benchmarks that a C/C++ implementation that removes files/dirs in sequence is not significantly faster (and in some cases worse) than rimraf. The implementation must work in parallel in order for it to be worth adding more C/C++ to the codebase.

@shisama
Copy link
Contributor Author

shisama commented Nov 10, 2018

@sam-github

If node core integrates features that are well served by a community module (a contentious issue in itself), it should aim to be clearly better than the community module, or at least equivalently useful.

I agree this. Thank you to tell me about rimraf. I will check implementation of rimral and try to make my implementation better.

@shisama
Copy link
Contributor Author

shisama commented Nov 10, 2018

@boneskull Oh...I didn't know that you are implementating this feature.

The implementation must work in parallel in order for it to be worth adding more C/C++ to the codebase.

I see. I will try to implement with thread. Are you implementing it that works in parallel?

@boneskull
Copy link
Contributor

@shisama My implementation follows rimraf's logic (sans some windows-specific stuff that will need to be addressed).

I've benchmarked my implementation (sync and async) against rimraf, and it's really no better. That said, rimraf does file operations in parallel, and my implementation (as yours) works sequentially.

Unless there's a significant performance advantage to a C/C++ implementation, the general feeling I get is if we can do it in JavaScript... do it in JavaScript. This doesn't mean recursive rmdir doesn't belong in Node.js core, but rather that the implementation might belong in lib/ instead of src/.

My next step on my implementation would be to execute some number of filesystem operations in parallel (I don't think anything like this already exists in Node.js' codebase). Compounding the issue, the sync implementation must do its parallel operations in a blocking manner. That means faking it--some wizardry like what execSync does will need to be put in place--it needs its own uv_loop_t to work with.

Join us in nodejs/tooling!

@shisama
Copy link
Contributor Author

shisama commented Nov 12, 2018

@boneskull Thank you for more information. Would you like me to close this PR or leave it open untill you create a new PR?
Thanks for information about nodejs/tooling I checked it and joined node-tooling slack.

@shisama shisama changed the title fs: implement fs.rmdir recursive WIP fs: implement fs.rmdir recursive Nov 12, 2018
@Trott
Copy link
Member

Trott commented Nov 18, 2018

@boneskull @shisama Can the two of you come to a conclusion about whether this should be closed in favor of another implementation or if this should remain open and is a viable possibility?

@Trott Trott added the wip Issues and PRs that are still a work in progress. label Nov 20, 2018
@guybedford
Copy link
Contributor

@shisama I believe the general thoughts on this from the @nodejs/tooling group were that starting by implementing in JS would allow avoiding some of these async hurdles, while also providing a base-level performance comparison to then move into the C++ work again in future. If you'd be interested in working on this further, perhaps we can move the discussion and implementation feedback process there until there is something ready to PR again?

@dead-claudia
Copy link

Just a stray thought from someone who's like 0% involved with this: if you're going to do it natively, I think it'd be a good idea to investigate the various platform-specific functionality for things that could potentially optimize at least parts of it. In particular:

  • Windows has a user mode shell API that does this internally and it may be faster.
  • Linux itself has fts_*, which explicitly optimizes directory traversal and offers both pre- and postorder visitation of directory nodes.

Both of these would require you to explicitly schedule thread pool work, but if the OS optimizes for them, they would speed up deletion a lot.

@shisama
Copy link
Contributor Author

shisama commented Dec 15, 2018

@guybedford Thank you for your comment. I agree with you. This implementation has performance issue because of running sequentially. rimraf runs faster than this implementation because it removes files and directories in parallel. I think it should be run in parallel but it is a little difficult for me. So, I will close this PR once.

@shisama shisama closed this Dec 15, 2018
@shisama
Copy link
Contributor Author

shisama commented Dec 15, 2018

Thank you all for your comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. notable-change PRs with changes that should be highlighted in changelogs. semver-minor PRs that contain new features and should be released in the next minor version. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

An optional parameter that lets us delete folders that aren't empty directly
8 participants