WSL pins opened directories #1529
Comments
@therealkenc - Apologies for the delay here. I am sure there are other neglected issues, and I personally would like to get through them too. Anyway, I think this stems from the NTFS limitation that prevents renaming a directory while a handle is open to anything below it. @SvenGroot to confirm. |
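The limitation described in this comment can be probed with a small script. What follows is a minimal sketch, not something posted in the thread: it opens a descriptor on a nested directory and then tries to rename the parent. The function name `rename_with_open_child` and the directory names are made up for illustration. Going by the reports in this issue, the rename would be expected to succeed on native Linux and may be refused on WSL1 while the descriptor is open.

```python
import errno
import os
import tempfile


def rename_with_open_child(base):
    """Open a descriptor on base/parent/child, then try to rename base/parent.

    Returns None if the rename succeeded, otherwise the symbolic errno name.
    All names here are hypothetical and exist only for this illustration.
    """
    parent = os.path.join(base, "parent")
    child = os.path.join(parent, "child")
    os.makedirs(child)

    # Hold a directory descriptor below the directory we are about to rename.
    fd = os.open(child, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.rename(parent, os.path.join(base, "renamed"))
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
```

Calling `rename_with_open_child(tempfile.mkdtemp())` inside a directory on the affected filesystem and comparing the result between WSL1 and a native Linux machine would show whether the open descriptor is what blocks the rename.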
Other than 'try to not open lots of files' is there any advice on avoiding this issue, given that a fix seems to be escaping us? Is there a way to monitor and close file handles? Is there a more suitable filesystem for hosting WSL filesystems that would avoid hitting this? Is there a node setting we can flag to ignore these errors, given that they're apparently false positives? |
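On the monitoring question, one partial option on the Linux side is to read `/proc/<pid>/fd`. The sketch below is an assumption-laden illustration rather than advice from the maintainers: the helper name `open_fds_under` is invented here, and it only sees descriptors held by processes visible in `/proc`, so it would not show handles held by Windows programs such as Explorer or an editor.

```python
import os


def open_fds_under(directory, pid="self"):
    """Return (fd, target) pairs for descriptors of `pid` that resolve inside `directory`.

    Hypothetical helper for illustration; relies on the Linux /proc/<pid>/fd layout.
    """
    root = os.path.realpath(directory)
    fd_dir = f"/proc/{pid}/fd"
    hits = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        if target == root or target.startswith(root + os.sep):
            hits.append((int(name), target))
    return hits
```

Passing the PID of a long-running server instead of `"self"` would list what that process still holds under the directory that refuses to move; inspecting another user's process typically needs elevated permissions.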
Assigned to @SvenGroot and @tara-raj to take a look, though with Xmas fast approaching, may be a couple of weeks until this gets looked into. Interested party here: https://twitter.com/rainabba/status/1075192299791908864 |
There have been chirping crickets since 2016, it is fairly safe to assume, mostly because the problem is understood and there isn't much for Sven or Ben or Brian to add. The problem is caused by underlying limitations in NTFS and the NT APIs you have to work with. I posted an outline of a possible (albeit nontrivial) solution elsewhere, of which they've almost certainly been aware since 2016 as well. There isn't anything new to "look into" here. |
Tell you what you can do, if you are looking for something constructive. Right now it is impossible to differentiate between |
@Coder-256 it's fixed in WSL2. |
For a good swath of folks' major pain points, yes. The underlying issue remains, however. With minor string edits to OP:
and the strace for pedantry:
Which would seem academic, except that it affects everyone using WSL for WSL's raison d'être, which is interop with Windows. |
why is |
So it's another facet of the same root problem as #873, and the reason WSL2 is being added.
It's not so much fixed in WSL2 as that WSL2 is a fundamentally different tool with its own set of significant limitations (deriving from virtualization). Specifically, Hyper-V is not compatible with at least a couple of tools I use, and it also adds significant overhead by running Windows as a guest. |
A correct fix for WSL1 would require massive changes in the NT kernel to
accommodate WSL2. Fixing performance as well would essentially require
reimplementing much of the Linux I/O stack. The BSDs have code that could
be used, but it would still be a huge task.
|
Facing Similar Issue
Permissions
|
I have a hypothesis that may or may not help illuminate the issue. I have a machine that uses WSL v1 (no issues on v2). I can replicate EACCES pretty easily by trying to install a package while VSCode is running. I believe there's some sort of limit on access or on open handles. Basically, I know VSCode is monitoring the folder I'm working in. If I install a package while VSCode is open, I'll get EACCES errors. If I close VSCode and install the packages from the Ubuntu bash shell, no errors. So, to create this issue, I believe you just have to:
I think what's causing the EACCES issue is that VSCode isn't handling the file system changes faster than I'm sure somebody smarter than I can create some sort of test script that creates and deletes a bunch of files with a watcher in the background. You might find the collision. I just don't know where in WSL this issue might occur. Perhaps Edit: Worth noting, I don't use |
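A test script along the lines suggested in this comment could look like the sketch below. It is only a rough stand-in under stated assumptions: the "watcher" is a thread in the same Linux process that keeps opening directories, which is not how VSCode watches files, and the names `watcher`, `churn`, and `run` as well as the `pkg-N` layout are invented for illustration. It counts how many renames are refused with `EACCES` while the watcher is running.

```python
import errno
import os
import shutil
import tempfile
import threading


def watcher(root, stop):
    """Crude stand-in for a file watcher: keep opening directories under root."""
    while not stop.is_set():
        try:
            for dirpath, _dirs, _files in os.walk(root):
                fd = os.open(dirpath, os.O_RDONLY | os.O_DIRECTORY)
                os.close(fd)
        except OSError:
            pass  # directories vanish mid-walk; expected in this test


def churn(root, cycles):
    """Create, rename, and delete directories; count renames refused with EACCES."""
    refused = 0
    for i in range(cycles):
        src = os.path.join(root, f"pkg-{i}.tmp")
        dst = os.path.join(root, f"pkg-{i}")
        os.makedirs(os.path.join(src, "lib"))
        with open(os.path.join(src, "lib", "index.js"), "w") as f:
            f.write("// placeholder\n")
        try:
            os.rename(src, dst)
        except OSError as exc:
            if exc.errno != errno.EACCES:
                raise
            refused += 1
            dst = src  # rename was refused, so clean up the source instead
        shutil.rmtree(dst, ignore_errors=True)
    return refused


def run(cycles=200):
    """Run the churn loop with the watcher thread active; return the EACCES count."""
    with tempfile.TemporaryDirectory() as root:
        stop = threading.Event()
        thread = threading.Thread(target=watcher, args=(root, stop), daemon=True)
        thread.start()
        try:
            return churn(root, cycles)
        finally:
            stop.set()
            thread.join()
```

`tempfile.TemporaryDirectory()` uses the default temp location, so to exercise a particular filesystem the root would need to be created there instead (for example via its `dir=` argument). A nonzero result from `run()` would point at collisions between the watcher's open descriptors and `rename`.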
Retry rename on EACCES

This PR retries `rename` upon getting `EACCES`. I've included data about how many retries are likely necessary.

On my system (WSL1 Ubuntu 20.04, omitting hardware details), the `EACCES` issue makes it impossible to use esy to install any of the Dream examples or use Dream's quick start. As the data below shows, every installation is expected to fail with `EACCES`, if it is not worked around. The underlying `EACCES` issue seems to be a long-standing problem on WSL1 (microsoft/WSL#1529, microsoft/WSL#3395), and I think we do have to work around it in esy.

I'm not sure what is causing the `EACCES` exactly. I think there are two main classes of possibilities:

- Self-interaction between esy's opened file descriptors and `rename`. I think the self-interaction is due to WSL rather than Lwt or another library. Since I compiled esy on WSL, it is using Lwt's Unix (rather than Windows) C code. Since the Unix code seems to work fine on Linux and Mac, this suggests a WSL issue.
- Interaction between esy and file indexers or other processes running on the system. I'm not sure if that's a WSL issue or not, but I've never had to be aware of such processes when doing renames in Cygwin or elsewhere.

I built esy with this patch under WSL and ran clean `esy install`s in Dream's [full-stack ReScript](https://github.com/aantron/dream/tree/03e4d37cb5f5f638707479cd46105e2ee2b1df0e/example/w-fullstack-rescript#readme) example, using this script:

```sh
#!/bin/bash

export PATH="/home/antron/code/attic/esy/_build/install/default/bin:$PATH"
export ESY__PREFIX="/home/antron/code/dream/dream/example/w-fullstack-rescript/esy-prefix"
export OCAMLRUNPARAM=b

RUN=1

while true
do
  rm -rf esy-prefix _esy esy.lock lib node_modules/ package-lock.json
  echo
  echo "RUN $RUN"
  which esy
  esy install # --verbosity debug
  if [ $? != 0 ]
  then
    exit
  fi
  RUN=$((RUN+1))
done
```

The example was checked out into NTFS. The system was freshly restarted, and VSCode (or anything similar) was not running.
I used a version of this patch with a print showing the number of attempts before `rename` succeeds, and got the following results from 5 runs:

```
1 attempt:  802
2 attempts:  52
3 attempts:  12
4 attempts:   3
5 attempts:   1
total:      870
```

Based on this, I naively estimated that if a `rename` needs more than 1 attempt, the number of attempts needed decays by a factor of 4 at each step. I set the limit on the number of attempts naively to 8, thus expecting one failure in about 500 `esy install` attempts of the Dream ReScript example, under all these simplified assumptions.

The delay between attempts is (over) one second, so this means that upon legitimate `EACCES`, users will have to wait eight seconds to get an error message. I think there are two ways to address this:

- Fall back to recursive copy rather than retrying `rename` when `rename` fails. Do we have a recursive copy available in esy or its dependencies? Is it fine to leave the source directory intact?
- Detect WSL and retry only on WSL.

Waiting for 8 seconds is still a much better user experience than failure to install at all, so the PR will still be an improvement, without, in this case, harming Linux or Mac users. We could also add a message, shown in case we finally fail with `EACCES` on WSL, giving users a hint about potential VSCode or other watchers, and what else they can try to solve the problem.

Closes #1363.
Probably fixes #1097, some of the reports after the first one.
Probably fixes #1083.
Probably fixes #593, but I haven't looked into non-WSL Windows yet.
Probably fixes aantron/dream#63.

cc @bryphe, @rizo, @jordwalke, @iMplode-nZ, @a-c-sreedhar-reddy, @srirajshukla, @andreypopp
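For readers who want the shape of the workaround without reading the esy patch, here is a minimal sketch in Python. It is not the PR's code (esy is not written in Python) and the names `looks_like_wsl` and `rename_with_retry` are invented for illustration. The default of 8 attempts and a one-second delay mirror the numbers quoted above; the WSL check is a common heuristic that assumes the kernel release string contains "microsoft", which is an assumption rather than a documented guarantee.

```python
import errno
import os
import time


def looks_like_wsl(osrelease):
    """Heuristic (assumption): WSL kernel release strings contain 'microsoft'."""
    return "microsoft" in osrelease.lower()


def rename_with_retry(src, dst, attempts=8, delay=1.0,
                      rename=os.rename, sleep=time.sleep):
    """Call rename up to `attempts` times, sleeping `delay` seconds after each EACCES.

    Returns the number of attempts used. Any error other than EACCES, or an
    EACCES on the final attempt, is re-raised to the caller unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            rename(src, dst)
            return attempt
        except OSError as exc:
            if exc.errno != errno.EACCES or attempt == attempts:
                raise
            sleep(delay)
```

A caller could read `/proc/sys/kernel/osrelease`, pass its contents to `looks_like_wsl`, and use `attempts=1` elsewhere so that a legitimate `EACCES` on Linux or Mac surfaces immediately. The `rename` and `sleep` parameters exist only so the loop can be exercised without touching the filesystem or actually waiting.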
This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request. Thank you! |
Pulling this out of #1492. It is possibly a dup of #1420; but I'm thinking maybe not because that issue has no lingering fds. This hits in spades with nfs-ganesha, because the protocol is stateless and file descriptors are cached for a while before being released. Windows Explorer traipses all over thousands of files, causing lots of directory fds to remain open. This in turn causes unexpected `mv` and `rm` operations on seemingly random directories on the WSL side to fail.

This will probably also show itself when people start doing web development scenarios, because http servers have a tendency to do the same sort of fd caching.
strace on WSL:
strace on Ubuntu: