On Windows, System.IO.Directory.Delete() intermittently fails quietly, is unexpectedly asynchronous #27958
Comments
/cc @JeremyKuhne
We're sort of stuck here due to the nature of file deletion on Windows as described. I know that Windows is looking at changing the deletion behavior to immediately remove files from the namespace on deletion. It is a tricky thing to do, as there is almost certainly code that actually depends on the current Windows behavior. I believe we'd be more likely to introduce performance or reliability regressions by attempting to rename than we would be to fix problems. Additionally, if the Windows change makes it in, this will be moot. (Possibly in the Spring update?)
I've validated that the new deletion behavior is in Windows 10 1903. Filenames are immediately freed when delete is called. Closing, as that gets rid of the underlying problem here.
I realize this issue has been closed, but I am having the same issue on 1903.
    var source = GetSource();
    var target = GetTarget();
    if (Directory.Exists(target))
    {
        Directory.Delete(target, true);
    }
    Directory.Move(source, target);
The code above fails when moving the directory if the process is run in release mode (or debug mode without a debugger attached) and throws the following exception:
$TARGET$ below is just for demonstration purposes.
If I make the thread sleep for a few seconds after calling
Is there a suggested workaround from the corefx team?
.NET Core SDK (reflecting any global.json):
Version: 3.0.100
Commit: 04339c3a26
Runtime Environment:
OS Name: Windows
OS Version: 10.0.18362
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\3.0.100\
Host (useful for support):
Version: 3.0.0
Commit: 7d57652f33
.NET Core SDKs installed:
2.2.402 [C:\Program Files\dotnet\sdk]
3.0.100 [C:\Program Files\dotnet\sdk]
.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.All 2.2.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 2.2.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.2.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Agreed, @CoskunSunali; I can confirm that both (PowerShell-based) tests in the OP still eventually fail. Pasting the functions in a PowerShell window and then running
@JeremyKuhne: Any thoughts? Does the WinAPI now officially guarantee synchronous behavior and did the implementation fall short, or is there something else going on?
Fundamentally, it's a bad idea to reuse paths right after you delete them if you can help it. If you can't, you really need to add some sort of retry logic. One other possible thing to try is moving the directory to a temporary name and deleting that. While Windows has been making changes, it is only in the most recent versions and they haven't
As I mentioned above, there really isn't anything .NET can do to mitigate this, as we don't have the context to know we should retry. Anything "smart" we would try to do would likely introduce more issues than we would resolve. :/
cc: @carlossanlop
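The two workarounds suggested in this comment (retrying, and moving the directory to a temporary name before deleting it) could be combined roughly as follows. This is only a sketch under assumptions: `DirectoryReplace`, `Retry`, `DeleteViaRename` and `ReplaceDirectory` are hypothetical names that do not exist in .NET, and the attempt count and delay are arbitrary values, not recommendations.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper class; nothing here is part of System.IO.
static class DirectoryReplace
{
    // Runs the action, retrying on IOException / UnauthorizedAccessException.
    // Assumption: such failures are transient. The filter cannot tell a
    // pending delete apart from a permanent error, so it simply gives up
    // (and rethrows) after maxAttempts.
    public static void Retry(Action action, int maxAttempts = 10, int delayMs = 200)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return;
            }
            catch (Exception ex) when (
                (ex is IOException || ex is UnauthorizedAccessException) &&
                attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
            }
        }
    }

    // Frees the name 'path' by first renaming the directory to a unique
    // sibling name, then deleting the renamed directory.
    public static void DeleteViaRename(string path)
    {
        if (!Directory.Exists(path)) return;

        string full = Path.GetFullPath(path)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        string parent = Path.GetDirectoryName(full)
            ?? throw new ArgumentException("Cannot delete a root directory.", nameof(path));
        string doomed = Path.Combine(parent, Path.GetRandomFileName());

        Retry(() => Directory.Move(full, doomed));
        Retry(() =>
        {
            // A previous attempt may have removed it before failing.
            if (Directory.Exists(doomed)) Directory.Delete(doomed, true);
        });
    }

    // Replaces 'target' with 'source', retrying the final move in case the
    // old name has not been released yet.
    public static void ReplaceDirectory(string source, string target)
    {
        DeleteViaRename(target);
        Retry(() => Directory.Move(source, target));
    }
}
```

With this in place, the delete-then-move snippet from the earlier comment would become a single call such as `DirectoryReplace.ReplaceDirectory(source, target);`. Whether the rename step helps on a given Windows version is not something this sketch can guarantee; the retry loop is the fallback.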
Understood re not wanting to fix this in .NET. At the WinAPI level, it would be nice to know when this will eventually be fixed reliably - your comment that accompanied closing this issue mistakenly gave the impression that this has already happened. A link to an official promise / announcement / roadmap - if there is such a thing - would be great, so users can reliably tell when it will become safe to rely on code that assumes synchronous deletion.
It's only a bad idea with a filesystem that behaves asynchronously, and it is the latter I would consider a bad idea.
I don't think there is one and I can't find one. :(
Bad idea or not, that is the way NTFS has worked for 30 years. Some people actually rely on it working that way as a type of locking mechanism. :) Even outside of NTFS, delete/recreate isn't the most reliable pattern and should be avoided when it isn't strictly needed, as it can cause grief with multiple instances. Quite often I see this pattern with temporary files, which are much better suited to unique names.
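To illustrate the unique-names suggestion for temporary files and directories, here is a minimal sketch. `TempDirs` and `CreateUniqueTempDirectory` are hypothetical names, and the `"work-"` prefix is an arbitrary assumption; the point is only that a fresh name is generated each time instead of deleting and recreating a fixed path.

```csharp
using System;
using System.IO;

// Hypothetical helper class; not part of .NET.
static class TempDirs
{
    // Creates a new, uniquely named directory under the user's temp folder,
    // so a previously deleted path never has to be reused.
    public static string CreateUniqueTempDirectory(string prefix = "work-")
    {
        string path = Path.Combine(
            Path.GetTempPath(),
            prefix + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(path);
        return path;
    }
}
```

The caller still has to delete the directory when done, but a slow or pending delete no longer blocks the next run, because the next run uses a different name.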
Note: Starting with Windows 10 version
Note: cmd.exe's rd /s and PowerShell's Remove-Item are equally affected: see here and here.
Note: The problem occurs only on Windows (the problem also affects the "full" .NET Framework there, albeit with slightly different symptoms).
The Windows API functions DeleteFile() and RemoveDirectory() are inherently asynchronous (emphasis added):
System.IO.Directory.Delete() fails to account for this asynchronous behavior, which has two implications:
Problem (a): Trying to delete a nonempty directory (which invariably requires recursive deletion of its content first) can fail - infrequently, but it does happen, and the failure is quiet.
Problem (b): Trying to recreate a successfully deleted directory or a file immediately afterwards may fail - intermittently (easier to provoke than the other problem, but a more exotic use case overall).
Problem (a) is due to using depth-first recursion without accounting for the asynchronous deletion behavior; the problem, along with the solution, is described in this YouTube video (starts at 7:35).
Note that the source code does at least hint at the asynchronous behavior:
https://github.com/dotnet/corefx/blob/40364e539572e9dad9c8a2eb165fc9af28e5664a/src/System.IO.FileSystem/src/System/IO/FileSystem.Windows.cs#L532-L534
However, in practice the recursive removal of content does fail intermittently - and quietly - leaving both the target directory and remnants inside it behind.
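As an illustration of what accounting for the asynchronous behavior in a depth-first removal might look like, here is a sketch that retries removing each directory while its already-deleted children may still be lingering. It is not what Directory.Delete() does internally and not necessarily the solution from the video; `TolerantDelete` and `DeleteTree` are hypothetical names, and the attempt count and delay are arbitrary assumptions.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper class; not part of .NET.
static class TolerantDelete
{
    // Depth-first removal that tolerates children whose deletion has been
    // requested but has not completed yet.
    public static void DeleteTree(string dir, int maxAttempts = 20, int delayMs = 50)
    {
        foreach (string file in Directory.GetFiles(dir))
        {
            File.Delete(file);
        }

        foreach (string sub in Directory.GetDirectories(dir))
        {
            // Do not recurse into junctions / symbolic links.
            bool isLink = (File.GetAttributes(sub) & FileAttributes.ReparsePoint) != 0;
            if (isLink)
                Directory.Delete(sub, false); // intended to remove the link only
            else
                DeleteTree(sub, maxAttempts, delayMs);
        }

        // The directory may still appear non-empty for a short while.
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                Directory.Delete(dir, false);
                return;
            }
            catch (DirectoryNotFoundException)
            {
                return; // already gone
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

Unlike the quiet failure described above, this version rethrows the last IOException once the attempts are exhausted, so a lingering directory is at least reported. It makes no attempt to handle read-only files or files held open by other processes.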
Important: $HOME/tmpDir - remove it manually afterwards, if still present.
Steps to reproduce
Setup:
Problem (a):
Assert-ReliableDirRemoval .net
Problem (b):
Assert-SyncDirRemoval .net
Expected behavior
The functions should loop indefinitely (terminate them with Ctrl+C), emitting a . in each iteration.
Actual behavior
Eventually - and that there is no predictable time frame is indicative of the problem - an error will occur, on the order of minutes or even longer.
Problem (a):
That is, recursive removal of the target dir's content failed quietly due to async timing issues, and both the target directory and remnants inside it linger.
Problem (b):
That is, recreating the target dir failed, because the prior removal hadn't yet completed.
Aux. function definitions:
Functions Assert-ReliableDirRemoval and Assert-SyncDirRemoval
You can paste the entire code at the prompt in order to define these functions.
Environment data