HostWriter: ResourceUpdate fails non-deterministically #3832
Comments
@peterhuene can you take a look? |
0x8007006E means the system cannot open the file specified. I'm not entirely sure whether that's because the file doesn't exist or because of a sharing violation. My guess is the latter, since it's intermittent (perhaps something external to the build is interfering?). |
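As a rough illustration of how the codes quoted in this thread break down, the sketch below splits an HRESULT into its fields and names the Win32 error when the facility is Win32 (7). The helper names `decode_hresult` and `describe` are hypothetical, and the lookup table is a hand-picked subset covering only the codes mentioned in this discussion; it is not taken from the HostModel source.

```python
FACILITY_WIN32 = 7

# Hand-picked subset of Win32 error codes that come up in this thread.
WIN32_NAMES = {
    0x05: "ERROR_ACCESS_DENIED",
    0x21: "ERROR_LOCK_VIOLATION",
    0x6C: "ERROR_DRIVE_LOCKED",
    0x6E: "ERROR_OPEN_FAILED",
}


def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into failure flag, facility, and code."""
    hr &= 0xFFFFFFFF  # normalise values that arrive as negative signed ints
    return {
        "failure": bool(hr >> 31),        # top bit set means failure
        "facility": (hr >> 16) & 0x1FFF,  # which subsystem raised the error
        "code": hr & 0xFFFF,              # low 16 bits: the status code itself
    }


def describe(hr: int) -> str:
    """Return a readable name for the HRESULT where one is known."""
    fields = decode_hresult(hr)
    if fields["facility"] == FACILITY_WIN32:
        return WIN32_NAMES.get(fields["code"], f"Win32 error {fields['code']}")
    return f"facility {fields['facility']}, code {fields['code']}"


if __name__ == "__main__":
    for value in (0x8007006E, 0x80070005):
        print(hex(value), describe(value))
```

Read this way, 0x8007006E is a wrapped Win32 error 0x6E, and the 80070005 seen later in the build log is a wrapped Win32 error 0x05.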
We have a lot of tests that create and customize the apphost on Windows and I haven't seen flaky CI with this. I'll attempt to repro it. |
Anti-virus? |
I ran @DamianEdwards are you seeing this on a particular machine or multiple machines? |
@peterhuene was seeing this on only one machine so far, but haven't tried to repro it on my other machine. |
@DamianEdwards I'm still not able to reproduce this. The API call is As I don't have any MSIT-controlled Windows machines (and it's been a decade since I last did), I don't know what we're using for antivirus on corpnet these days. Would it be possible to check if it's the real-time protection of the antivirus software that's interfering with the build? |
I'm updating the |
This commit implements a parameterized retry count for creating the apphost. Like the `Copy` task from MSBuild, the `CreateAppHost` task now takes a parameter to specify the number of retries and the delay between retries (in milliseconds) to perform if the creation fails. This should help alleviate build failures when external processes are locking the intermediate apphost during a build (especially on Windows). Fixes devdiv#950462. Fixes dotnet/cli#11650.
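A minimal sketch of the idea in this commit message, written in Python rather than as the actual MSBuild task: an operation is attempted once and then retried up to a configurable number of times, with a configurable delay in milliseconds between attempts. The function name `run_with_retries`, its parameter names, and its defaults are assumptions for illustration; the real `CreateAppHost` task parameters are not shown in this thread.

```python
import time


def run_with_retries(action, retries=3, delay_ms=100, sleep=time.sleep):
    """Call action(); on OSError, wait delay_ms and try again, up to `retries` extra times.

    `sleep` is injectable so the behaviour can be checked without real waiting.
    """
    attempts = retries + 1  # the first try plus the configured retries
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the original failure
            sleep(delay_ms / 1000.0)
```

A caller would wrap the file-producing step, for example `run_with_retries(lambda: create_apphost(...), retries=500, delay_ms=100)`, where `create_apphost` stands in for whatever step may be blocked by an external process.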
I have a repro too during app building on blazor server app, but not consistent. I ran
|
Would it be possible to use procmon to see if we can figure out who is opening the file? I've never been able to reproduce the issue in a VM. |
good point, let me try several times more |
For preview 9, we'll retry the apphost creation as a mitigation for these file locking issues. I'm going to leave this issue open for now to see if we can get some idea of which external process is locking the apphost. |
Spent another half hour trying to repro. Saw it once more, but forgot to open procmon :( I'm starting to think it is an antivirus scan: once the file has been scanned, it no longer happens. |
@peterhuene I looked a bit more at the ResourceUpdater code in HostModel -- wrt whether/where retries should be added to ResourceUpdater code. Looks like there are only two places where a file is opened: Both the above functions take in a path and obtain a In particular, in the callstack described above, we see that the So, I don't think adding retries is the correct fix in this case. |
There is one scenario I can see where we may end up in this case: If we call The |
The error codes for lock-violation is |
@peterhuene @wli3 |
Reopened based on discussion above suggesting there's still something else here |
I'm getting this/a similar error with an ASP.NET Core 3 Preview-7 project, when I first opened it with Preview-8 in Visual Studio 2019:
1>------ Rebuild All started: Project: TechmemeRiver, Configuration: Release Any CPU ------
1>You are using a preview version of .NET Core. See: https://aka.ms/dotnet-core-preview
1>
1>Bundler: Cleaning output from bundleconfig.json
1>Bundler: Done cleaning output file from bundleconfig.json
1>
1>Bundler: Begin processing bundleconfig.json
1> Minified wwwroot/css/site.min.css
1> Minified wwwroot/js/site.min.js
1>Bundler: Done processing bundleconfig.json
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: The "CreateAppHost" task failed unexpectedly.
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: Microsoft.NET.HostModel.ResourceUpdater+HResultException: 80070005
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.NET.HostModel.ResourceUpdater.Update()
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.NET.HostModel.AppHost.HostWriter.CreateAppHost(String appHostSourceFilePath, String appHostDestinationFilePath, String appBinaryFilePath, Boolean windowsGraphicalUserInterface, String assemblyToCopyResorcesFrom)
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.NET.Build.Tasks.CreateAppHost.ExecuteCore() in /_/src/Tasks/Microsoft.NET.Build.Tasks/CreateAppHost.cs:line 35
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.NET.Build.Tasks.TaskBase.Execute() in /_/src/Tasks/Common/TaskBase.cs:line 47
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.Build.BackEnd.TaskExecutionHost.Microsoft.Build.BackEnd.ITaskExecutionHost.Execute()
1>C:\Program Files\dotnet\sdk\3.0.100-preview8-013656\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(359,5): error MSB4018: at Microsoft.Build.BackEnd.TaskBuilder.<ExecuteInstantiatedTask>d__26.MoveNext()
1>Done building project "TechmemeRiver.csproj" -- FAILED.
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========
Edit 1: After doing a "Clean" and "Rebuild", the error does not occur anymore:
1>------ Rebuild All started: Project: TechmemeRiver, Configuration: Release Any CPU ------
1>You are using a preview version of .NET Core. See: https://aka.ms/dotnet-core-preview
1>
1>Bundler: Cleaning output from bundleconfig.json
1>Bundler: Done cleaning output file from bundleconfig.json
1>
1>Bundler: Begin processing bundleconfig.json
1> Minified wwwroot/css/site.min.css
1> Minified wwwroot/js/site.min.js
1>Bundler: Done processing bundleconfig.json
1>TechmemeRiver -> E:\Dropbox\Beruf\Prog\TechmemeRiver\Source\TechmemeRiver\bin\Release\netcoreapp3.0\TechmemeRiver.dll
1>TechmemeRiver -> E:\Dropbox\Beruf\Prog\TechmemeRiver\Source\TechmemeRiver\bin\Release\netcoreapp3.0\TechmemeRiver.Views.dll
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========
I hope, it stays this way. |
@peterhuene any ideas? |
No new theories other than something like an antivirus program is interfering with writing out the resources. I've been unable to reproduce at all, but I don't use any such programs that might interfere in this manner. The above is a generic access denied error. I still believe that retry at the point of resource updater interaction in Ultimately I hope a future version of the SDK opens the apphost just once for customization, but that would require getting rid of the resource API usage (something we want to do to increase supported platforms for the operation anyway). |
This still happens randomly on release bits. Usually on build, either through VS or when a dotnet tool tries to compile an assembly. It has never happened twice in a row, though. |
I didn’t realize there were 500 retries possible. |
Yes it is 500 retries with 100 ms wait between each. |
Ok then it makes sense to filter the known useless-to-retry cases |
I think I’m OK with this plan if its lifecycle is limited to past releases and we do the better thing in 5 |
That's my thought @nguerrera. |
The other option is to retry after the known reported cases of 0x6E (Open failed) and 0x05 (Access Denied). |
This change attempts to fix a non-deterministic, customer-reported failure. Several customers have observed failures during resource update, when the HostModel updates the AppHost (to transfer resources from the managed app). The failure is not deterministic, is not reproducible on our machines, and depends on the specific computer and the software running on it. This indicates interference by other software while the HostWriter is updating the AppHost. The current implementation retries the resource update only if it fails because the device or drive is locked (say, by an antivirus): HRESULT 0x21 and 0x6C. However, the reported failures have error codes 0x5 (Access denied) and 0x6E (Open failed). The Windows/Defender team said that file locking is not expected to produce these error codes. However, different AVs examine files differently. We believe that the correct fix for this issue is to:
* Implement dotnet#3828 and dotnet#3829
* Ship the AppHost with an extension/permissions that do not indicate an executable.
However, the above is a fairly large change for servicing .NET Core 3.1. So this change implements a simpler fix intended for the 3.1 servicing branch: always retry the resource update on a Win32 error, unless the failure is a known irrecoverable code (a few error codes relevant to file I/O are listed). This change may cause unnecessary retries on legitimate failures (about 50 seconds). But such cases are rare, because the SDK supplies the apphost, and the HostModel itself creates the file to update. Fixes dotnet#3832
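The policy described in this commit message can be sketched roughly as follows, in Python rather than the HostModel's own code. Everything named here is an assumption for illustration: the `Win32Error` exception type, the helper names, and in particular the `NON_RETRYABLE` set, since the thread does not say which codes the real change treats as irrecoverable. The defaults of 500 retries and a 100 ms delay are the figures quoted earlier in the discussion.

```python
import time

# Hypothetical "give up immediately" set; the real list is not shown in this thread.
ERROR_FILE_NOT_FOUND = 0x02
ERROR_PATH_NOT_FOUND = 0x03
NON_RETRYABLE = {ERROR_FILE_NOT_FOUND, ERROR_PATH_NOT_FOUND}


class Win32Error(Exception):
    """Illustrative exception carrying a Win32 error code."""

    def __init__(self, code):
        super().__init__(f"Win32 error 0x{code:X}")
        self.code = code


def should_retry(code):
    """Retry on any Win32 error except those known to be irrecoverable."""
    return code not in NON_RETRYABLE


def update_with_retries(update, max_retries=500, delay_ms=100, sleep=time.sleep):
    """Run update(), retrying on recoverable Win32 errors with a fixed delay."""
    for attempt in range(max_retries + 1):
        try:
            return update()
        except Win32Error as err:
            if attempt == max_retries or not should_retry(err.code):
                raise
            sleep(delay_ms / 1000.0)


def worst_case_seconds(max_retries=500, delay_ms=100):
    """Total time spent waiting if every attempt fails with a retryable error."""
    return max_retries * delay_ms / 1000.0
```

With the defaults, `worst_case_seconds()` is 50.0, which matches the "about 50 seconds" of unnecessary retrying mentioned above for a legitimate failure that is not in the exclusion set.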
Is there a date when this will be released. |
looks like I found the prime suspect (at least in my case). It seems like Dropbox is interfering with the filesystem in a big way and therefore with |
@robkuz I think this fix will make it into the April servicing release. |
dotnet/runtime#3832 Building a WinExe with resources fails non-deterministically. Several customers have observed failures during resource update, when the HostModel updates the AppHost (to transfer resources from the managed app). The failure is not deterministic, is not reproducible on our machines, and depends on the specific computer and the software running on it. This indicates interference by other software while the HostWriter is updating the AppHost. The current implementation retries the resource update only if it fails because the device or drive is locked (say, by an antivirus): HRESULT 0x21 and 0x6C. However, the reported failures have error codes 0x5 (Access denied) and 0x6E (Open failed). The Windows/Defender team said that file locking is not expected to produce these error codes. However, different AVs examine files differently. We believe that the correct fix for this issue is to:
* Implement dotnet#3828 and dotnet#3829
* Ship the AppHost with an extension/permissions that do not indicate an executable.
However, the above is a fairly large change for servicing .NET Core 3.1. So this change implements a simpler fix intended for the 3.1 servicing branch: always retry the resource update on a Win32 error. While this may cause unnecessary retries on legitimate failures (e.g. a missing file), such cases are rare, because the SDK supplies the apphost, and the HostModel itself creates the file to update. Low dotnet/runtime#32347
I know this issue has been closed // a fix is planned. Thanks all, this resolved my issues 😄 |
I'm getting the same error or another saying the cloud operation failed when project is hosted in a OneDrive folder. |
|
Folks are still reporting this. We shouldn't be relying on retry as a solution here. There is a correct way for build tools to make file changes: only ever write using the very first created handle. Once a file is closed, only ever open for read. If you need to write again, make a copy maintaining the open handle and write to that handle. In this way you can never be broken by AV-interference. |
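One way to read the discipline described in this comment, as a small Python sketch and not the SDK's implementation: a file is written only through the handle that created it, and a later change is made by reading the finished file and writing a modified copy through a fresh creating handle. The function names `write_new_file` and `copy_and_modify` are hypothetical.

```python
import shutil


def write_new_file(path, data: bytes):
    """Create the file and write everything through that one handle.

    Mode "xb" fails if the file already exists, so the handle used for
    writing is always the handle that created the file.
    """
    with open(path, "xb") as handle:
        handle.write(data)


def copy_and_modify(src, dst, modify):
    """Produce a changed copy without reopening any closed file for write.

    The source is opened read-only. The destination is created here, and all
    writes (the copy plus the caller's changes) go through that same handle.
    """
    with open(src, "rb") as reader, open(dst, "x+b") as writer:
        shutil.copyfileobj(reader, writer)
        modify(writer)  # caller patches the copy while the handle is still open
```

For example, `copy_and_modify("apphost.exe", "MyApp.exe", patch)` would leave `apphost.exe` untouched and apply `patch` to the new file before it is ever closed; both file names here are placeholders.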
I believe this has been fixed by #48774 |
Steps to reproduce
dotnet new webapp
dotnet build
Expected behavior
No errors
Actual behavior
Randomly see the following error (twice in the last 20 builds just now):
Environment data
dotnet --info
output: