-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single file: Guard against partial cleanup of extracted files #32649
Conversation
This change fixes dotnet#3778 This change is mainly targeted to be servicing fix for .net core 3.1. When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. **Startup time impact** * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference
pal::string_t file_path = ext_dir; | ||
append_path(&file_path, entry.relative_path().c_str()); | ||
|
||
if (!pal::file_exists(file_path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I experimented using readdir()
/ FindFirstFile()/FindNextFile()
APIs in place of file_exists()
checks on each file.
That is, create a dictionary of files in the the extraction location (and its subdirectories), and check that it has all entries in the manifest.
However, there wasn't a significant difference in startup time due to the change -- especially since the startup impact is already small. Given this and the fact that .net 5 will focus on moving away from extraction, I kept the file_exists checks to keep the logic simple.
if (should_retry) | ||
{ | ||
trace::info(_X("Retrying Rename [%s] to [%s] due to EACCES error"), old_name.c_str(), new_name.c_str()); | ||
pal::sleep(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop will run for at most 500 times, waiting at most 50s before trying. This is also called by commit_file(), so this can take up to 50s per file in the bundle. It looks to me that this is a bit too much...
There might be other files in the bundle that need renaming, so maybe instead of retrying them here, add it to a queue to try later?
Maybe keep track of the total amount of time taken to rename these things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lpereira I agree that the worst case looks bad -- but it is extremely unlikely.
The long wait time can only happen if the rename succeeds close to the 500th iteration for each file. That is, when there is a real failure with EACCESS - the extraction aborts after the first file (after 50s).
The case when recovery is triggered is rare, and is not a path to be optimized. If the AV blocks for a long time after writing each file, then the recovery will need to wait for it. Still, it is not expected to take the full minute after each file is written.
While the wait time could be used to write other files that need to be recovered, the time gained is rather negligible, since the time to write these files is a few milliseconds, compared to the minute spent waiting for the AV.
I actually tried out writing the change you suggested. Please take a look at this diff. But I'm not sure it is work to take in that complexity in a servicing fix, given that this case is unlikely, and that such a case is hard to test. If you still think we need to take the change, I can add it on to the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a servicing fix and this case is rare, we can keep the current behaviour from this PR. However, keep the code from this diff somewhere, should we need it for the next servicing release (it looks pretty clean and makes the worst case scenario more manageable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lpereira.
Do you have any further comments/concerns on this PR?
Please help add exactly one area label if the bot does not. It teaches the bot. |
… files ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
… files ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
… files ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
… files ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
… files ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
@Mgamerz the change is approved for servicing but it not yet merged: dotnet/core-setup#9013 I believe it will be available in the May release. |
Thanks for fixing this! I deal with this issue right now and it took me a while to figure out where the files get extracted to. Deleting the temp folder fixed it for me but I cannot expect this from customers obviously. |
… files (#9013) ** Issue dotnet/runtime#3778 ** Customer Scenario Single-file apps rely on extractiong their contents to disk. The extaction is performed to a temp-directory by default, which may be cleaned up by the OS. The app can re-extract itself if the entire extraction is cleaned up -- but not partial deletions. Customers have noticed that for some apps, the extracted files were removed, but the extraction directory itself wasn't. This causes the app to fail to start. ** Problem When executing single-file apps, if a pre-existing extraction exists, it is assumed to be valid. ** Solution When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified. If files are missing, they are recovered. **Extraction Algorithm** `ExtractionDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>` `WorkingDir` = `$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>` If `ExtractionDir` does not exist, then Create `WorkingDir`, and extract bundle contents within it Attempt to rename `WorkingDir` as `ExtractionDir` If the rename succeeds, continue execution of the app If not, another process completed extraction; Verify and reuse this extraction (as in step 2). If `ExtractionDir` exists, then verify that it contains all files listed in the manifest. If all contents are available, Remove `WorkingDir` Continue execution of the app, reusing the contents. If certain files are missing within `ExtractionDir`, then For each missing file, do the following individually Extract the files within `WorkingDir` Create sub-paths within `ExtractionDir` if necessary Rename the file from `WorkingDir/path/<file>` to `ExtractionDir/path/<file>` unless `ExtractionDir/path/<file>` exists (extracted there by another process in the meantime) Remove `WorkingDir` Continue execution of the app. All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps. ** Startup time impact * Console HelloWorld execution time: * Framework dependent app: Windows/Linux No measurable difference * Self-contained app: Windows: ~10ms additional * Self-contained app: Linux: No measurable difference * Greenshot Startup: * Self-contained Windows: No noticiable/measurable difference * NugetPackageExplorer Startup: * Self-contained Windows: No noticiable/measurable difference ** Risk Medium. The change is well scoped, but isn't trivial. It affects all single-file apps. ** Master Branch dotnet/runtime#32649
With this fix, will it be possible for windows to partially clean up the extracted files while the app is still running, or does dot net guard against that somehow? |
@swaroop-sridhar Looking at this diff, it seems like there are no guards for partial cleanup while the app is running -- leading to potential crashes when the application attempts to load a DLL. We have deployed a WPF application using publish single file to a large number of people and have seen crashes that have occurred after app-start when the app tries to load dlls in certain code paths. Since we are experiencing unacceptable and ever-increasing crash rates (as more and more users get into bad states) we have to remediate this now; so we are going to use a launcher app that sets |
@DanielRBaird @josephdwyer : This fix only re-extracts files at startup. If the app is very long-running (typically several months) and needs to guard against missing files while running, I'd recommend setting DOTNET_BUNDLE_EXTRACT_BASE_DIR to a non-temp directory (as noted by @josephdwyer). .net 5 will fix this issue and enable single-file apps to run without the need for extracting files to disk. Thanks. |
@swaroop-sridhar thanks for the response, we will proceed with the launcher app. I just wanted to point out that this issue seems to occur much more quickly than ~months; ~15% of our users have run into issues with this within 2 weeks. Some users run into the issue multiple times after clearing the temp directory (presumably disk pressure feeds into how often windows will attempt to clear the temp directories). I am not sure what the breakdown is of crashes on start vs while running, but we have certainly seen some while the app was running -- just wanted to let you know in case this affects the priority of fixing that situation. |
This change fixes #3778
This change is mainly targeted to be servicing fix for .net core 3.1.
When executing single-file apps, if a pre-existing extraction exists, the contents of the extraction directory are now verified.
If files are missing, they are recovered.
Extraction Algorithm
ExtractionDir
=$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<bundle-id>
WorkingDir
=$DOTNET_BUNDLE_EXTRACT_BASE_DIR/<app>/<process-id>
If
ExtractionDir
does not exist, thenCreate
WorkingDir
, and extract bundle contents within itAttempt to rename
WorkingDir
asExtractionDir
If the rename succeeds, continue execution of the app
If not, another process completed extraction; Verify and reuse this extraction (as in step 2).
If
ExtractionDir
exists, then verify that it contains all files listed in the manifest.If all contents are available,
Remove
WorkingDir
Continue execution of the app, reusing the contents.
If certain files are missing within
ExtractionDir
, thenFor each missing file, do the following individually
Extract the files within
WorkingDir
Create sub-paths within
ExtractionDir
if necessaryRename the file from
WorkingDir/path/<file>
toExtractionDir/path/<file>
unlessExtractionDir/path/<file>
exists (extracted there by another process in the meantime)Remove
WorkingDir
Continue execution of the app.
All of the renames above are done with appropriate retries to circumvent interference from anti-virus apps.
Startup time impact