-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix of OutOfMemoryException when embedding large amount of files to binlog #8971
Conversation
Well, I missed somehow one failed test on my local machine. I will add at least one more commit, but the functionality is ready for review. EDIT: Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Please retarget to the vs17.7
branch.
eng/Versions.props
Outdated
@@ -2,7 +2,8 @@ | |||
<!-- Copyright (c) .NET Foundation and contributors. All rights reserved. Licensed under the MIT license. See License.txt in the project root for full license information. --> | |||
<Project> | |||
<PropertyGroup> | |||
<VersionPrefix>17.7.0</VersionPrefix><DotNetFinalVersionKind>release</DotNetFinalVersionKind> | |||
<VersionPrefix>17.7.0</VersionPrefix> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks wrong, both because you need to update the version prefix and because we generally put these two on the same line to force a merge conflict when flowing final version commits to main.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't matter for this PR because we'll be handling merging to the release branch specially since we have several things to bring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to completely remove change to this file from this PR - in order to have a clean merge of all the in-flight 17.7 PRs (Rainer was contemplating single octopus merge of all 17.7 PRs)
@@ -226,12 +227,25 @@ public void Shutdown() | |||
|
|||
if (projectImportsCollector != null) | |||
{ | |||
projectImportsCollector.Close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have to be up here instead of where it was before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that we first flush the remaining archive bits to disk in the temp folder and close the file, so that we can read it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Did you perfomr any measurments how this impacts build speed with binlog for small and mid-size projects? (I'd say dotnet new console
, dotnet new mvc
and OrchardCore can possibly be seen as representative samples)
If there is any measurable difference we should probably plan followup item where we optimize the approach to keep in-memory stream for sane projects. A sample of very primitive StreamProxy that dynamically uses either memory or fallbacks to file based on the size can be found here: https://github.com/dotnet/templating/blob/main/src/Microsoft.TemplateEngine.Core/Util/StreamProxy.cs
I expressed my concerns regarding to perf regression caused by switching back to temp file, but Kirill measured it and...
We can do our measurements, but I would say that most devs are not collecting binlog by default (during development). Binlogs are collected on dev machines usually when you want to diagnose some issue. |
Perfect! Sounds like we should be good |
Hello! I noticed that you're targeting one of our servicing branches. Please consider updating the version. |
Fixes #8595 by storing the embedded-file zip in a temporary directory (instead of memory or binlog target directory) to avoid problems with file watchers.
1773680
to
4cf11b4
Compare
Curious how did we miss #9057? Were the corrupt binlogs easy to repro? Are we missing a test? Did manual testing all look OK? |
They are easy to repro but the viewer is pretty tolerant of the problem; it loads without visible error and is just missing the embeds. I think I might have seen a log that was mysteriously missing embeds but I didn't connect it with this and I went back and forth between builds that did and didn't have the bug enough that I didn't see it coming. We definitely need better regression testing around this scenario, which is high on the list after figuring out how to get the fix out. |
Hmm, I've been thinking about surfacing first-chance exceptions in the viewer and flash some sort of user-visible warning, like a goldbar. This would have caught this. |
I suppose it doesn't help that the embeds archive is the last record in the file, so even if the stream is now corrupted and dis-synchronized the rest of the binlog is already read |
I'm focusing on the serialization problem and I will add test at least for the length of the payload. I believe I tried open binlogs in viewer and everything looked OK. Warning in binlog viewer would be nice feature. |
Work around bad binlogs produced by dotnet/msbuild#8971 by probing for the zip header that will be the known first bytes of the blob. Co-authored-by: MichalPavlik <mipavlik@microsoft.com>
I think a user-visible warning like this is great from the developer's (i.e., our) perspective but detrimental from a user's perspective. Most users won't know why it's flashing unless they intentionally look for an error afterwards, and even if they do, they probably won't know how to or have the inclination to fix it. I think this sort of things is great in a beta product to help get fast feedback but not ideal in a product as established as the binlog viewer, as most users at this point probably just want to finish their work and don't care if the viewer is broken unless it makes it harder for them to do so. |
Fixes #8595
Summary
Embedding of large amount of files to binlog can lead to OutOfMemoryException due to memory pressure. It also causes LOH fragmentation and Gen2 collections.
Customer Impact
Large builds cannot use the binary logger due to an OOM crash.
Regression?
Yes, in 16.8 (#5718).
Testing
Before:
After:
Risk
Medium--adds new filesystem dependencies which could fail, but is similar to the pre-16.8 mechanism and tries to be tolerant of failures.
Changes Made
This change reverts in-memory operations and uses temp file for embedded files. The file is now stored in temporary directory (instead of binlog target directory) to avoid problems with file watchers. There might be possible another optimizations, but goal of this PR is to unblock our partners. We can optimize further in separate PR.