Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ResourceContainer/AssemblyLoadEventHandler methods, remove allocs #9822

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

h3xds1nz
Copy link
Contributor

@h3xds1nz h3xds1nz commented Sep 20, 2024

Description

Optimizes two rather important startup methods in ResourceContainer class, the handler for runtime assembly load events - OnAssemblyLoadEventHandler and GetResourceManagerWrapper. This saves several milliseconds upon startup.

  • I retracted from using StringComparer.OrdinalIgnoreCase as it would be slower by ~60ns in the final solution.
  • For parsing the assembly name from safe input (RuntimeAssembly), I've re-used the Optimize retrieval of simple name from Assembly.FullName, fix up faulty methods #9739 and added a new method for Version/PublicKeyToken
  • I've originally used ArrayPool<char> for the unlikely event of needing an array (256+ chars) but seems redundant to initialize it this early, and we shall not really need to allocate an array anyways in 99% cases.
  • Using AlternativeLookup, this now becomes fully allocation-free given that usually only 1 entry is in the dictionary.
  • It is safe to assume that ToLowerInvariant produces same-length output, the own function's sanity check and several other dotnet runtime libraries (plus tons of user code) depend on this behavior of per-character folding.
  • Some of the comments (e.g. about GACed assemblies) could have been removed, but I've kept them in.
  • We seal the class for additional perf benefit as it is derived.

Do note that these benchmarks measure throughput and allocations (accurate), the savings are in milliseconds this early.
Allocations are dependent on presence of public key / assembly name length, this just illustrates general scenario.

Single throughput of AssemblyLoadHandler

Method assembly Mean [ns] Error [ns] StdDev [ns] Code Size [B] Gen0 Allocated [B]
Original Syste(...)7798e [89] 636.3 ns 6.57 ns 6.14 ns 5,880 B 0.0896 1504 B
PR__EDIT Syste(...)7798e [89] 121.6 ns 0.44 ns 0.39 ns 5,052 B - -

Standard app startup (28 libraries loaded)

Method assembly Mean [ns] Error [ns] StdDev [ns] Code Size [B] Gen0 Allocated [B]
Original Syste(...)7798e [89] 17,579.6 ns 151.20 ns 141.43 ns 8,030 B 2.6245 44352 B
PR__EDIT Syste(...)7798e [89] 3,395.5 ns 5.43 ns 4.81 ns 3,166 B - -

Customer Impact

Improved startup performance, decreased startup allocations.

Regression

No.

Testing

Local build, running few sample apps using resources and verifying functionality.

Risk

Low, I believe I didn't achieve any change of the behavior, even unintentionally.

Microsoft Reviewers: Open in CodeFlow

@h3xds1nz h3xds1nz requested review from a team as code owners September 20, 2024 09:29
@dotnet-policy-service dotnet-policy-service bot added PR metadata: Label to tag PRs, to facilitate with triage Community Contribution A label for all community Contributions labels Sep 20, 2024
Copy link

@edwardneal edwardneal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Does the benchmark handle the startup of an app with something bound to an embedded image resource? There's a call to Assembly.Load in the call stack of ResourceContainer.GetPart (GetPart -> GetPartCore -> GetResourceManagerWrapper -> BaseUriHelper.GetLoadedAssembly -> Assembly.Load) and it feels like this is going to be the largest performance killer in the worst-case scenario; I'm curious to see whether it's triggered under normal circumstances.

Comment on lines +276 to 277
Assembly assembly = BaseUriHelper.GetLoadedAssembly(assemblyName, assemblyVersion, assemblyToken);
if (assembly.Equals(Application.ResourceAssembly))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does speedscope highlight this as a hot path? BaseUriHelper.GetLoadedAssembly does quite a bit of work, and referencing a resource in an already-loaded assembly is quite common. If it's frequently called, comparing the assembly names might be faster than calling GetLoadedAssembly for every package part.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this is run only once and then stored in the cache of resource managers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speedscope, is that what the cool kids use now? Haha.

Anyways as @miloush says, this is usually only run once, you shall not make it past the lookup, I originally had string there for the key but changed it in the last commits as you rarely need to add an entry.

The method itself shall be optimized but I felt it doesn't need to be done in the scope of this PR, however any modification has to wait anyways. I currently have 60 open PRs and I'm waiting for them to be resolved so I can make more changes now because I'm effectively blocking myself in a lot of places to not overlap changes.

A bit frustrating but gotta live with the reality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently have 60 open PRs and I'm waiting for them to be resolved so I can make more changes now because I'm effectively blocking myself in a lot of places to not overlap changes.

A bit frustrating but gotta live with the reality.

Please post the PRs which are blocking (causing overlap), We will try to prioritize those. The next CTP has 5 of your PRs lined up. Agree with the sentiment that this is a slow pace and frustrating, however the test passes take up time, and with the .NET 9 releases and other deliverables, the max we can get is 8 PRs a month.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhashish-wpf would it be possible to run the tests on a batch of PRs together? Or perhaps @h3xds1nz could do a bigger PR with all the picked commits?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to batch in 10. The problem is we want the haystack to be smaller in case of test failures, otherwise we could have taken all of them together.

Copy link
Contributor Author

@h3xds1nz h3xds1nz Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhashish-wpf Thank you, that would be very lovely if we could prioritize. I understand the amount of tests run is rather large for WPF, it takes time (was it even a few days?) and you're trying to minimize the number so it is easier to pinpoint in case of encountering case failures. Nonetheless, I'm very glad you're taking the time for the discussion here.

As per the priorities, this would help a bunch:

  1. Reduce ThicknessConverter allocs to minimum and improve conversion performance #9363
  • This will unblock me on rather a big PR in terms of allocations with most converters (but small in code changes).
  1. Optimize conversion of MouseAction enum from/to string, remove allocations #9676 / Optimize conversion of Key enum from/to string, reduce allocations #9697 and Optimize conversion of ModifierKeys enum from/to string, reduce allocations #9683
  • This will unblock optimizations of Key/MouseGesture converters as a result.
  1. Replace Marshal.SizeOf(...) calls for primitive types with sizeof(...) #9220
  • I keep constantly hitting on code with changes I made here.
  1. Use MemoryExtensions.Split over string.Split to decrease allocations/improve perf #9600
  1. Improve start-up time by not figuring out NetFX version for AppContext switches #9711
  • Here I'd mostly like to know ahead if we're gonna go that direction, because then I could add the dead code removal that will end up as a result directly into the PR. The save of that is like 12ms during startup.
  1. Replace occurrences of params T[] with params ReadOnlySpan<T> where possible #9481
  • This would be nice to have faster for further optimizations on the call-chains I had in mind.
  1. Remove CriticalHandle properties #9292
  • This PR is from Thomas but would help a great deal to be squeezed in.
  1. Replace SecurityHelper.AreStringTypesEqual with string.Equals for code quality #9521
  • This directly makes a modification in the BaseUriHelper for example. The goal is to get rid of the CAS helpers as it should have been done few years ago before your team has stepped into WPF.

I'd also have one question - since I've been mostly writing tests for all the changes, should I be adding them in? My issue is that currently it is hard to not duplicate some tests from wpf-test (any movement on the intended move of the wpf-test repo into main repo as per the discussion in #9401?) and few assemblies don't even have Test projects, the assemblies would require InternalsVisibleTo as use of UnsafeAccessors/reflection ain't the best idea, I know some DRTs still depend on it though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @h3xds1nz for the list. I will get these added in the next CTP. cc:// @harshit7962

If you are writing unit tests, we still have some in the System.Xaml area and others, you can add them in this repo only. If you want to leverage the DRT setup, you can do it on wpf-test repo. We are looking for better ways to have a common place and an integrated testing infra. Tests at both the places are validated before PR merge, so as of now submitting at any place is fine. The gist of the discussion is hinting towards movement of the tests from wpf-test repo, which is going to take up time/effort, also we ideally don't want to keep following the old testing infra as that has execution challenges and learning curve for contributors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhashish-wpf Thanks a bunch. Yeah not keen on the DRT setup, thanks for the guidance regarding unit tests.

Happy to hear that things are moving in some direction :)

@JeremyKuhne
Copy link
Member

@adamsitnik wanted to point this out to you as it is doing some assembly name parsing in a hot path for WPF.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community Contribution A label for all community Contributions Included in test pass PR metadata: Label to tag PRs, to facilitate with triage Status:Completed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants