Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RAR] Don't do I/O on SDK-provided references #8688

Merged
merged 7 commits into from
May 5, 2023

Conversation

ladipro
Copy link
Member

@ladipro ladipro commented Apr 21, 2023

Fixes #8634

Context

For SDK/workload references, the SDK is currently already passing relevant metadata such as AssemblyVersion and PublicKeyToken, so there is no need for RAR to open these files and parse their .NET metadata tables to get this information. Also, barring corrupted installs, these files are guaranteed to exist on disk so there is no need for RAR to do any I/O on them (file-exists, get-time-stamp, ..)

Changes Made

Tweaked RAR to trust the item metadata passed by the SDK. The feature is gated on the presence of the FrameworkReferenceName metadatum which is not documented and is used only by the relevant SDK tasks, to my best knowledge.

SDK does not specify the Culture component of assembly names so it's assumed to be neutral. This has passed all relevant validation so far. If non-neutral assemblies are ever used as references, we'll need to define a metadatum for SDK to set.

Testing

  • Existing and new unit tests.
  • Experimental insertion with an assert that assembly names calculated from item metadata are equivalent to those extracted from assembly files (verifies that the Culture=neutral assumption is correct).
  • Checked all assemblies specified in all FrameworkList.xml files shipped in the SDK (+ workloads) and verified that all of them are culture neutral.

Perf results

RAR micro-benchmark where the task is invoked with parameters corresponding to building a simple AspNetCore app:

  • 13.13 ms -> 3.27 ms per invocation without StateFile cache.
  • 15.03 ms -> 5.13 ms per invocation with StateFile cache.

RAR task duration as reported with /clp:PerformanceSummary when building a simple AspNetCore app with MSBuild server enabled:

  • 20 ms -> 10 ms.

Notes

  • The change is behind a 17.8 changewave.
  • No changes have been made to per-project caching (via the StateFile parameter) to reduce scope. Changes to the per-project cache file will be done in another PR.

@ladipro ladipro marked this pull request as ready for review April 21, 2023 10:00
@rokonec
Copy link
Member

rokonec commented Apr 27, 2023

There were concern about this changes and Runtime build
I have tried to build Runtime repo on Linux with no issues. I have tested just development inner loop build.
Building and testing on CI might still surface some issues, after insertion of these changes and its propagation to Installer, we shall do experimental insertion into Runtime CI.

Copy link
Member

@Forgind Forgind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really excited to see this in action!

src/Framework/ChangeWaves.cs Outdated Show resolved Hide resolved
src/Framework/ChangeWaves.cs Outdated Show resolved Hide resolved
@@ -48,6 +49,7 @@ internal static class AssemblyResolution
string sdkName,
string rawFileNameCandidate,
bool isPrimaryProjectReference,
bool isImmutableFrameworkReference,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're currently saying this is only for immutable framework references, but do you think we might extend that later? If so, maybe we should use a somewhat more generic name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say what's interesting is not the immutability, but that we can assume its dependency closure is provided by the framework so don't need to explore it--right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair...fullyUnderstoodReferences? Better name with that intent?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say what's interesting is not the immutability, but that we can assume its dependency closure is provided by the framework so don't need to explore it--right?

Not quite right. The property of "dependency closure is provided" is signaled with the ExternallyResolved metadatum. It is a pre-existing behavior that for such assemblies we skip the dependency walk. It is happening here:

if (!reference.ExternallyResolved || FindDependenciesOfExternallyResolvedReferences)
{
// Look for dependent assemblies.
if (_findDependencies)
{
FindDependenciesAndScatterFiles(reference, newEntries);
}
}

This new flag means "the file is guaranteed to exist and never change" and in the Resolver hierarchy it's used only in RawFilenameResolver to skip the file existence check.

As for the name, I was on the fence between "immutable" and "framework" so I used both 😀

src/Tasks/AssemblyDependency/ReferenceTable.cs Outdated Show resolved Hide resolved
src/Framework/ChangeWaves.cs Outdated Show resolved Hide resolved
@@ -48,6 +49,7 @@ internal static class AssemblyResolution
string sdkName,
string rawFileNameCandidate,
bool isPrimaryProjectReference,
bool isImmutableFrameworkReference,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say what's interesting is not the immutability, but that we can assume its dependency closure is provided by the framework so don't need to explore it--right?

src/Tasks/AssemblyDependency/ReferenceTable.cs Outdated Show resolved Hide resolved
@ladipro
Copy link
Member Author

ladipro commented May 4, 2023

I have addressed feedback and run another experimental insertion. It flagged several major improvements in, quite surprisingly, the add-new-project scenario (-30% reference set, for example).

@rokonec noticed that we still open SDK assemblies to read their metadata version ("runtime version"), which doesn't look necessary. I made GetAssemblyRuntimeVersion return the hardcoded v4.0.30319 to remove this I/O, with the following justification:

  • None of the assemblies currently distributed with the SDK/workloads appear to have anything else than this canonical version.
  • Daniel is not aware of an issue with this assumption.
  • The metadata parsing code has a bug that prevents it from detecting WinRT/winmd files, which would have been the only thing potentially broken by the change ([Bug]: Core version of RAR fails to read WinRT runtime version from assemblies #8731). I.e. it doesn't work anyways.

@Forgind @rainersigwald, please review. Merging this PR will unblock further RAR perf work.

@Forgind Forgind added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label May 5, 2023
@Forgind Forgind merged commit 280393a into dotnet:main May 5, 2023
Forgind pushed a commit that referenced this pull request May 9, 2023
Now that #8688 is in, we no longer get value from the SDK pre-cache. The cost of reading the cache is >50 ms per project, which typically makes it the most expensive part of RAR.

Trace from MSBuild command line build:
@ladipro ladipro deleted the rar-no-sdk-io branch May 10, 2023 06:52
JaynieBai pushed a commit that referenced this pull request Jun 12, 2023
Contributes to #8635

Context
With #8688 we no longer need to cache data on immutable framework files on disk as obtaining this information is cheap and does not require I/O.

Changes Made
Made SystemState (the RAR cache) not add such files to the per-instance dictionary. This dictionary is serialized to disk as <project-file>.AssemblyReference.cache so the change effectively makes the cache file smaller, saving on serialization and deserialization time.

Also refactored DeserializeCache into a generic method instead of taking a Type parameter.

For a simple ASP.NET Core app with one project reference and one package reference, the size of the cache file is reduced from 142 kB to 2 kB. Projects with no references other than the SDK will not have the cache file created at all.

Testing
Existing unit tests, manual measurements.

Notes
The actual "Don't read the file if the data is already in memory" change will be in a separate upcoming PR.
ladipro added a commit that referenced this pull request Jul 10, 2023
### Context

Low-hanging fruit is showing in RAR performance profiles.

### Changes Made

1. Avoided constructing `AssemblyName` on a hot path as the constructor makes expensive Fusion calls on .NET Framework. The problematic code was introduced in #8688.

2. Added a metadata bulk-set operation to the internal `IMetadataContainer` interface. Calling `SetMetadata` for more than a couple of metadata is slow if `ImmutableDictionary` is used as its backing storage. RAR is heavy on metadata manipulation and switching to the new operation saves about 10% of RAR run-time when building OrchardCore. 

### Testing

Existing and new unit tests. Measured the perf impact by building OC.

---------

Co-authored-by: Rainer Sigwald <raines@microsoft.com>
YuliiaKovalova pushed a commit to YuliiaKovalova/msbuild that referenced this pull request Jul 18, 2023
### Context

Low-hanging fruit is showing in RAR performance profiles.

### Changes Made

1. Avoided constructing `AssemblyName` on a hot path as the constructor makes expensive Fusion calls on .NET Framework. The problematic code was introduced in dotnet#8688.

2. Added a metadata bulk-set operation to the internal `IMetadataContainer` interface. Calling `SetMetadata` for more than a couple of metadata is slow if `ImmutableDictionary` is used as its backing storage. RAR is heavy on metadata manipulation and switching to the new operation saves about 10% of RAR run-time when building OrchardCore. 

### Testing

Existing and new unit tests. Measured the perf impact by building OC.

---------

Co-authored-by: Rainer Sigwald <raines@microsoft.com>
JaynieBai pushed a commit that referenced this pull request Jul 27, 2023
#8916 broke some .NET Framework scenarios and was reverted. This PR is a redo of the change with the bug fixed. It turned out that the destination of the optimized CopyMetadataTo may be a transparent proxy, typically a TaskItem object living in another appdomain, which does not work well with Linq. The fix and the test coverage are in their own commits.

Context
Low-hanging fruit is showing in RAR performance profiles.

Changes Made
Avoided constructing AssemblyName on a hot path as the constructor makes expensive Fusion calls on .NET Framework. The problematic code was introduced in [RAR] Don't do I/O on SDK-provided references #8688.

Added a metadata bulk-set operation to the internal IMetadataContainer interface. Calling SetMetadata for more than a couple of metadata is slow if ImmutableDictionary is used as its backing storage. RAR is heavy on metadata manipulation and switching to the new operation saves about 10% of RAR run-time when building OrchardCore.

Testing
Existing and new unit tests. Measured the perf impact by building OC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RAR: Obtain assembly names from the SDK
4 participants