Cached interface dispatch for coreclr #111771

Merged

davidwrighton commented Jan 24, 2025

Enabling cached interface dispatch as an option for CoreCLR (should reduce memory usage and remove RWX pages, at the cost of reduced performance).

  • Current implementation is only enabled in release builds for Apple platforms with restrictions on code generation
  • On Debug/Checked builds for X64/Arm64 platforms, the feature can be enabled by setting the DOTNET_UseCachedInterfaceDispatch environment variable to 1. (NOTE: Enabling this feature requires a processor that supports 128-bit compare-and-swap, which has implications for Linux X64 builds, and would have implications for LoongArch/RISC-V if we enabled the code there.)
  • The strategy is to re-use the existing VirtualCallStubManager infrastructure for all non-code-generation driven lookups, but to replace the stub generation logic with the CachedInterfaceDispatch paths from NativeAOT.
  • In addition, to support this, we need to extend the size of a dispatch cell embedded in R2R images, so various parts of that logic can now generate double-pointer-aligned dispatch cells when requested. Infrastructure to set the right behavior when targeting Apple platforms has not yet been implemented, although the general-purpose support is in place.
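
To make the idea in the bullets above concrete, here is a minimal, single-threaded sketch of dispatch through a cached two-pointer cell. It is not the CoreCLR or NativeAOT code: every name in it (DispatchCell, Cache, ResolveSlow, InitialLookup, CachedLookup, Dispatch) is hypothetical, and the assumption that the cell pairs a helper pointer with a cache pointer is made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins; none of these names are CoreCLR or NativeAOT identifiers.
using TypeId = std::uintptr_t;
using Target = int (*)();

struct Cache {
    std::unordered_map<TypeId, Target> entries;  // type of 'this' -> resolved method
};

// A call-site cell made of two pointer-sized fields, aligned to its own size
// so that one double-width atomic operation could replace both fields at once.
struct alignas(2 * sizeof(void*)) DispatchCell {
    Target (*lookup)(DispatchCell&, TypeId);  // dispatch helper currently in use
    Cache* cache;                             // nullptr until the first resolve
};
static_assert(sizeof(DispatchCell) == 2 * sizeof(void*), "cell is two pointers");

static int g_slowPathCalls = 0;

int CatSpeak() { return 1; }
int DogSpeak() { return 2; }

// Slow path: stands in for the full lookup a runtime performs on a cache miss.
Target ResolveSlow(TypeId type) {
    ++g_slowPathCalls;
    return type == 1 ? CatSpeak : DogSpeak;
}

// Helper used once the cell has a cache: a hit avoids the slow path entirely.
Target CachedLookup(DispatchCell& cell, TypeId type) {
    auto it = cell.cache->entries.find(type);
    if (it != cell.cache->entries.end()) return it->second;
    Target target = ResolveSlow(type);
    cell.cache->entries.emplace(type, target);
    return target;
}

// Helper installed in a fresh cell: creates the cache and swaps the cell over.
Target InitialLookup(DispatchCell& cell, TypeId type) {
    assert(cell.cache == nullptr);
    // Both fields change together. A multi-threaded runtime would need to
    // publish the pair atomically; this sketch simply assigns. The cache is
    // never freed here, for brevity.
    cell = DispatchCell{CachedLookup, new Cache()};
    return CachedLookup(cell, type);
}

// What a call site does: run whichever helper the cell holds, then call the target.
int Dispatch(DispatchCell& cell, TypeId type) {
    return cell.lookup(cell, type)();
}
```

A cell starts as `DispatchCell{InitialLookup, nullptr}`. Dispatching twice on the same type through that cell reaches ResolveSlow only once; no code is generated at run time, which is the property that matters on platforms that restrict code generation.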

Known issues addressed before making a non-draft PR

  • Env flag for swapping between cached interface dispatch and VSD when both features are enabled in the code
  • Testing of normal scenarios
  • Testing of diagnostic scenarios (No regressions in VSD mode, Cached Interface Dispatch also successfully steps)
  • Consider enabling the resolve cache for cached interface dispatch scenarios (will not do until perf results show that the current scheme is slow)
  • Enable support for R2R with cached interface dispatch
  • Make the Indirection cell size 2 pointers instead of 4
  • Free dispatch cache infrastructure on collectible assembly destruction
  • Allocate cache chunks with LoaderHeap instead of malloc
  • Move PalInterlockedCompareExchange128 to the PAL or minipal
  • Actually correct construction of indirection cell for virtual dispatch on vtables
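
The first item in the list above, an environment flag that switches between the two dispatch schemes, could look roughly like the sketch below. This is an illustration under assumptions, not the runtime's configuration code: the parsing rule (only the literal "1" enables the feature), the fallback to VSD when 128-bit compare-and-swap is unavailable, and the names DispatchMode, IsFlagEnabled, SelectDispatchMode and SelectDispatchModeFromEnvironment are all hypothetical. Only the variable name DOTNET_UseCachedInterfaceDispatch comes from the PR description.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical names throughout; this is not how CoreCLR reads its configuration.
enum class DispatchMode { VirtualStubDispatch, CachedInterfaceDispatch };

// Simplified parse: only the literal string "1" turns the feature on.
bool IsFlagEnabled(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Pure decision function, kept separate from the environment so it can be tested.
// Assumption: without 128-bit compare-and-swap the sketch falls back to VSD.
DispatchMode SelectDispatchMode(const char* flagValue, bool has128BitCompareAndSwap) {
    if (IsFlagEnabled(flagValue) && has128BitCompareAndSwap) {
        return DispatchMode::CachedInterfaceDispatch;
    }
    return DispatchMode::VirtualStubDispatch;
}

// Thin wrapper that reads the variable named in the PR description.
DispatchMode SelectDispatchModeFromEnvironment(bool has128BitCompareAndSwap) {
    return SelectDispatchMode(std::getenv("DOTNET_UseCachedInterfaceDispatch"),
                              has128BitCompareAndSwap);
}
```

Keeping the decision in a function that takes the flag value as a parameter makes both modes easy to exercise in tests without touching the process environment.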

…te that this requires adding the -mcx16 switch to clang, so that cmpxchg16b instruction gets generated, which is an increase in the baseline CPU required by CoreCLR on Linux, and isn't likely to be OK for shipping publicly
…veAOT cached interface dispatch implementation (as it isn't actually used)

Update IsIPinVirtualStub to check the AVLocations, not the stub entry points
davidwrighton reopened this Jan 24, 2025
- Enable generating double pointer indirection cells in R2R files using
  command line switch.
- Fix VTableOffset calculation
- Add logic in ExternalMethodFixupWorker to handle the double pointer
  indirection cells.
}

MethodDesc *pTargetMD = COMDelegate::GetMethodDescForOpenVirtualDelegate(delegateObj);
pSDFrame->SetFunction(pTargetMD);

If pSDFrame is &frame why are we indirecting through the pointer instead of using frame directly?

This is due to historical reasons. Until recently (3 weeks ago), the frame was FrameWithCookie<StubDispatchFrame> and not just StubDispatchFrame. So the StubDispatchFrame * pSDFrame = &frame was used so that on every access we don't need to do casting. It isn't needed anymore.
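
A small sketch of the pattern described in this reply, using simplified stand-in types rather than the real runtime classes. The wrapper layout, the Inner() accessor, and the OldStyle/NewStyle functions are assumptions made for illustration; only the type names FrameWithCookie and StubDispatchFrame, the pSDFrame alias, and the SetFunction call appear in the review thread.

```cpp
// Simplified stand-in for the frame type; the real class has far more state.
struct StubDispatchFrame {
    const void* function = nullptr;
    void SetFunction(const void* methodDesc) { function = methodDesc; }
};

// Assumed shape of the older wrapper: a guard value stored next to the frame.
template <typename FrameT>
struct FrameWithCookie {
    unsigned long cookie = 0xC00C1E;
    FrameT frame;
    FrameT* Inner() { return &frame; }  // hypothetical accessor
};

static int g_methodDesc = 0;  // placeholder for a MethodDesc

// With the wrapper, taking one typed pointer up front avoids unwrapping at every use.
bool OldStyle() {
    FrameWithCookie<StubDispatchFrame> frame;
    StubDispatchFrame* pSDFrame = frame.Inner();
    pSDFrame->SetFunction(&g_methodDesc);
    return frame.Inner()->function == &g_methodDesc;
}

// Without the wrapper, the alias adds nothing: the frame can be used directly.
bool NewStyle() {
    StubDispatchFrame frame;
    frame.SetFunction(&g_methodDesc);
    return frame.function == &g_methodDesc;
}
```

Both functions have the same effect on the frame, which is why the alias can be dropped once the wrapper is gone.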

davidwrighton merged commit 9296e04 into dotnet:main Mar 5, 2025
118 of 120 checks passed

am11 commented Mar 5, 2025

@davidwrighton is it supposed to exclude riscv64 and loongarch64? Test build has started to break:

ld.lld : error : undefined symbol: RhpVTableOffsetDispatchAVLocation [/runtime/src/tests/nativeaot/GenerateUnmanagedEntryPoints/GenerateUnmanagedEntryPoints.csproj] [/runtime/src/tests/build.proj]

@MichalStrehovsky

I think I have a fix in #113179

8 participants