Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAOT/ARM] Avoid generating linker range extension thunks #99208

Merged
merged 2 commits into from
Mar 11, 2024

Conversation

filipnavara
Copy link
Member

@filipnavara filipnavara commented Mar 3, 2024

Contributes to #99079
Ref: #99079 (comment)

Prevent the native linker from generating range extension thunks inside the managed code section. The unwinding code cannot deal with them and the short relocations limit the maximum code size.

@ghost
Copy link

ghost commented Mar 3, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Includes #99207, only the second commit belongs to this PR
Ref: #99079 (comment)

Assume everything but import thunks needs long jumps. Note that this doesn't necessarily handle import of methods that are also marked as RuntimeExport. These may still internally get resolved and cause the thunks to be produced.

Author: filipnavara
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

return (ushort)RelocType.IMAGE_REL_BASED_THUMB_BRANCH24;
#else
var targetObject = HandleToObject(target);
if (targetObject is ExternSymbolNode)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I would love to use something like targetObject is ExternMethodSymbolNode or targetObject is ExternSymbolNode and not RuntimeImportMethodNode. Unfortunately, I cannot, because multiple C library calls are imported as InternalCall. That's the case at least for memmove and the float/double math functions like ceil.

@MichalStrehovsky
Copy link
Member

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@am11 am11 added the arch-arm32 label Mar 5, 2024
@filipnavara
Copy link
Member Author

Can I get a re-run, please? ;)

@MichalStrehovsky
Copy link
Member

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@filipnavara
Copy link
Member Author

Looks like a first clean pass of System.Runtime.Tests in a while. (Let's not jinx it, hopefully the Pri0 tests will pass as well.)

Once #99207 is merged I will rebase this and update the PR description.

@@ -4018,8 +4018,10 @@ private ushort getRelocTypeHint(void* target)
case TargetArchitecture.X64:
return (ushort)RelocType.IMAGE_REL_BASED_REL32;

#if READYTORUN
case TargetArchitecture.ARM:
Copy link
Member

@jkotas jkotas Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much perf does this leave on the table?

Would it be better to detect and skip the linker generated thunks in the code manager at runtime?

Copy link
Member Author

@filipnavara filipnavara Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the measurements for performance but there are several things to consider:

  • The thunks may appear in front of the managed code section or at the end. Either way it can only cover a limited range (+- 16MB) and we would still get a failure if this range is exceeded. Some of the tests are not far away from hitting such limit.
  • Detecting the thunks is non-trivial. The exception unwinding table is indexed only by method start address and does not contain the end address or method length. It's assumed that the next method follows the previous one. To detect thunks inserted at the beginning of the managed code section we could look for the lowest address of unwinding info within the section. Detecting thunks at the end of the managed code requires inserting an extra "cantunwind" marker at the end of our managed code to get an idea where the inserted thunks start (unless the linker moves them to a separate synthetic section which happened in my earlier experiment). The linker doesn't generate any unwinding table information for the thunks.
  • Inserting custom label symbols to mark the beginning+end of our managed code wouldn't work when using the multi-object-file mode. Likewise, in that mode the thunks could appear in the middle of the section when multiple object files are merged. That may still be solvable with the "cantunwind" end marker but it's noticably harder to reason about.
  • CoreCLR doesn't seem to send the relocation hint to JIT either. It doesn't ask for relocatable code so the sequence is slightly shorter (2 bytes iirc) but it still doesn't use the short BL as far as I can tell.

Given all the considerations I would prefer to have a correct baseline and optimize later.

@filipnavara filipnavara marked this pull request as ready for review March 5, 2024 20:17
@filipnavara filipnavara changed the title [NativeAOT/ARM] Avoid generating (most) thunks [NativeAOT/ARM] Avoid generating linker range extension thunks Mar 5, 2024
@MichalStrehovsky MichalStrehovsky merged commit 9c828a1 into dotnet:main Mar 11, 2024
110 checks passed
@filipnavara filipnavara deleted the nativeaot-looong-branch branch March 11, 2024 11:51
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants