Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit GC info into a COMDAT section #83371

Merged
merged 4 commits into from
Apr 25, 2023

Conversation

MichalStrehovsky
Copy link
Member

Saves 3.14% on BasicWebApi. 🤯

We run linker with COMDAT folding enabled. Generating this data into COMDAT foldable section allows linker to deduplicate these.

Cc @dotnet/ilc-contrib

Saves 3.14% on BasicWebApi. 🤯

We run linker with COMDAT folding enabled. Generating this data into COMDAT foldable section allows linker to deduplicate these.
@ghost
Copy link

ghost commented Mar 14, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Saves 3.14% on BasicWebApi. 🤯

We run linker with COMDAT folding enabled. Generating this data into COMDAT foldable section allows linker to deduplicate these.

Cc @dotnet/ilc-contrib

Author: MichalStrehovsky
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

ObjectNodeSection section = ObjectNodeSection.XDataSection;
if (ShouldShareSymbol(node))
section = GetSharedSection(section, _sb.ToString());
ObjectNodeSection section = GetSharedSection(ObjectNodeSection.XDataSection, _sb.ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like LLVM defines a number of different SelectionKinds for COMDAT sections, but I can't see where define any. The LLVM ones look like this:

any
exactMatch
largest
noDuplicates
sameSize

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason I'm asking is because it sounds like we would probably only want "any" or "noDuplicates"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears to do the right thing on Windows, but I haven't checked Linux (hoped I would see it in the CI results).

If it doesn't work on Linux, it's likely because of #77491. I'd back out the non-Windows portion of this change and file a bug to reap this benefit outside Windows in 8.0 (it would be more work than just a one-liner that I randomly tried).

@MichalStrehovsky
Copy link
Member Author

Yeah, looks like it doesn't work on Linux.

Before:

/datadisks/disk1/work/A211088F/p/nativeaottest.sh /datadisks/disk1/work/A211088F/w/B4D609BE/e/nativeaot/SmokeTests/HardwareIntrinsics/X64Baseline/ X64Baseline.dll ''
****************************************************
* Size test                                        *
* Size of the executable is   1,916 kB             *
****************************************************

After:

/datadisks/disk1/work/A63308E9/p/nativeaottest.sh /datadisks/disk1/work/A63308E9/w/AEDE099C/e/nativeaot/SmokeTests/HardwareIntrinsics/X64Baseline/ X64Baseline.dll ''
****************************************************
* Size test                                        *
* Size of the executable is   1,916 kB             *
****************************************************

For comparison, on Windows:

call C:\h\w\B28909D9\p\nativeaottest.cmd C:\h\w\B28909D9\w\C78F0A4F\e\nativeaot\SmokeTests\HardwareIntrinsics\X64Baseline\ X64Baseline.dll 
****************************************************
* Size test                                        *
* Size of the executable is   1,835 kB             *
****************************************************
call C:\h\w\B6CE09A2\p\nativeaottest.cmd C:\h\w\B6CE09A2\w\A54E08F8\e\nativeaot\SmokeTests\HardwareIntrinsics\X64Baseline\ X64Baseline.dll 
****************************************************
* Size test                                        *
* Size of the executable is   1,809 kB             *
****************************************************

@MichalStrehovsky
Copy link
Member Author

Rolled back the Linux portion and file #83375 for tracking :(.

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@am11
Copy link
Member

am11 commented Mar 14, 2023

Rolled back the Linux portion and file #83375 for tracking :(.

Seems like -Wl,--icf=all and -Wl,--icf=safe etc. do shave off 10.24 KB from hello-world-app on linux-arm64 (using daily build). Unfortunately, bfd linker doesn't support --icf option yet, but LLVM linker -fuse-ld=lld (activated by -p:UseLlvmLinker=true) and Gold linker -fuse-ld=gold support it. Gold linker is better on linux anyway; it's faster, consumes less memory and generates slimmer apps (hello-world-app published with ld.bfd (the default) on linux-arm64: 1918984 bytes while ld.gold: 1853632 bytes). We should probably prefer gold anyway, when it's available.

Introspection in Microsoft.NETCore.Native.Unix.targets will look like this:

  <![CDATA[
  Testing:
  
    if ! cc -fuse-ld=gold -Wl,--icf=all 2>&1 | grep -q unrecognized; then
        echo "yes, gold linker is available"
    fi
  ]]>

  <Exec Command="! &quot;$(CppLinker)&quot; -fuse-ld=gold -Wl,--icf=all 2&gt;&amp;1 | grep -q unrecognized" 
        Condition="'$(UseLlvmLinker)' != 'true' and '$(_targetOS)' == 'linux'"
        IgnoreExitCode="true" 
        StandardOutputImportance="Low">
    <Output TaskParameter="ExitCode" PropertyName="_GoldLinkerSupportedExitCode" />
  </Exec>

  <ItemGroup>
    <LinkerArg Include="-fuse-ld=gold" Condition="'$(_GoldLinkerSupportedExitCode)' == '0'" />
    <LinkerArg Include="-Wl,--icf=all" Condition="'$(_GoldLinkerSupportedExitCode)' == '0' or '$(UseLlvmLinker)' == 'true'" />
  </ItemGroup>

@MichalStrehovsky
Copy link
Member Author

Seems like -Wl,--icf=all and -Wl,--icf=safe etc. do shave off 10.24 KB from hello-world-app on linux-arm64 (using daily build)

Thanks for testing it out! I think this is only folding the native code within the executable. We don't generate code into separate sections unless -p:IlcFoldIdenticalMethodBodies=true was specified. This optimization produces non-sensical callstacks so we're unlikely to enable it by default. It's a good addition, but it's unlikely to help more than those 10 kB even for larger apps.

Looks like what we want is -Wl,--icf-data. But it doesn't seem to have an effect, likely due to #77491 (we mess up something in object writing).

Introspection in Microsoft.NETCore.Native.Unix.targets will look like this:

Is this something you'd be willing to contribute? I just tried -fuse-ld=gold on BasicMinimalApi on Ubuntu 18 and just this one switch without anything else reduced the size from 11.5 MB to 10.8 MB (🤯).

@am11
Copy link
Member

am11 commented Mar 15, 2023

This optimization produces non-sensical callstacks so we're unlikely to enable it by default. It's a good addition, but it's unlikely to help more than those 10 kB even for larger apps.

Ah, I haven't looked too closely. I found the difference only with -p:UseLlvmLinker=true (fuse-ld=lld which is selected by default in NativeAOT BuildIntegration on FreeBSD). From the binutils ticket:

The lld's safe ICF works with an LLVM feature. Since mid-2018, LLVM emits a
`.llvm_addrsig` section to all object files by default. That section contains
symbol indices whose addresses are taken. Using this table, lld can merge
functions more aggressively than gold can do.

Perhaps we would see more differences with lld if we start emitting this section? Other linkers apparently will optimize using this section in the future and we will automatically opt-in without additional work (bfd: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105625, mold: rui314/mold#484).

Is this something you'd be willing to contribute? I just tried -fuse-ld=gold on BasicMinimalApi on Ubuntu 18 and just this one switch without anything else reduced the size from 11.5 MB to 10.8 MB (🤯).

Sounds great, I will take a look. 🙂

@MichalStrehovsky
Copy link
Member Author

@dotnet/ilc-contrib could someone have a look? This is a good saving on Windows. We'll have to investigate how to translate this to Linux but that's tracked in #83375. It's likely object writer work around COMDATs.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MichalStrehovsky MichalStrehovsky merged commit 9a693e0 into dotnet:main Apr 25, 2023
@MichalStrehovsky MichalStrehovsky deleted the gcinfofold branch April 25, 2023 06:24
@ghost ghost locked as resolved and limited conversation to collaborators May 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants