Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that the new Vector512 related ISAs are covered by ILLink Substitutions #83040

Merged
merged 2 commits into from
Mar 8, 2023

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented Mar 6, 2023

Now that the foundational PRs are in, they'll be several people iterating on adding Vector512 related intrinsics concurrently.

Having these classes defined up front helps reduce merge conflicts and other issues that would otherwise arise.

This makes progress towards #73604, #74813, and #76579

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the linkable-framework Issues associated with delivering a linker friendly framework label Mar 6, 2023
@ghost ghost assigned tannergooding Mar 6, 2023
@ghost
Copy link

ghost commented Mar 6, 2023

Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: tannergooding
Assignees: -
Labels:

linkable-framework

Milestone: -

@tannergooding
Copy link
Member Author

Just looking at "current" System.Private.Corelib, I'm seeing that X86Base.IsSupported isn't being folded...

For example, BitOperations.Log2 is defined as:

// The 0->0 contract is fulfilled by setting the LSB to 1.
// Log(1) is 0, and setting the LSB for values > 1 does not change the log2 result.
value |= 1;

// value    lzcnt   actual  expected
// ..0001   31      31-31    0
// ..0010   30      31-30    1
// 0010..    2      31-2    29
// 0100..    1      31-1    30
// 1000..    0      31-0    31
if (Lzcnt.IsSupported)
{
    return 31 ^ (int)Lzcnt.LeadingZeroCount(value);
}

if (ArmBase.IsSupported)
{
    return 31 ^ ArmBase.LeadingZeroCount(value);
}

if (WasmBase.IsSupported)
{
    return 31 ^ WasmBase.LeadingZeroCount(value);
}

// BSR returns the log2 result directly. However BSR is slower than LZCNT
// on AMD processors, so we leave it as a fallback only.
if (X86Base.IsSupported)
{
    return (int)X86Base.BitScanReverse(value);
}

// Fallback contract is 0->0
return Log2SoftwareFallback(value);

This is being folded down to:

value |= 1U;
if (Lzcnt.IsSupported)
  return 31 ^ (int) Lzcnt.LeadingZeroCount(value);
if (true) { } // ArmBase.IsSupported
if (true) { } // WasmBase.IsSupported
return X86Base.IsSupported ? (int) X86Base.BitScanReverse(value) : BitOperations.Log2SoftwareFallback(value);

Note that X86Base.IsSupported is not itself a "constant true" even though its "always available". This is because we don't correspondingly have an ILLink.Substituations.X86.xml (which would cause X86Base, Sse, Sse2, and Vector128 to all fold to constant "true"), is that right?

@tannergooding
Copy link
Member Author

Also noticed that HardwareIntrinsicsHelpers.Aot uses the same path for X86 and X64: https://github.com/dotnet/runtime/blob/main/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/HardwareIntrinsicHelpers.Aot.cs#L67-L71

However, XArchIntrinsicConstants.FromInstructionSet itself is only handling X64_* and not the X86_*. Should this be split into X86IntrinsicConstants and X64InstrinsicConstants?

CC. @MichalStrehovsky on this and the immediate above.

@tannergooding
Copy link
Member Author

Also noting we have a src/mono/wasm/build/ILLink.Substitutions.WasmIntrinsics.xml and src/mono/wasm/build/ILLink.Substitutions.NoWasmIntrinsics.xml

They're in a separate location so they're easy to miss. It looks like they are being properly handled/processed however.

@MichalStrehovsky
Copy link
Member

Note that X86Base.IsSupported is not itself a "constant true" even though its "always available". This is because we don't correspondingly have an ILLink.Substituations.X86.xml (which would cause X86Base, Sse, Sse2, and Vector128 to all fold to constant "true"), is that right?

I think so. Alternatively, we could have the IsSupported implemented as a constant true in the C# source code instead of the recursive call; that would work too. It's the reason why WasmBase gets stubbed out even though there's no substitution file for NoWasm (we have ILLink.Substitutions.NoArmIntrinsics.xml and ILLink.Substitutions.NoX86Intrinsics.xml, but no NoWasmIntrinsics.xml - I assume for historical reasons?).

However, XArchIntrinsicConstants.FromInstructionSet itself is only handling X64_* and not the X86_*. Should this be split into X86IntrinsicConstants and X64InstrinsicConstants?

IIRC these have the same underlying numerical constants so it doesn't matter. Either way X86 doesn't work with NativeAOT. We might as well delete the case statement.

@tannergooding
Copy link
Member Author

I think so. Alternatively, we could have the IsSupported implemented as a constant true in the C# source code instead of the recursive call; that would work too. It's the reason why WasmBase gets stubbed out even though there's no substitution file for NoWasm (we have ILLink.Substitutions.NoArmIntrinsics.xml and ILLink.Substitutions.NoX86Intrinsics.xml, but no NoWasmIntrinsics.xml - I assume for historical reasons?).

@MichalStrehovsky Hmmm, we actually can't do this, unfortunately...

Basically while it is part of the baseline and "always supported", we also have a release mode config knob (DOTNET_EnableSSE2=0 for example) which allows users to disable the support.

Such knobs exist so that users can more easily test their various software fallback paths in addition to the intrinsic paths.

Replacing the method with returning => true (either in source or via IL substitutions) means that indirect calls to the IsSupported property (delegates, reflection, immediate window or other debug calls, etc) will report true when the actual code path may report false.

This is somewhat unfortunate as it means we can't actually trim as much as we'd like while still maintaining accuracy in the exposed APIs. Is there any other feature that might make it feasible to allow trimming here for crossgen2 (where such knobs don't exist) while still allowing indirect queries to work as expected?

@tannergooding
Copy link
Member Author

tannergooding commented Mar 7, 2023

CC. @dotnet/jit-contrib, @dotnet/avx512-contrib

Should be ready for review.

@MichalStrehovsky
Copy link
Member

Is there any other feature that might make it feasible to allow trimming here for crossgen2 (where such knobs don't exist) while still allowing indirect queries to work as expected?

Crossgen2 should "trim" it based on my mental model. Crossgen2 currently pregenerates all public APIs in an assembly and privates are generated as-needed. So when crossgen2 compiles a method body, it will let RyuJIT do the substitution for IsSupported and that might eliminate the basic block plus native code for whatever private methods get called from that block. I don't see a way to trim beyond that if we want to keep the ability to turn off the baseline intrinsics.

@tannergooding
Copy link
Member Author

Thanks! I think this should be generally ready for review then.

CC. @dotnet/jit-contrib, @dotnet/avx512-contrib

@@ -5210,7 +5274,7 @@ public new abstract partial class X64 : System.Runtime.Intrinsics.X86.Sse3.X64
public static new bool IsSupported { get { throw null; } }
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) was this generated with the tool? Some classes have a blank line between them, and some don't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but also not entirely...

Several of the corelib facades have bits that weren't done by the tool (often attributes or simple ordering things) and which make generating this "properly" result in large diffs.

So, I typically cut out just the new bits from the generated file and paste them into the checked in copy to avoid that issue.

We should probably do the work to ensure they are in sync again, but it hasn't bubbled up in priority yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public static new bool IsSupported { get => IsSupported; }

[Intrinsic]
public new abstract class VL : Avx512F.VL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the public nested types need doc comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it looks like "all" of them are currently missing for System.Runtime.Intrinsics.*.

I'll log an issue and gets this handled in a follow up PR later tonight.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -21,6 +24,42 @@
<type fullname="System.Runtime.Intrinsics.X86.Avx2/X64">
<method signature="System.Boolean get_IsSupported()" body="stub" value="false" />
</type>
<type fullname="System.Runtime.Intrinsics.X86.Avx512BW">
Copy link
Member

@eerhardt eerhardt Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need ILLink.Substitution entries for these when we also have the .PlatformNotSupported.cs versions that just return false? This xml file seems to be included in the same condition as when those .cs files are used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure actually. Maybe some legacy thing from back when they weren't guaranteed false?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I originally did it in ee12b54, but looking back I'm not sure it is needed. Since we are just following existing convention here, let's not block this PR on it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ILLink changes LGTM.

@tannergooding tannergooding merged commit 1b2eb12 into dotnet:main Mar 8, 2023
@tannergooding tannergooding deleted the issupported branch March 8, 2023 23:14
@ghost ghost locked as resolved and limited conversation to collaborators Apr 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
linkable-framework Issues associated with delivering a linker friendly framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants