
[RyuJIT] Change VEX-encoding selection to avoid AVX-SSE transition penalties #8966

Closed
fiigii opened this issue Sep 19, 2017 · 7 comments · Fixed by dotnet/coreclr#15014
Labels: area-CodeGen-coreclr, enhancement, optimization, tenet-performance
@fiigii
Contributor

fiigii commented Sep 19, 2017

Currently, RyuJIT generates AVX instructions (VEX-encoding) when AVX2 is available:

  • Floating-point calculations and SIMD code use VEX-encoding instructions on AVX2-capable machines (Haswell and above).
  • SIMD vectors (System.Numerics.Vector<T>) have a size of 256 bits (YMM) on AVX2-capable machines and 128 bits (XMM) on machines that support AVX (Sandy Bridge) or below ISAs.

However, we will broadly use AVX instructions via Intel hardware intrinsics even when the underlying hardware does not support AVX2. Therefore, mixing Avx intrinsics with floating-point calculations (or System.Numerics.Vectors) may trigger AVX-SSE transition penalties under the current codegen strategy. The new VEX-encoding selection strategy should be:

  • Floating-point calculations use VEX-encoding instructions on AVX-capable machines (Sandy Bridge and above).
  • SIMD vectors (System.Numerics.Vector<T>) have a size of 256 bits (YMM) on AVX2-capable machines and 128 bits (XMM) on machines that support AVX (Sandy Bridge) or below ISAs. // no change
  • SIMD code (System.Numerics.Vectors) is compiled to instructions that have the VEX.128 prefix and operate over XMM registers on AVX-capable machines, but to instructions that have the VEX.256 prefix and operate over YMM registers on AVX2-capable machines.

I will provide a PR after finishing dotnet/coreclr#14020.

@fiigii
Contributor Author

fiigii commented Sep 29, 2017

@CarolEidt @BruceForstall Question. What is the purpose of FEATURE_AVX_SUPPORT? Should System.Runtime.Intrinsic.X86 be controlled by this config?

@CarolEidt
Contributor

What is the purpose of FEATURE_AVX_SUPPORT? Should System.Runtime.Intrinsic.X86 be controlled by this config?

FEATURE_AVX_SUPPORT is a compile-time flag that allows us to control whether the JIT will emit AVX instructions (at all). These FEATURE flags are often used when introducing a new feature that is not on by default. @BruceForstall can probably provide additional insight, but IMO this flag probably no longer works (i.e. if you turned it off, there are likely to be inconsistencies), so it might be best to eliminate it. It was a bit more useful when AVX was supported for x64 but not for x86 (32-bit).

@fiigii
Contributor Author

fiigii commented Sep 30, 2017

@CarolEidt Thanks for the reply. If this flag is no longer useful for x64 and x86, I would like to eliminate it.

@CarolEidt
Contributor

I would like to eliminate it.

That would be great, but I'd like to hear from @BruceForstall whether he knows of any reason it should be kept. Thanks!

@fiigii
Contributor Author

fiigii commented Oct 2, 2017

@BruceForstall ping?

@BruceForstall
Member

I'm happy to have you eliminate this. I agree with Carol that this was mostly useful during bring-up, and when x64 and x86 didn't have parity here. Any code will still need to be under _TARGET_XARCH_, of course.

I can see only a couple of potential snags: (1) FEATURE_AVX_SUPPORT has never been defined for our x86 legacy backend (built with LEGACY_BACKEND defined). We don't want any new code paths executing there. See this code in emitxarch.h:

// code_t is a type used to accumulate bits of opcode + prefixes. On amd64, it must be 64 bits
// to support the REX prefixes. On both x86 and amd64, it must be 64 bits to support AVX, with
// its 3-byte VEX prefix. For legacy backend (which doesn't support AVX), leave it as size_t.
#if defined(LEGACY_BACKEND)
typedef size_t code_t;
#else  // !defined(LEGACY_BACKEND)
typedef unsigned __int64 code_t;
#endif // !defined(LEGACY_BACKEND)

So, if you removed FEATURE_AVX_SUPPORT, then potentially someone could send an AVX instruction through the LEGACY_BACKEND path, and thus you would need to change the above define so code_t is 64 bits. This might be OK, but it also has throughput/memory impact, and makes the legacy backend not quite as "clean" a throughput comparison as we'd like. Overall, I'm OK with this, but you probably do need to change the code_t definition.

(2) It looks like we never changed our (internal) desktop RyuJIT/x86 build to define FEATURE_AVX_SUPPORT. See protojit\protojit.nativeproj:

<ClDefines Condition="'$(BuildArchitecture)' == 'amd64'">$(ClDefines);FEATURE_SIMD;FEATURE_AVX_SUPPORT</ClDefines>

We never enabled it for SIMD, either. We probably should enable both flags for both amd64 and x86 (in this file, 'i386'), but that's something a Microsoft person should do.

@fiigii
Contributor Author

fiigii commented Oct 2, 2017

@BruceForstall @CarolEidt Thanks for the information. I will give it a try and submit a PR later.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 20, 2020