
[RyuJIT] Change VEX-encoding selection to avoid AVX-SSE transition penalties #8966

Closed
fiigii opened this issue Sep 19, 2017 · 7 comments · Fixed by dotnet/coreclr#15014
Labels: area-CodeGen-coreclr, enhancement, optimization, tenet-performance
@fiigii
Contributor

fiigii commented Sep 19, 2017

Currently, RyuJIT generates AVX instructions (VEX-encoding) when AVX2 is available:

  • Floating-point calculations and SIMD code use VEX-encoding instructions on AVX2-capable machines (Haswell and above).
  • SIMD vectors (System.Numerics.Vector<T>) have a size of 256 bits (YMM) on AVX2-capable machines and 128 bits (XMM) on machines that support AVX (Sandy Bridge) or below ISAs.

However, we will broadly use AVX instructions via Intel hardware intrinsics even when the underlying hardware does not support AVX2. Therefore, mixing Avx intrinsics with floating-point calculations (or System.Numerics.Vectors) may trigger AVX-SSE transition penalties under the current codegen strategy. The new VEX-encoding selection strategy should be:

  • Floating-point calculations use VEX-encoding instructions on AVX-capable machines (Sandy Bridge and above).
  • SIMD vectors (System.Numerics.Vector<T>) have a size of 256 bits (YMM) on AVX2-capable machines and 128 bits (XMM) on machines that support AVX (Sandy Bridge) or below ISAs. // no change
  • SIMD code (System.Numerics.Vectors) is compiled to instructions that have the VEX.128 prefix and operate over XMM registers on AVX-capable machines, but to instructions that have the VEX.256 prefix and operate over YMM registers on AVX2-capable machines.

I will provide a PR after finishing dotnet/coreclr#14020.

@fiigii
Contributor Author

fiigii commented Sep 29, 2017

@CarolEidt @BruceForstall Question. What is the purpose of FEATURE_AVX_SUPPORT? Should System.Runtime.Intrinsic.X86 be controlled by this config?

@CarolEidt
Contributor

What is the purpose of FEATURE_AVX_SUPPORT? Should System.Runtime.Intrinsic.X86 be controlled by this config?

FEATURE_AVX_SUPPORT is a compile-time flag that allows us to control whether the JIT will emit AVX instructions (at all). These FEATURE flags are often used when introducing a new feature that is not on by default. @BruceForstall can probably provide additional insight, but IMO this flag probably no longer works (i.e. if you turned it off, there are likely to be inconsistencies), so it might be best to eliminate it. It was a bit more useful when AVX was supported for x64 but not for x86 (32-bit).

@fiigii
Contributor Author

fiigii commented Sep 30, 2017

@CarolEidt Thanks for the reply. If this flag is no longer useful for x64 and x86, I would like to eliminate it.

@CarolEidt
Contributor

I would like to eliminate it.

That would be great, but I'd like to hear from @BruceForstall whether he knows of any reason it should be kept. Thanks!

@fiigii
Contributor Author

fiigii commented Oct 2, 2017

@BruceForstall ping?

@BruceForstall
Member

I'm happy to have you eliminate this. I agree with Carol that this was mostly useful during bring-up, and when x64 and x86 didn't have parity here. Any code will still need to be under _TARGET_XARCH_, of course.

I can see only a couple of potential snags: (1) FEATURE_AVX_SUPPORT has never been defined for our x86 legacy backend (built with LEGACY_BACKEND defined). We don't want any new code paths executing there. See this code in emitxarch.h:

// code_t is a type used to accumulate bits of opcode + prefixes. On amd64, it must be 64 bits
// to support the REX prefixes. On both x86 and amd64, it must be 64 bits to support AVX, with
// its 3-byte VEX prefix. For legacy backend (which doesn't support AVX), leave it as size_t.
#if defined(LEGACY_BACKEND)
typedef size_t code_t;
#else  // !defined(LEGACY_BACKEND)
typedef unsigned __int64 code_t;
#endif // !defined(LEGACY_BACKEND)

So, if you removed FEATURE_AVX_SUPPORT, then potentially someone could send an AVX instruction through the LEGACY_BACKEND path, and thus you would need to change the above define so code_t is 64 bits. This might be OK, but it also has throughput/memory impact, and makes the legacy backend not quite as "clean" a throughput comparison as we'd like. Overall, I'm OK with this, but you probably do need to change the code_t definition.

(2) It looks like we never changed our (internal) desktop RyuJIT/x86 build to define FEATURE_AVX_SUPPORT. See protojit\protojit.nativeproj:

<ClDefines Condition="'$(BuildArchitecture)' == 'amd64'">$(ClDefines);FEATURE_SIMD;FEATURE_AVX_SUPPORT</ClDefines>

We never enabled it for SIMD, either. We probably should enable both flags for both amd64 and x86 (in this file, 'i386'), but that's something a Microsoft person should do.

@fiigii
Contributor Author

fiigii commented Oct 2, 2017

@BruceForstall @CarolEidt Thanks for the information. I will give it a try and submit a PR later.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 20, 2020