Intel architecture improvements for .NET 9 #93196

Closed
21 of 33 tasks
BruceForstall opened this issue Oct 9, 2023 · 14 comments

@BruceForstall
Member

BruceForstall commented Oct 9, 2023

This issue describes planned improvements to Intel architecture (x86, x64) ISA support for .NET 9.

In .NET 8, AVX-512 ISA support was added (see #77034). In .NET 9, this support will be further improved and used for better performance, especially through expanded use of AVX-512 in the libraries. Investigation and implementation work will also begin on support for the newly announced AVX10.

Libraries work

AVX10

AVX10 is a new set of vector ISA extensions, described here. We expect to begin preliminary work to support AVX10 in .NET 9, at least the parts that most directly map to the already supported AVX-512. An arch-avx10 GitHub label has been defined and should be added to all related PRs and issues: https://github.com/dotnet/runtime/labels/arch-avx10

RyuJIT feature work

RyuJIT optimization work

API design work

Future Work

Some of the work planned for .NET 9 has been pushed out to a future release.

Libraries work

AVX10

RyuJIT feature work

Vector<T>

  • Consider Vector<T> expanding to Vector512<T>, either automatically or opt-in. (@tannergooding plans to get back to it as a best effort.)

JCC erratum

Debugging / diagnostics work (@BruceForstall)

@BruceForstall BruceForstall added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI User Story A single user-facing feature. Can be grouped under an epic. labels Oct 9, 2023
@BruceForstall BruceForstall added this to the 9.0.0 milestone Oct 9, 2023
@ghost

ghost commented Oct 9, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This issue describes planned improvements to Intel architecture (x86, x64) ISA support for .NET 9.

In .NET 8, AVX-512 ISA support was added (see #77034). In .NET 9, this support will be further improved and used for better performance, especially through expanded use of AVX-512 in the libraries. Investigation and implementation work will also begin on support for the newly announced AVX10.

Libraries work

RyuJIT feature work

  • Consider Vector<T> expanding to Vector512<T>, either automatically or opt-in.

RyuJIT feature work

RyuJIT optimization work

AVX10

AVX10 is a new set of vector ISA extensions, described here. We expect to begin preliminary work to support AVX10 in .NET 9, at least the parts that most directly map to the already supported AVX-512.

  • Add VM/JIT AVX10 awareness: CPUID enumeration and detection
  • Propose a new AVX10 API
  • Do JIT codegen implementation of the API
  • Enhance Vector256 codegen with AVX10 instructions (related to what has already been done for AVX512VL)
  • Allow additional 16 YMM registers for AVX10
  • Allow embedded rounding for YMM/ZMM (related: Enable EVEX embedded rounding support in xarch emitter #93154)
  • Convert remaining AVX2 implementations to Vector256
  • Allow AVX-512 optimizations for YMM (e.g., scalar conversion, vpternlog)
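
The first bullet above (CPUID enumeration and detection) can be sketched as a small decoder. This is only an illustration and not the runtime's actual VM/JIT code: the `avx10_info` struct and `decode_avx10` function are hypothetical names, and the bit positions reflect one reading of the early revisions of Intel's AVX10 architecture specification (feature flag in CPUID leaf 07H subleaf 01H, EDX bit 19; version and vector-length bits in leaf 24H subleaf 00H, EBX), so they should be checked against the current specification before being relied on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for illustration; not a dotnet/runtime structure. */
typedef struct {
    bool     supported; /* AVX10 enumerated at all */
    unsigned version;   /* converged ISA version, 1 for AVX10.1 */
    bool     v128;      /* 128-bit vector length supported */
    bool     v256;      /* 256-bit vector length supported */
    bool     v512;      /* 512-bit vector length supported */
} avx10_info;

/*
 * Decode AVX10 capability from raw CPUID register values.
 *   leaf7_1_edx  : EDX returned by CPUID(EAX=07H, ECX=01H)
 *   leaf24_0_ebx : EBX returned by CPUID(EAX=24H, ECX=00H)
 * Bit positions are an assumption taken from the AVX10 specification.
 */
static avx10_info decode_avx10(uint32_t leaf7_1_edx, uint32_t leaf24_0_ebx)
{
    avx10_info info = {0};

    /* Leaf 24H is only meaningful when the AVX10 feature bit is set. */
    if (((leaf7_1_edx >> 19) & 1u) == 0) {
        return info;
    }

    info.supported = true;
    info.version   = leaf24_0_ebx & 0xFFu;        /* EBX[7:0] */
    info.v128      = ((leaf24_0_ebx >> 16) & 1u) != 0;
    info.v256      = ((leaf24_0_ebx >> 17) & 1u) != 0;
    info.v512      = ((leaf24_0_ebx >> 18) & 1u) != 0;
    return info;
}
```

Keeping the decoder a pure function of the register values makes it easy to test without AVX10 hardware. On GCC or Clang the inputs could be read with `__get_cpuid_count` from `<cpuid.h>`.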

CI/testing work

Debugging / diagnostics work

API design work

Author: BruceForstall
Assignees: -
Labels:

area-CodeGen-coreclr, User Story

Milestone: 9.0.0

@BruceForstall BruceForstall self-assigned this Oct 9, 2023
@MichalPetryka
Contributor

Is there maybe any interest in adding the workaround for the JCC erratum (#35730) in .NET 9? I've seen minor codegen improvements be reported as huge regressions because the code started to hit this issue.

@BruceForstall
Member Author

Is there maybe any interest in adding the workaround for the JCC erratum (#35730) in .NET 9? I've seen minor codegen improvements be reported as huge regressions because the code started to hit this issue.

@AndyAyersMS has expressed a desire to at least have a mode that could be used for performance testing to avoid the JCC erratum. Whether we could enable this always would depend on how uniform the improvements would be. It is expected there would be some code size regressions -- possibly significant -- due to the need to insert NOPs.
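
To make the NOP cost concrete, here is a minimal sketch of the padding decision, assuming the publicly documented form of the erratum mitigation: on affected processors, a jump (or a macro-fused compare-and-jump pair) that crosses or ends on a 32-byte boundary is not served from the decoded instruction cache. The function names are hypothetical and this is not how RyuJIT implements or plans to implement it.

```c
#include <stdbool.h>
#include <stdint.h>

#define JCC_BOUNDARY 32u

/*
 * Returns true if an instruction (or macro-fused pair) of `len` bytes
 * starting at `addr` crosses or ends on a 32-byte boundary.
 * Assumes 0 < len < 32.
 */
static bool jcc_touches_boundary(uint64_t addr, unsigned len)
{
    /* The byte after the instruction lands in a later 32-byte block
       exactly when the instruction crosses or ends on the boundary. */
    return (addr / JCC_BOUNDARY) != ((addr + len) / JCC_BOUNDARY);
}

/*
 * Number of padding (NOP) bytes to emit before the instruction so that it
 * starts on the next 32-byte boundary, or 0 if no padding is needed.
 */
static unsigned jcc_padding_needed(uint64_t addr, unsigned len)
{
    if (!jcc_touches_boundary(addr, len)) {
        return 0;
    }
    return (unsigned)(JCC_BOUNDARY - (addr % JCC_BOUNDARY));
}
```

In this sketch a 6-byte jump starting at offset 0x101F would need only 1 byte of padding, while one starting at offset 0x101B would need 5. Because the cost depends on where each jump happens to land, the overall code size impact has to be measured rather than estimated.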

@MichalPetryka
Contributor

It is expected there would be some code size regressions -- possibly significant -- due to the need to insert NOPs.

Didn't we already accept that tradeoff with loop alignment?

@BruceForstall
Member Author

Didn't we already accept that tradeoff with loop alignment?

Yes, but this could be a very different magnitude of regression that will need to be measured.

@BruceForstall
Member Author

I went ahead and created #93243 related to adding a JIT mode to avoid the JCC erratum, and linked it here.

@Spacefish
Contributor

I added Vector512 support for Min/Max of simple numeric datatypes in this PR: #93369

@huoyaoyuan
Member

What about the upcoming APX extension? It looks like a major change of x86-64. I can see discussions around ABI for APX in GCC mail thread: https://gcc.gnu.org/pipermail/gcc/2023-July/242154.html https://gcc.gnu.org/pipermail/gcc-help/2023-August/142801.html

Maybe it's too early for .NET to adopt APX, but I'd like to see the estimated timeline. Should we wait for MSVC to define the calling convention?

@tannergooding
Member

We want to have hardware available on which it can run.

While Intel hasn't given an official timeline yet, such hardware most likely won't be available within the lifetime of .NET 9, which ships in November 2024 and will be out of support around May 2026.

I expect this work will be done for .NET 10 which will likely ship around November 2025 (assuming we don't change our current pacing of releases) and be out of support November 2028.

@MichalPetryka
Contributor

MichalPetryka commented Feb 27, 2024

We want to have hardware available on which it can run.

Would using Intel SDE not be enough for testing the support for it? It seems to already have support for emulating AVX10 and APX.

@tannergooding
Member

There's no point in scheduling work to be done for hardware that doesn't exist yet, particularly if that hardware is unlikely to exist within the lifetime of a release.

That is, we know that AVX10 is going to exist for Granite Rapids, as per the official announcement: https://www.intel.com/content/www/us/en/content-details/784267/intel-advanced-vector-extensions-10-intel-avx10-architecture-specification.html. The AVX10.1 work is correspondingly happening in .NET 9.

While no official release date has been announced for APX, it is unlikely to happen in a timeframe that makes .NET 9 a good choice to target.

@JulieLeeMSFT
Member

Updated the Planned work with the current status. Marked completed work and moved items that will be pushed out to Future Work section.

@JulieLeeMSFT JulieLeeMSFT modified the milestone: 9.0.0 Jun 18, 2024
@JulieLeeMSFT
Member

Moved JCC erratum and Vector work to future work because we don't have time to work on them in .NET 9.
Closing this user story as completed.

@JulieLeeMSFT
Member

@tommcdon, Debugger team, mentioned: "The debugger has logic to decode x64 instructions for the purposes of setting breakpoints and determining if there are instruction-relative read/writes/jumps. Whenever the JIT generates new instruction encodings that our logic does not understand, we need to regenerate the decoder. Fortunately, we have a tool that can automatically generate the instruction decoder logic - runtime/src/coreclr/debug/ee/amd64/gen_amd64InstrDecode/README.md at main · dotnet/runtime (github.com). Work was done in .NET 8 to support AVX512 EVEX instruction encodings - Support breakpoints on AVX-512 instructions by BruceForstall · Pull Request #89705 · dotnet/runtime (github.com). It’s our current belief that the existing instruction decoder will correctly decode AVX10.1."
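
As a rough illustration of the kind of check described in the quote, the sketch below classifies the leading encoding byte of an x64 instruction and tests a ModRM byte for RIP-relative addressing. It is not the generated decoder from gen_amd64InstrDecode. The names are hypothetical, and it assumes 64-bit mode, where 0xC4, 0xC5, and 0x62 always introduce VEX or EVEX prefixes, and it ignores any segment or address-size prefixes that may precede them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification used only for this sketch. */
typedef enum {
    ENC_OTHER, /* legacy/REX-encoded instruction, or anything not handled here */
    ENC_VEX2,  /* 2-byte VEX prefix, first byte 0xC5 */
    ENC_VEX3,  /* 3-byte VEX prefix, first byte 0xC4 */
    ENC_EVEX   /* 4-byte EVEX prefix, first byte 0x62 (AVX-512 style) */
} enc_kind;

/* Classify an instruction by its first encoding byte (64-bit mode only). */
static enc_kind classify_first_byte(uint8_t b)
{
    switch (b) {
    case 0xC5: return ENC_VEX2;
    case 0xC4: return ENC_VEX3;
    case 0x62: return ENC_EVEX;
    default:   return ENC_OTHER;
    }
}

/*
 * In 64-bit mode, ModRM with mod == 00 and rm == 101 selects
 * RIP + disp32 addressing, which is what an instruction-relative
 * read or write looks like to a decoder.
 */
static bool modrm_is_rip_relative(uint8_t modrm)
{
    return (modrm & 0xC7u) == 0x05u;
}
```

If AVX10.1 instructions reuse the EVEX encoding already handled for AVX-512, a decoder structured along these lines would see no new prefix class, which is consistent with the belief stated in the quote that the existing decoder already handles AVX10.1.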

@github-actions github-actions bot locked and limited conversation to collaborators Aug 31, 2024