[NO REVIEW] Update the CPUID and XSAVE logics for APX #103019
- Update the asm code in context2.S, which hard-coded the memory offset on the assumption that we are using the standard form.
- The logic in context2.S seems to be for a custom stack, so we may not need to follow the standard XSAVE buffer layout. Directly add the APX section after AVX512.
- Updates in threadsuspend.cpp.
- script-gen changes.
- Extend the CPUID flag from int to long to hold more ISAs.
- Update the CPUID check logic; make sure CR4.OSXSAVE and XCR0.APX_F are checked.
- Resolve comments.
- Improve the logic in isa_detection in gc.
- Add missing method definitions under unix context.
- Code cleanup.
2. Merge Avx512F, BW, CD, DQ into a converged ISA flag, Avx512.
3. Introduce APX CPUID detection.
I've gone ahead and marked this as |
Yes, please mark it as |
apx-doc/README.md
# APX Integration in .NET

Let's keep documentation on APX integration and notes on things here. I will evolve this as necessary.
We should move this under docs/design/features/xarch-apx.md or maybe under docs/design/coreclr/jit/xarch-apx.md, just to keep it in the same general place as our other docs.
I like the idea of having a documentation covering how the feature is used and integrates with the rest of the JIT.
public const int Avx10v1 = 0x4000000;
public const int Avx10v1_v256 = 0x8000000;
public const int Avx10v1_v512 = 0x10000000;
public const int Avx512 = 0x8000;
If you would like to make these bits more compact, it would be better to submit that as a separate PR. There are free bits now, so you do not need it in order to add the Apx bit.
It may be better to avoid folding the features together at the PAL level:
public const int Avx512f = ...;
public const int Avx512f_vl = ...;
public const int Avx512bw = ...;
public const int Avx512cd = ...;
public const int Avx512dq = ...;
public const int Avx512Vbmi = ...;
I understand that it does not save as many bits, but it is cleaner (avoids applying JIT policy at the PAL level).
Thanks for the inputs, I can work on this part soon
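The suggestion in this thread, keeping one flag per feature at the PAL level and leaving any folding to the JIT, could look roughly like the sketch below. The bit positions, the `PalFlags` namespace, and the `JitEnablesAvx512` helper are all hypothetical and chosen only for illustration; they are not the values or names used in the runtime.

```cpp
#include <cstdint>

// Hypothetical PAL-level flags: one bit per CPUID feature, no policy applied.
// The bit positions are placeholders, not the runtime's real values.
namespace PalFlags
{
    constexpr uint64_t Avx512f    = 1ull << 0;
    constexpr uint64_t Avx512f_vl = 1ull << 1;
    constexpr uint64_t Avx512bw   = 1ull << 2;
    constexpr uint64_t Avx512cd   = 1ull << 3;
    constexpr uint64_t Avx512dq   = 1ull << 4;
    constexpr uint64_t Avx512Vbmi = 1ull << 5;
}

// Hypothetical JIT-side policy: treat AVX-512 as usable only when the
// F+VL+BW+CD+DQ group is fully present. VBMI stays an independent feature.
constexpr uint64_t kJitAvx512Required =
    PalFlags::Avx512f | PalFlags::Avx512f_vl | PalFlags::Avx512bw |
    PalFlags::Avx512cd | PalFlags::Avx512dq;

constexpr bool JitEnablesAvx512(uint64_t palFlags)
{
    return (palFlags & kJitAvx512Required) == kJitAvx512Required;
}
```

With this split, the PAL only reports what the hardware advertises, and changing the grouping rule later touches the JIT helper alone.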
if (__builtin_cpu_supports("avx2"))
{
    IsaFlags |= (int)SupportedISA::APX;
It is not necessary to detect APX here until the APX-specific implementation of the GC sort gets added.
// TODO-xarch-apx: the definition of XSTATE mask value for APX is now missing on the OS level,
// we are currently using bare value to hack it through the build process, and test the implementation through CI.
// those changes will be removed when we have the OS support for APX.
Why not do this via the following so that things can still function as expected, even if built on a downlevel OS without the support?
#ifndef XSTATE_APX
#define XSTATE_APX 19
#endif
I tried this locally on my own dev machine running Windows, and the build fails with an error saying the macros are not defined. It seems that, at least on Windows, XSTATE_APX and XSTATE_MASK_APX need to be defined in winnt.h (I am not 100% sure about the reason; is it because we are using the Windows API in the CPUID checks?). I can fix the error by adding the definitions there, but that might not be a solution in the CI environment.
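For reference, a guarded fallback that covers both macros named in this thread might be sketched as follows. The value 19 comes from the review suggestion above; deriving the mask as `1 << XSTATE_APX` is an assumption modelled on how other XSTATE masks are usually formed, and `HasApxXState` is a hypothetical helper. Whether this resolves the build failure described above has not been confirmed.

```cpp
#include <cstdint>

// Fallback definitions, used only when the OS headers do not provide them.
#ifndef XSTATE_APX
#define XSTATE_APX 19 // feature index suggested in the review
#endif

#ifndef XSTATE_MASK_APX
// Assumption: the mask is the single bit at the feature index.
#define XSTATE_MASK_APX (1ull << (XSTATE_APX))
#endif

// Hypothetical helper: true when the enabled-features mask reported by the
// OS includes the APX state component.
inline bool HasApxXState(uint64_t enabledFeatures)
{
    return (enabledFeatures & XSTATE_MASK_APX) != 0;
}
```

The guards need to appear after the OS headers are included and before the first use of either macro, otherwise the fallback has no effect at the point of use.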
InstructionSet_AVX10v1_X64=68,
InstructionSet_AVX10v1_V256_X64=69,
InstructionSet_AVX10v1_V512_X64=70,
InstructionSet_APX=36,
Does APX support 32-bit at all?
Intel® Advanced Performance Extensions (Intel® APX) expands the Intel® 64 instruction set
Or is everything just generated with both x86 and x64?
Thanks for bringing this up.
No, APX features are available only in 64-bit mode.
Reference: https://www.intel.com/content/www/us/en/content-details/819797/intel-advanced-performance-extensions-intel-apx-architecture-specification.html
3.1.4 Intel® APX features are only available in IA-32e 64-bit Protected Mode, and are an XSAVE-enabled feature
which requires XCR0 enabling before using the new Intel® APX ISA, new Intel® APX prefixes (REX2) and
prefix extensions (EVEX extensions). See section 3.1.4.2 for details on XCR0-enabling for Intel® APX
Based on this, I understand that APX could differ from existing ISAs, and we are open to suggestions on this part.
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.
Since the |
[No need to review this PR for now]
Overview on the changes:
- XArchIntrinsicConstants: compress all the Avx512-related flags (Avx512f+bw+cd+dq+vl) into one flag, Avx512. This saves space in XArchIntrinsicConstants so that we can hold more x86 ISAs, such as APX here.
- CR4[XSAVE] (existing) -> XCR0[APX_F] -> CPUID(7,1).EDX[APX_F]
  - The current status is that, because the macro definition for APX is missing at the OS level, the second check will fail anyway, and it may break the build on CI (to be verified).
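The check order in the overview can be sketched as a pure function over already-read register values, which keeps the logic testable without executing `cpuid` or `xgetbv`. The `CpuState` struct and `IsApxUsable` name are hypothetical. The bit positions are taken from the Intel APX specification as I understand it (XCR0 bit 19, CPUID.(EAX=7,ECX=1):EDX bit 21), and the CR4 check is assumed to be read from user mode through CPUID.(EAX=1):ECX.OSXSAVE (bit 27); treat all of them as assumptions to verify against the spec.

```cpp
#include <cstdint>

// Assumed bit positions; verify against the Intel APX specification.
constexpr uint32_t kCpuid1EcxOsxsave = 1u << 27;   // CPUID.(EAX=1):ECX.OSXSAVE
constexpr uint64_t kXcr0ApxF         = 1ull << 19; // XCR0.APX_F
constexpr uint32_t kCpuid7_1EdxApxF  = 1u << 21;   // CPUID.(EAX=7,ECX=1):EDX.APX_F

// Hypothetical snapshot of the register values the checks depend on.
struct CpuState
{
    uint32_t cpuid1Ecx;   // ECX from CPUID leaf 1
    uint64_t xcr0;        // XCR0 as returned by XGETBV(0)
    uint32_t cpuid7_1Edx; // EDX from CPUID leaf 7, subleaf 1
};

// Applies the three checks in the order given in the overview.
inline bool IsApxUsable(const CpuState& s)
{
    // 1. The OS must have enabled XSAVE; otherwise XCR0 cannot be read.
    if ((s.cpuid1Ecx & kCpuid1EcxOsxsave) == 0)
        return false;

    // 2. The OS must have enabled the APX state component in XCR0.
    if ((s.xcr0 & kXcr0ApxF) == 0)
        return false;

    // 3. The CPU itself must advertise APX_F.
    return (s.cpuid7_1Edx & kCpuid7_1EdxApxF) != 0;
}
```

In a real detection path the first check has to pass before `xgetbv` is executed at all, which is why the steps are ordered rather than combined into one mask test.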