Add VPCLMULQDQ intrinsics #109137
The existing modeling for ISA support presents a challenge in that JIT wants to see a 1:1 mapping between ISA and implementing class, but the actual ISAs are represented by a combination of flags. For my first attempt, I have virtualized this in JIT similarly to the way that e.g. Looking at some of the existing implementations, I see that the fake _VL ISAs are leaking into the I'd appreciate some guidance if there's a better way to handle this scenario since there will be more like this. |
@MichalStrehovsky could I trouble you to look at this? JIT side is working, but I've left it as draft for now because the NAOT leg is failing due to an assert on the (intentionally) unexposed fake ISA.
I'm not sure how much work 2) is in the end, or whether that's something the runtime team cares about. The currently broken code I mentioned is runtime/src/coreclr/tools/Common/JitInterface/CorInfoInstructionSet.cs Lines 1896 to 1900 in e70aaa8
because that generated method is matching on managed type name, and there's no I guess there's also option 3) Do it the ugly way for now and hope somebody cleans it up later... |
I don't have much guidance to offer, sorry. Things got a lot more complicated since I last touched any of this (when all we had was AVX2) and I haven't exactly been keeping track of it. E.g. I don't know why _VL instructions are fake and whether we intentionally want/or do not want to support them as Ideally RyuJIT implementation details shouldn't leak out into the managed parts of the compiler or R2R file format, so if RyuJIT needs something fake to operate, it ideally shouldn't burden other components (because then the owners of said component who don't know about RyuJIT implementation details and don't know much about hardware intrinsics in general either have no clue about what's going on). But maybe it's necessary, I don't know. We pulled these RyuJIT implementation details from cpufeatures.h in the past, maybe they can be pulled from more places. @tannergooding and @davidwrighton might have more of an opinion. |
Fair enough. Thanks for the reply anyway. For background, the reason the These newer intersection-style ISAs are more problematic because 1) they don't fall under a well-known x86-64 version set and 2) they actually do exist in hardware independent of each other. For example, Skylake-X implements PCLMULQDQ+AVX512F but not VPCLMULQDQ, Alder Lake implements PCLMULQDQ+VPCLMULQDQ but not AVX512F, and Zen 4 and 5 implement the full set. I'm hoping we can arrive at a better solution for them that also happens to clean up the handling of the ISAs that are already implemented. |
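To make the intersection problem concrete, here is a minimal sketch of the three flag combinations listed above. Everything in it is hypothetical: `CpuFlags`, the `HasClmul*` predicates, and the `k*` constants are illustrative names, not the JIT's own. It also assumes that AVX is present whenever VPCLMULQDQ is reported, which real detection code would have to verify.

```cpp
// Hypothetical model: one bool per CPUID feature flag named in the discussion.
struct CpuFlags {
    bool pclmulqdq;   // baseline 128-bit carry-less multiply
    bool vpclmulqdq;  // wide (VEX/EVEX-encoded) carry-less multiply
    bool avx512f;     // AVX-512 foundation
};

// Illustrative "virtual ISA" predicates. Each managed-visible width depends on
// an intersection of flags rather than on a single flag.
constexpr bool HasClmul128(CpuFlags f) { return f.pclmulqdq; }
constexpr bool HasClmul256(CpuFlags f) { return f.pclmulqdq && f.vpclmulqdq; }  // assumes AVX is present
constexpr bool HasClmul512(CpuFlags f) { return HasClmul256(f) && f.avx512f; }

// Flag sets as described in the comment above, not probed from hardware.
constexpr CpuFlags kSkylakeX{true, false, true};
constexpr CpuFlags kAlderLake{true, true, false};
constexpr CpuFlags kZen4{true, true, true};
```

Under these assumptions Skylake-X tops out at 128-bit, Alder Lake at 256-bit, and Zen 4 reaches 512-bit, so no single flag can stand in for all three widths.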
Thanks for the explanation! I agree that given all this, we should ideally not expose _VL as something people can specify on the command line.
It's spread out in a few places, I think the most recent was in: #103241 (comment) There's a bit of a balance overall between modeling what the CPU exposes (irrespective of the implementation) and modeling something reasonable for users to consume and handle. For With With With So, I think what we need to do is always have virtual instruction sets for any managed exposed ISA class (such as I think what you currently have in the PR roughly models that. We have
Thanks, Tanner. I think if the decision is to change up the handling of the virtual ISAs in general, that's probably better done in a separate PR, which leads me to believe maybe the best path here is to go ahead and follow the existing pattern for now, and clean them all up later. I've made the required changes to ThunkGenerator to fix the |
This mostly all looks right. I believe you're missing some handling in:
This change is in https://github.com/dotnet/runtime/pull/109537/files#diff-65f20fbb1fbd6c815168e9d3b2b358c4fd02aea226f2152099427a594310b876 if that's ok.
Never mind, different map. I'll figure out what's up with this one.
I think maybe the proper way to deal with Mono will be to put the nested classes in a separate .cs file and include the .PlatformNotSupported version of that file for Mono. I didn't notice the base Pclmulqdq was actually supported there. Does that sound right to you?
{
    if (potentialType.Name == "X64")
        potentialType = (MetadataType)potentialType.ContainingType;
    if (potentialType.Name == "VL")
This handling of VL classes assumed the IsSupported value would be the same as the containing class. Although that assumption is safe, it was unnecessary since they have their own mappings in the dictionary and can be handled by the generic nesting logic.
The special handling of the 64-bit classes had to stay.
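As a rough illustration of that split (not the actual ThunkGenerator output, which is C#), the sketch below strips a trailing X64 class as a special case and then resolves whatever remains, nested or not, through one table lookup. `TypeDesc`, `Resolve`, the `V256`/`V512` class names, and the ISA strings are all hypothetical stand-ins.

```cpp
#include <string_view>

// Hypothetical stand-in for a managed type: simple name plus containing type.
struct TypeDesc {
    std::string_view name;
    const TypeDesc* containing;  // nullptr for a top-level class
};

struct Lookup {
    std::string_view isa;  // empty when no mapping exists
    bool is64Bit;
};

struct Entry {
    std::string_view outer;
    std::string_view nested;  // empty for the top-level class itself
    std::string_view isa;
};

// Illustrative table: nested classes get their own entries, so they need no
// special-casing beyond the generic (outer, nested) lookup.
constexpr Entry kIsaTable[] = {
    {"Pclmulqdq", "", "PCLMULQDQ"},
    {"Pclmulqdq", "V256", "PCLMULQDQ_V256"},
    {"Pclmulqdq", "V512", "PCLMULQDQ_V512"},
};

constexpr Lookup Resolve(const TypeDesc& type) {
    const TypeDesc* t = &type;
    bool is64Bit = false;
    // The 64-bit classes remain a special case: strip X64 and remember it.
    if (t->name == "X64" && t->containing != nullptr) {
        is64Bit = true;
        t = t->containing;
    }
    std::string_view outer = t->name;
    std::string_view nested;
    if (t->containing != nullptr) {
        nested = t->name;
        outer = t->containing->name;
    }
    for (const Entry& e : kIsaTable) {
        if (e.outer == outer && e.nested == nested) return Lookup{e.isa, is64Bit};
    }
    return Lookup{std::string_view{}, is64Bit};
}

// Sample type chains used to exercise the lookup.
constexpr TypeDesc kPclmulqdq{"Pclmulqdq", nullptr};
constexpr TypeDesc kPclmulqdqX64{"X64", &kPclmulqdq};
constexpr TypeDesc kV256{"V256", &kPclmulqdq};
constexpr TypeDesc kV256X64{"X64", &kV256};
```

With a table like this, a nested class no longer inherits its answer from the containing class; it either has its own entry or it resolves to nothing.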
src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt
OK, this is ready for another review pass. All feedback addressed and updated tests passing.
CC. @dotnet/jit-contrib for secondary review
Fixes #95772
This is one of several similar new ISAs, where an existing ISA (PCLMULQDQ) was extended to 256-bit with one cpuid flag (VPCLMULQDQ) and then to 512-bit when combined with AVX-512 (VPCLMULQDQ+AVX512F) support.
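For reference, a minimal sketch of how those widths could be derived from raw CPUID register values. The bit positions are the ones I understand the Intel SDM to document (PCLMULQDQ in CPUID.01H:ECX bit 1, AVX in CPUID.01H:ECX bit 28, AVX512F in CPUID.(EAX=07H,ECX=0):EBX bit 16, VPCLMULQDQ in CPUID.(EAX=07H,ECX=0):ECX bit 10) and are worth double-checking against the manual. `ClmulSupport` and `Decode` are hypothetical names, and the sketch deliberately omits the OS state checks (OSXSAVE/XGETBV) that real detection needs.

```cpp
#include <cstdint>

// Hypothetical result type: which carry-less multiply widths are usable.
struct ClmulSupport {
    bool v128;
    bool v256;
    bool v512;
};

constexpr bool Bit(std::uint32_t reg, unsigned n) {
    return ((reg >> n) & 1u) != 0;
}

// leaf1Ecx: CPUID.01H:ECX
// leaf7Ebx: CPUID.(EAX=07H,ECX=0):EBX
// leaf7Ecx: CPUID.(EAX=07H,ECX=0):ECX
constexpr ClmulSupport Decode(std::uint32_t leaf1Ecx,
                              std::uint32_t leaf7Ebx,
                              std::uint32_t leaf7Ecx) {
    const bool pclmulqdq  = Bit(leaf1Ecx, 1);
    const bool avx        = Bit(leaf1Ecx, 28);
    const bool avx512f    = Bit(leaf7Ebx, 16);
    const bool vpclmulqdq = Bit(leaf7Ecx, 10);
    // 256-bit needs VPCLMULQDQ plus AVX; 512-bit needs VPCLMULQDQ plus AVX512F.
    return ClmulSupport{pclmulqdq, vpclmulqdq && avx, vpclmulqdq && avx512f};
}
```

Taking the register values as parameters keeps the sketch portable; on real hardware they would come from a compiler's CPUID intrinsic.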