Vector<T> operations don't take advantage of memory operands #13798
Comments
See dotnet/coreclr#27798 for an example of code gen that would be slightly improved by this.
SIMD intrinsics that have a 1:1 mapping to HW intrinsics should be easy to import as HW intrinsics. mikedn/coreclr@384934c generates:
G_M319_IG01:
       C5F877               vzeroupper
G_M319_IG02:
       C5FD1002             vmovupd  ymm0, ymmword ptr [rdx]
       C4C17D7400           vpcmpeqb ymm0, ymm0, ymmword ptr [r8]
       C5FD1101             vmovupd  ymmword ptr [rcx], ymm0
       488BC1               mov      rax, rcx
G_M319_IG03:
       C5F877               vzeroupper
       C3                   ret
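For readers who want to reproduce something similar, here is a minimal C# sketch of the kind of method that could plausibly lead to the listing above. The source for `G_M319` is not shown in this thread, so the class name, method name, and element type are assumptions inferred from the `vpcmpeqb` instruction and the `rdx`/`r8`/`rcx` usage (two vector arguments passed by reference plus a return buffer, as on Windows x64). Only `Vector.Equals` and `Vector<byte>` are real API.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

public static class MemoryOperandSketch
{
    // Hypothetical method (name and element type are assumptions).
    // NoInlining keeps a standalone method body around so its
    // disassembly can be inspected.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector<byte> CompareBytes(Vector<byte> left, Vector<byte> right)
    {
        // Element-wise equality: each lane becomes all ones (0xFF)
        // where the inputs match and zero where they differ.
        return Vector.Equals(left, right);
    }

    public static void Main()
    {
        var a = new Vector<byte>((byte)1);
        var b = new Vector<byte>((byte)1);

        Vector<byte> mask = CompareBytes(a, b);

        // Every lane matched, so the first lane is 0xFF.
        Console.WriteLine(mask[0] == byte.MaxValue);
    }
}
```

In the listing above, the second operand of the compare is read directly from `[r8]` rather than being loaded into a register first; that is the memory-operand folding this issue asks for.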
I think it might make more sense to do that in simd.cpp.
Well, I would hope that in time simd.cpp can go away :). Or maybe keep it only for those SIMD intrinsics that do not have a 1:1 mapping and require some sort of special handling (not sure what that would look like: either make the HW intrinsic import code more flexible so that it can produce trees, not just nodes, or introduce an abstract "SIMD" ISA that has special codegen handling).
The most likely issue we'll run into is again call args -
This is readily visible if you try to handle
which is already dubious because the
which is even more broken. Oh well, looks like one more reason to add class layout to
With dotnet/coreclr#22944, the raw hardware intrinsics are able to take advantage of folding the memory load operation into the SIMD instruction itself.
However, this same optimization was not applied to Vector and Vector<T> more generally, even though they're using nearly identical codegen under the covers.
category:cq
theme:vector-codegen
skill-level:intermediate
cost:medium