-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DirectX] The dx.dot2/3/4
and dx.dot2add
intrinsics leave vectors around post-scalarization
#134569
Labels
Comments
bogner
added a commit
to bogner/llvm-project
that referenced
this issue
Apr 7, 2025
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.
bogner
added a commit
to bogner/llvm-project
that referenced
this issue
Apr 7, 2025
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.
Matching the dot2/3/4 intrinsics to the dxil op would mean doing custom scalarization in IntrinsicExpansion pass. I had a change that did that and I thought the team came to the conclusion we didn't want to do custom scalarizations like that. Also for the dot2add case that would be interesting because it forces the scalarization into clang codegen. |
bogner
added a commit
to bogner/llvm-project
that referenced
this issue
Apr 8, 2025
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.
bogner
added a commit
to bogner/llvm-project
that referenced
this issue
Apr 8, 2025
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.
bogner
added a commit
to bogner/llvm-project
that referenced
this issue
Apr 9, 2025
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The
dx.dot2
,dot3
,dot4
, anddot2add
intrinsics are defined as taking vector arguments and have special handling to scalarize those arguments inDXILOpLowering
. This causes issues, as the vectors don't get entirely cleaned up before we generate DXIL.From llvm-beanz/offload-test-suite#61:
It would be simpler to make these intrinsics match their DXIL ops more closely and simply take scalar arguments. This will allow the scalarizer to do its job earlier in the pipeline.
Alternatively, we could improve the vectorArgExpansion logic in DXILOpLowering to clean up any remaining vector artifacts, but this seems like it would be redundant with the logic in the scalarizer.
The text was updated successfully, but these errors were encountered: