[DirectX] The `dx.dot2/3/4` and `dx.dot2add` intrinsics leave vectors around post-scalarization #134569

bogner · 2025-04-07T03:20:47Z

The dx.dot2, dot3, dot4, and dot2add intrinsics are defined as taking vector arguments and have special handling to scalarize those arguments in DXILOpLowering. This causes issues, as the vectors don't get entirely cleaned up before we generate DXIL.

From llvm-beanz/offload-test-suite#61:

# .---command stdout------------
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%17 = insertelement <3 x float> undef, float %16, i32 0' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%18 = insertelement <3 x float> %17, float %15, i32 1' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%19 = insertelement <3 x float> %18, float %14, i32 2' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%20 = extractelement <3 x float> %19, i32 0' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%21 = extractelement <3 x float> %19, i32 1' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%22 = extractelement <3 x float> %19, i32 2' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%23 = extractelement <3 x float> %19, i32 0' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%24 = extractelement <3 x float> %19, i32 1' in block '#0' of function 'main'.
# | Function: main: error: Instructions must be of an allowed type.
# | note: at '%25 = extractelement <3 x float> %19, i32 2' in block '#0' of function 'main'.
# | Validation failed.

It would be simpler to make these intrinsics match their DXIL ops more closely and simply take scalar arguments. This will allow the scalarizer to do its job earlier in the pipeline.

Alternatively, we could improve the vectorArgExpansion logic in DXILOpLowering to clean up any remaining vector artifacts, but this seems like it would be redundant with the logic in the scalarizer.

The text was updated successfully, but these errors were encountered:

The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.

farzonl · 2025-04-07T15:57:34Z

Matching the dot2/3/4 intrinsics to the dxil op would mean doing custom scalarization in IntrinsicExpansion pass. I had a change that did that and I thought the team came to the conclusion we didn't want to do custom scalarizations like that.

Also for the dot2add case that would be interesting because it forces the scalarization into clang codegen.

The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes llvm#134569.

bogner added the backend:DirectX label Apr 7, 2025

bogner added this to HLSL Support Apr 7, 2025

bogner self-assigned this Apr 7, 2025

bogner linked a pull request Apr 7, 2025 that will close this issue

[DirectX] Use scalar arguments for @llvm.dx.dot intrinsics #134570

Open

bogner moved this to Needs Review in HLSL Support Apr 7, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DirectX] The `dx.dot2/3/4` and `dx.dot2add` intrinsics leave vectors around post-scalarization #134569

[DirectX] The `dx.dot2/3/4` and `dx.dot2add` intrinsics leave vectors around post-scalarization #134569

bogner commented Apr 7, 2025

farzonl commented Apr 7, 2025

[DirectX] The dx.dot2/3/4 and dx.dot2add intrinsics leave vectors around post-scalarization #134569

[DirectX] The dx.dot2/3/4 and dx.dot2add intrinsics leave vectors around post-scalarization #134569

Comments

bogner commented Apr 7, 2025

farzonl commented Apr 7, 2025

[DirectX] The `dx.dot2/3/4` and `dx.dot2add` intrinsics leave vectors around post-scalarization #134569

[DirectX] The `dx.dot2/3/4` and `dx.dot2add` intrinsics leave vectors around post-scalarization #134569