[X86] VNNI intrinsics argument types don't match the actual computation #97271

@RKSimon

For example: __m128i _mm_dpbusd_avx_epi32 (__m128i src, __m128i a, __m128i b)

This takes one <4 x i32> accumulator ("src") and two <16 x i8> multiplication inputs ("a" and "b"), but the clang/llvm builtin and intrinsic declare all three operands as <4 x i32>:

TARGET_BUILTIN(__builtin_ia32_vpdpbusd128, "V4iV4iV4iV4i", "ncV:128:", "avx512vl,avx512vnni|avxvnni")

  def int_x86_avx512_vpdpbusd_128 :
      ClangBuiltin<"__builtin_ia32_vpdpbusd128">,
      DefaultAttrsIntrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, llvm_v4i32_ty,
                             llvm_v4i32_ty], [IntrNoMem]>;

which means we require hardcoded mappings of the src/dst types for any combines that involve them.
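
For reference, the computation the intrinsic actually performs can be sketched as a scalar model in C. This is only an illustration of the documented non-saturating VPDPBUSD semantics (unsigned bytes from "a" times signed bytes from "b", four products summed into each 32-bit lane of "src"); the helper name `dpbusd128_ref` is hypothetical and not part of clang or LLVM.

```c
#include <stdint.h>

/* Hypothetical scalar reference for one 128-bit VPDPBUSD (non-saturating).
 * src/dst are 4 x i32 accumulators; a is 16 x u8, b is 16 x i8. */
static void dpbusd128_ref(int32_t dst[4], const int32_t src[4],
                          const uint8_t a[16], const int8_t b[16]) {
    for (int lane = 0; lane < 4; ++lane) {
        int32_t sum = 0;
        for (int k = 0; k < 4; ++k) {
            /* Zero-extend the byte from a, sign-extend the byte from b. */
            sum += (int32_t)a[4 * lane + k] * (int32_t)b[4 * lane + k];
        }
        /* Accumulate with wraparound (no saturation); the unsigned add
         * avoids signed-overflow undefined behavior in C. */
        dst[lane] = (int32_t)((uint32_t)src[lane] + (uint32_t)sum);
    }
}
```

The byte-typed `a` and `b` parameters in this model are what the <4 x i32> operand types in the builtin and intrinsic definitions hide.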
