Conversation

@maryammo
Contributor

@maryammo maryammo commented Sep 5, 2025

Define the __dmr2048 type to represent the DMR pair introduced by the Dense Math
Facility on PowerPC, and add three Clang builtins corresponding to DMF
cryptography:

__builtin_mma_dmsha2hash
__builtin_mma_dmsha3hash
__builtin_mma_dmxxshapad

The __dmr2048 type is required for the dmsha3hash crypto builtin, and, as with
other PPC MMA and DMR types, its use is strongly restricted.
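
A minimal usage sketch for the three builtins, modeled on the test functions this patch adds to builtins-ppc-dmf.c. The immediate operands (1, 4, and 2/1/5) are copied from those tests; the PR does not say what they encode, so treat them as placeholders. The `HAVE_DMF_CRYPTO` macro, the `example_*` names, and the use of `__has_builtin` as a feature test are assumptions made for illustration, and the DMF code is compiled out on targets that lack the builtins.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __has_builtin is used here as a feature test so the
   sketch still compiles on targets without the DMF builtins. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_mma_dmsha2hash) && \
      __has_builtin(__builtin_mma_dmsha3hash) && \
      __has_builtin(__builtin_mma_dmxxshapad)
#    define HAVE_DMF_CRYPTO 1
#  endif
#endif
#ifndef HAVE_DMF_CRYPTO
#  define HAVE_DMF_CRYPTO 0
#endif

/* Portable helper: size in bytes of a DMR type given its width in bits. */
size_t example_dmr_bytes(size_t bits) {
  assert(bits % 8 == 0);
  return bits / 8;
}

#if HAVE_DMF_CRYPTO
/* The CodeGen tests in this patch expect 256 for sizeof(__dmr2048). */
_Static_assert(sizeof(__dmr2048) == 256, "__dmr2048 is 2048 bits wide");

/* The generated loads and stores use 128-byte (__dmr1024) and 256-byte
   (__dmr2048) alignment, so callers should pass suitably aligned buffers. */
void example_dmsha2hash(unsigned char *acc_p, unsigned char *in_p) {
  __dmr1024 acc = *((__dmr1024 *)acc_p);
  __dmr1024 in = *((__dmr1024 *)in_p);
  __builtin_mma_dmsha2hash(&acc, &in, 1); /* updates acc in place */
  *((__dmr1024 *)acc_p) = acc;
}

void example_dmsha3hash(unsigned char *pair_p) {
  __dmr2048 pair = *((__dmr2048 *)pair_p);
  __builtin_mma_dmsha3hash(&pair, 4); /* operates on the DMR pair */
  *((__dmr2048 *)pair_p) = pair;
}

void example_dmxxshapad(unsigned char *dmr_p, __vector unsigned char vc) {
  __dmr1024 dmr = *((__dmr1024 *)dmr_p);
  __builtin_mma_dmxxshapad(&dmr, vc, 2, 1, 5); /* three immediates */
  *((__dmr1024 *)dmr_p) = dmr;
}
#endif
```

Each builtin takes a pointer to the DMR value it updates, which is why the sketch copies the data into a local, calls the builtin on its address, and stores the result back.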

@maryammo maryammo requested review from RolandF77 and lei137 September 5, 2025 17:55
@maryammo maryammo self-assigned this Sep 5, 2025
@maryammo maryammo added the clang Clang issues not falling into any other category label Sep 5, 2025
@llvmbot llvmbot added lldb clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 5, 2025
@maryammo maryammo removed the request for review from JDevlieghere September 5, 2025 17:55
@llvmbot
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-clang

Author: Maryam Moghadas (maryammo)

Changes

Patch is 22.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157152.diff

11 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsPPC.def (+7)
  • (modified) clang/include/clang/Basic/PPCTypes.def (+1)
  • (modified) clang/lib/AST/ASTContext.cpp (+1)
  • (modified) clang/lib/CodeGen/TargetBuiltins/PPC.cpp (+2-1)
  • (modified) clang/test/AST/ast-dump-ppc-types.c (+2)
  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c (+41)
  • (modified) clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c (+8-1)
  • (modified) clang/test/CodeGen/PowerPC/ppc-dmf-types.c (+156)
  • (modified) clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp (+3)
  • (modified) clang/test/Sema/ppc-dmf-types.c (+100-14)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+1)
diff --git a/clang/include/clang/Basic/BuiltinsPPC.def b/clang/include/clang/Basic/BuiltinsPPC.def
index 22926b6a7d095..017ae65bdafff 100644
--- a/clang/include/clang/Basic/BuiltinsPPC.def
+++ b/clang/include/clang/Basic/BuiltinsPPC.def
@@ -1123,6 +1123,13 @@ UNALIASED_CUSTOM_MMA_BUILTIN(mma_xvbf16ger2, "vW512*VV",
 UNALIASED_CUSTOM_MMA_BUILTIN(mma_pmxvbf16ger2, "vW512*VVi15i15i3",
                              "mma,paired-vector-memops")
 
+UNALIASED_CUSTOM_BUILTIN(mma_dmsha2hash, "vW1024*W1024*Ii", true,
+                         "mma,isa-future-instructions")
+UNALIASED_CUSTOM_BUILTIN(mma_dmsha3hash, "vW2048*Ii", true,
+                         "mma,isa-future-instructions")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxxshapad, "vW1024*VIiIiIi", true,
+                         "mma,isa-future-instructions")
+
 // FIXME: Obviously incomplete.
 
 #undef BUILTIN
diff --git a/clang/include/clang/Basic/PPCTypes.def b/clang/include/clang/Basic/PPCTypes.def
index fc4155ca98b2d..9c0fa9198d5b1 100644
--- a/clang/include/clang/Basic/PPCTypes.def
+++ b/clang/include/clang/Basic/PPCTypes.def
@@ -30,6 +30,7 @@
 #endif
 
 
+PPC_VECTOR_MMA_TYPE(__dmr2048, DMR2048, 2048)
 PPC_VECTOR_MMA_TYPE(__dmr1024, DMR1024, 1024)
 PPC_VECTOR_MMA_TYPE(__vector_quad, VectorQuad, 512)
 PPC_VECTOR_VSX_TYPE(__vector_pair, VectorPair, 256)
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index dca05b41aee77..a5ead63f99680 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -3519,6 +3519,7 @@ static void encodeTypeForFunctionPointerAuth(const ASTContext &Ctx,
     case BuiltinType::VectorQuad:
     case BuiltinType::VectorPair:
     case BuiltinType::DMR1024:
+    case BuiltinType::DMR2048:
       OS << "?";
       return;
 
diff --git a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
index ba65cf1ce9b90..e71dc9ea523a2 100644
--- a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
@@ -1153,7 +1153,8 @@ Value *CodeGenFunction::EmitPPCBuiltinExpr(unsigned BuiltinID,
     }
     if (BuiltinID == PPC::BI__builtin_mma_dmmr ||
         BuiltinID == PPC::BI__builtin_mma_dmxor ||
-        BuiltinID == PPC::BI__builtin_mma_disassemble_dmr) {
+        BuiltinID == PPC::BI__builtin_mma_disassemble_dmr ||
+        BuiltinID == PPC::BI__builtin_mma_dmsha2hash) {
       Address Addr = EmitPointerWithAlignment(E->getArg(1));
       Ops[1] = Builder.CreateLoad(Addr);
     }
diff --git a/clang/test/AST/ast-dump-ppc-types.c b/clang/test/AST/ast-dump-ppc-types.c
index 1c860c268e0ec..25f36f64dde79 100644
--- a/clang/test/AST/ast-dump-ppc-types.c
+++ b/clang/test/AST/ast-dump-ppc-types.c
@@ -17,6 +17,8 @@
 // are correctly defined. We also added checks on a couple of other targets to
 // ensure the types are target-dependent.
 
+// CHECK: TypedefDecl {{.*}} implicit __dmr2048 '__dmr2048'
+//  CHECK: `-BuiltinType {{.*}} '__dmr2048'
 // CHECK: TypedefDecl {{.*}} implicit __dmr1024 '__dmr1024'
 // CHECK: `-BuiltinType {{.*}} '__dmr1024'
 // CHECK: TypedefDecl {{.*}} implicit __vector_quad '__vector_quad'
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
index c66f5e2a32919..e0d709802d876 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
@@ -126,3 +126,44 @@ void test_dmf_basic2(char *p1, char *res1, char *res2,
   __builtin_mma_build_dmr((__dmr1024*)res2, vv, vv, vv, vv, vv, vv, vv, vv);
   __builtin_mma_disassemble_dmr(res1, (__dmr1024*)p1);
 }
+
+// CHECK-LABEL: @test_dmsha2hash(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP1:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load <1024 x i1>, ptr [[VDMRP2:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmsha2hash(<1024 x i1> [[TMP0]], <1024 x i1> [[TMP1]], i32 1)
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_dmsha2hash(unsigned char *vdmrp1, unsigned char *vdmrp2, unsigned char *resp) {
+  __dmr1024 vdmr1 = *((__dmr1024 *)vdmrp1);
+  __dmr1024 vdmr2 = *((__dmr1024 *)vdmrp2);
+  __builtin_mma_dmsha2hash(&vdmr1, &vdmr2, 1);
+  *((__dmr1024 *)resp) = vdmr1;
+}
+
+// CHECK-LABEL: @test_dmsha3hash(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2048 x i1>, ptr [[VDMRPP:%.*]], align 256, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <2048 x i1> @llvm.ppc.mma.dmsha3hash(<2048 x i1> [[TMP0]], i32 4)
+// CHECK-NEXT:    store <2048 x i1> [[TMP1]], ptr [[RESP:%.*]], align 256, !tbaa [[TBAA9]]
+// CHECK-NEXT:    ret void
+//
+void test_dmsha3hash(unsigned char *vdmrpp,  unsigned char *resp) {
+  __dmr2048 vdmrp = *((__dmr2048 *)vdmrpp);
+  __builtin_mma_dmsha3hash(&vdmrp, 4);
+  *((__dmr2048 *)resp) = vdmrp;
+}
+
+// CHECK-LABEL: @test_dmxxshapad(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxxshapad(<1024 x i1> [[TMP0]], <16 x i8> [[VC:%.*]], i32 2, i32 1, i32 5)
+// CHECK-NEXT:    store <1024 x i1> [[TMP1]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_dmxxshapad(unsigned char *vdmrp, vector unsigned char vc, unsigned char *resp) {
+  __dmr1024 vdmr = *((__dmr1024 *)vdmrp);
+  __builtin_mma_dmxxshapad(&vdmr, vc, 2, 1, 5);
+  *((__dmr1024 *)resp) = vdmr;
+}
diff --git a/clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c b/clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c
index ea2b99b0e5b20..98e9eb437742a 100644
--- a/clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c
+++ b/clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c
@@ -4,7 +4,8 @@
 // RUN:   %s -emit-llvm-only 2>&1 | FileCheck %s
 
 __attribute__((target("no-mma")))
-void test_mma(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc) {
+void test_mma(unsigned char *vdmrpp, unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc) {
+  __dmr2048 vdmrpair = *((__dmr2048 *)vdmrpp);
   __dmr1024 vdmr = *((__dmr1024 *)vdmrp);
   __vector_pair vp = *((__vector_pair *)vpp);
   __builtin_mma_dmxvi8gerx4(&vdmr, vp, vc);
@@ -18,6 +19,9 @@ void test_mma(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc)
   __builtin_mma_dmxor(&vdmr, (__dmr1024*)vpp);
   __builtin_mma_build_dmr(&vdmr, vc, vc, vc, vc, vc, vc, vc, vc);
   __builtin_mma_disassemble_dmr(vdmrp, &vdmr);
+  __builtin_mma_dmsha2hash(&vdmr, &vdmr, 0);
+  __builtin_mma_dmsha3hash(&vdmrpair, 0);
+  __builtin_mma_dmxxshapad(&vdmr, vc, 0, 0, 0);
 
 // CHECK: error: '__builtin_mma_dmxvi8gerx4' needs target feature mma,paired-vector-memops
 // CHECK: error: '__builtin_mma_pmdmxvi8gerx4' needs target feature mma,paired-vector-memops
@@ -30,4 +34,7 @@ void test_mma(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc)
 // CHECK: error: '__builtin_mma_dmxor' needs target feature mma,isa-future-instructions
 // CHECK: error: '__builtin_mma_build_dmr' needs target feature mma,isa-future-instructions
 // CHECK: error: '__builtin_mma_disassemble_dmr' needs target feature mma,isa-future-instructions
+// CHECK: error: '__builtin_mma_dmsha2hash' needs target feature mma,isa-future-instructions
+// CHECK: error: '__builtin_mma_dmsha3hash' needs target feature mma,isa-future-instructions
+// CHECK: error: '__builtin_mma_dmxxshapad' needs target feature mma,isa-future-instructions
 }
diff --git a/clang/test/CodeGen/PowerPC/ppc-dmf-types.c b/clang/test/CodeGen/PowerPC/ppc-dmf-types.c
index 9dff289370eb5..fbbe62133763e 100644
--- a/clang/test/CodeGen/PowerPC/ppc-dmf-types.c
+++ b/clang/test/CodeGen/PowerPC/ppc-dmf-types.c
@@ -2,6 +2,162 @@
 // RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu future \
 // RUN:   -emit-llvm -o - %s | FileCheck %s
 
+// CHECK-LABEL: @test_dmrp_copy(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR1_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR2_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[PTR1:%.*]], ptr [[PTR1_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR2:%.*]], ptr [[PTR2_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR1_ADDR]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds <2048 x i1>, ptr [[TMP0]], i64 2
+// CHECK-NEXT:    [[TMP1:%.*]] = load <2048 x i1>, ptr [[ADD_PTR]], align 256
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[PTR2_ADDR]], align 8
+// CHECK-NEXT:    [[ADD_PTR1:%.*]] = getelementptr inbounds <2048 x i1>, ptr [[TMP2]], i64 1
+// CHECK-NEXT:    store <2048 x i1> [[TMP1]], ptr [[ADD_PTR1]], align 256
+// CHECK-NEXT:    ret void
+//
+void test_dmrp_copy(__dmr2048 *ptr1, __dmr2048 *ptr2) {
+  *(ptr2 + 1) = *(ptr1 + 2);
+}
+
+// CHECK-LABEL: @test_dmrp_typedef(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[INP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[OUTP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPIN:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPOUT:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[INP:%.*]], ptr [[INP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[OUTP:%.*]], ptr [[OUTP_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[INP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPIN]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[OUTP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP1]], ptr [[VDMRPOUT]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[VDMRPIN]], align 8
+// CHECK-NEXT:    [[TMP3:%.*]] = load <2048 x i1>, ptr [[TMP2]], align 256
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[VDMRPOUT]], align 8
+// CHECK-NEXT:    store <2048 x i1> [[TMP3]], ptr [[TMP4]], align 256
+// CHECK-NEXT:    ret void
+//
+void test_dmrp_typedef(int *inp, int *outp) {
+  __dmr2048 *vdmrpin = (__dmr2048 *)inp;
+  __dmr2048 *vdmrpout = (__dmr2048 *)outp;
+  *vdmrpout = *vdmrpin;
+}
+
+// CHECK-LABEL: @test_dmrp_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMRP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMRP:%.*]], ptr [[VDMRP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRP_ADDR]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <2048 x i1>, ptr [[TMP1]], align 256
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    store <2048 x i1> [[TMP2]], ptr [[TMP3]], align 256
+// CHECK-NEXT:    ret void
+//
+void test_dmrp_arg(__dmr2048 *vdmrp, int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  *vdmrpp = *vdmrp;
+}
+
+// CHECK-LABEL: @test_dmrp_const_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMRP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMRP:%.*]], ptr [[VDMRP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRP_ADDR]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <2048 x i1>, ptr [[TMP1]], align 256
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    store <2048 x i1> [[TMP2]], ptr [[TMP3]], align 256
+// CHECK-NEXT:    ret void
+//
+void test_dmrp_const_arg(const __dmr2048 *const vdmrp, int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  *vdmrpp = *vdmrp;
+}
+
+// CHECK-LABEL: @test_dmrp_array_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMRPA_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMRPA:%.*]], ptr [[VDMRPA_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRPA_ADDR]], align 8
+// CHECK-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds <2048 x i1>, ptr [[TMP1]], i64 0
+// CHECK-NEXT:    [[TMP2:%.*]] = load <2048 x i1>, ptr [[ARRAYIDX]], align 256
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    store <2048 x i1> [[TMP2]], ptr [[TMP3]], align 256
+// CHECK-NEXT:    ret void
+//
+void test_dmrp_array_arg(__dmr2048 vdmrpa[], int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  *vdmrpp = vdmrpa[0];
+}
+
+// CHECK-LABEL: @test_dmrp_ret_const(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds <2048 x i1>, ptr [[TMP1]], i64 2
+// CHECK-NEXT:    ret ptr [[ADD_PTR]]
+//
+const __dmr2048 *test_dmrp_ret_const(int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  return vdmrpp + 2;
+}
+
+// CHECK-LABEL: @test_dmrp_sizeof_alignof(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRPP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca <2048 x i1>, align 256
+// CHECK-NEXT:    [[SIZET:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[ALIGNT:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[SIZEV:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[ALIGNV:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRPP]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <2048 x i1>, ptr [[TMP1]], align 256
+// CHECK-NEXT:    store <2048 x i1> [[TMP2]], ptr [[VDMRP]], align 256
+// CHECK-NEXT:    store i32 256, ptr [[SIZET]], align 4
+// CHECK-NEXT:    store i32 256, ptr [[ALIGNT]], align 4
+// CHECK-NEXT:    store i32 256, ptr [[SIZEV]], align 4
+// CHECK-NEXT:    store i32 256, ptr [[ALIGNV]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[SIZET]], align 4
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[ALIGNT]], align 4
+// CHECK-NEXT:    [[ADD:%.*]] = add i32 [[TMP3]], [[TMP4]]
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[SIZEV]], align 4
+// CHECK-NEXT:    [[ADD1:%.*]] = add i32 [[ADD]], [[TMP5]]
+// CHECK-NEXT:    [[TMP6:%.*]] = load i32, ptr [[ALIGNV]], align 4
+// CHECK-NEXT:    [[ADD2:%.*]] = add i32 [[ADD1]], [[TMP6]]
+// CHECK-NEXT:    ret i32 [[ADD2]]
+//
+int test_dmrp_sizeof_alignof(int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  __dmr2048 vdmrp = *vdmrpp;
+  unsigned sizet = sizeof(__dmr2048);
+  unsigned alignt = __alignof__(__dmr2048);
+   unsigned sizev = sizeof(vdmrp);
+  unsigned alignv = __alignof__(vdmrp);
+  return sizet + alignt + sizev + alignv;
+}
 
 // CHECK-LABEL: @test_dmr_copy(
 // CHECK-NEXT:  entry:
diff --git a/clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp b/clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp
index 1e213e7f75127..6b792dceba2c6 100644
--- a/clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp
+++ b/clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp
@@ -7,6 +7,9 @@
 // RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr8 %s \
 // RUN:   -emit-llvm -o - | FileCheck %s
 
+// CHECK: _Z1fPu9__dmr2048
+void f(__dmr2048 *vdmrp) {}
+
 // CHECK: _Z2f0Pu9__dmr1024
 void f0(__dmr1024 *vdmr) {}
 
diff --git a/clang/test/Sema/ppc-dmf-types.c b/clang/test/Sema/ppc-dmf-types.c
index b3da72df25081..88926acf2d3fb 100644
--- a/clang/test/Sema/ppc-dmf-types.c
+++ b/clang/test/Sema/ppc-dmf-types.c
@@ -12,47 +12,86 @@
 
 // typedef
 typedef __dmr1024 dmr_t;
+typedef __dmr2048 dmrp_t;
 
 // function argument
-void testDmrArg1(__dmr1024 vdmr, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vdmrp = (__dmr1024 *)ptr;
+void testDmrArg1(dmr_t vdmr, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmr_t *vdmrp = (dmr_t *)ptr;
   *vdmrp = vdmr;
 }
 
-void testDmrArg2(const __dmr1024 vdmr, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vdmrp = (__dmr1024 *)ptr;
+void testDmrArg2(const dmr_t vdmr, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmr_t *vdmrp = (dmr_t *)ptr;
   *vdmrp = vdmr;
 }
 
 void testDmrArg3(const dmr_t vdmr, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vdmrp = (__dmr1024 *)ptr;
+  dmr_t *vdmrp = (dmr_t *)ptr;
   *vdmrp = vdmr;
 }
 
+void testDmrPArg1(const dmrp_t vdmrp, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vdmrpp = (dmrp_t *)ptr;
+  *vdmrpp = vdmrp;
+}
+
+void testDmrPArg2(const dmrp_t vdmrp, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vdmrpp = (dmrp_t *)ptr;
+  *vdmrpp = vdmrp;
+}
+
+void testDmrPArg3(const dmrp_t vdmrp, int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vdmrpp = (dmrp_t *)ptr;
+  *vdmrpp = vdmrp;
+}
+
 // function return
-__dmr1024 testDmrRet1(int *ptr) { // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vdmrp = (__dmr1024 *)ptr;
+dmr_t testDmrRet1(int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmr_t *vdmrp = (dmr_t *)ptr;
   return *vdmrp; // expected-error {{invalid use of PPC MMA type}}
 }
 
 const dmr_t testDmrRet4(int *ptr) { // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vdmrp = (__dmr1024 *)ptr;
+  dmr_t *vdmrp = (dmr_t *)ptr;
   return *vdmrp; // expected-error {{invalid use of PPC MMA type}}
 }
 
+dmrp_t testDmrPRet1(int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vdmrpp = (dmrp_t *)ptr;
+  return *vdmrpp; // expected-error {{invalid use of PPC MMA type}}
+}
+
+const dmrp_t testDmrPRet4(int *ptr) { // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vdmrpp = (dmrp_t *)ptr;
+  return *vdmrpp; // expected-error {{invalid use of PPC MMA type}}
+}
+
 // global
-__dmr1024 globalvdmr;        // expected-error {{invalid use of PPC MMA type}}
-const __dmr1024 globalvdmr2; // expected-error {{invalid use of PPC MMA type}}
-__dmr1024 *globalvdmrp;
-const __dmr1024 *const globalvdmrp2;
+dmr_t globalvdmr;        // expected-error {{invalid use of PPC MMA type}}
+const dmr_t globalvdmr2; // expected-error {{invalid use of PPC MMA type}}
+dmr_t *globalvdmrp;
+const dmr_t *const globalvdmrp2;
 dmr_t globalvdmr_t; // expected-error {{invalid use of PPC MMA type}}
 
+dmrp_t globalvdmrp;        // expected-error {{invalid use of PPC MMA type}}
+const dmrp_t globalvdmrp2; // expected-error {{invalid use of PPC MMA type}}
+dmrp_t *globalvdmrpp;
+const dmrp_t *const globalvdmrpp2;
+dmrp_t globalvdmrp_t; // expected-error {{invalid use of PPC MMA type}}
+
 // struct field
 struct TestDmrStruct {
   int a;
   float b;
-  __dmr1024 c; // expected-error {{invalid use of PPC MMA type}}
-  __dmr1024 *vq;
+  dmr_t c; // expected-error {{invalid use of PPC MMA type}}
+  dmr_t *vq;
+};
+
+struct TestDmrPStruct {
+  int a;
+  float b;
+  dmrp_t c; // expected-error {{invalid use of PPC MMA type}}
+  dmrp_t *vq;
 };
 
 // operators
@@ -101,3 +140,50 @@ void testDmrOperators4(int v, void *ptr) {
   __dmr1024 vdmr1 = (__dmr1024)v;   // expected-error {{used type '__dmr1024' where arithmetic or pointer type is required}}
   __dmr1024 vdmr2 = (__dmr1024)vdmrp; // expected-error {{used type '__dmr1024' where arithmetic or pointer type is required}}
 }
+
+int testDmrPOperators1(int *ptr) {
+  __dmr2048 *vdmrpp = (__dmr2048 *)ptr;
+  __dmr2048 vdmrp1 = *(vdmrpp + 0);
+  __dmr2048 vdmrp2 = *(vdmrpp + 1);
+  __dmr2048 vdmrp3 = *(vdmrpp + 2);
+  if (vdmrp1)...
[truncated]

Contributor

@lei137 lei137 left a comment

I see a lot of tests that identify where this type is not valid. I think it would be good to add to the PR description where and how this new type can be used.
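
A rough sketch of the uses the tests in this patch accept, pieced together from the CodeGen and Sema versions of ppc-dmf-types.c: typedefs, pointers (as parameters, return values, globals, and struct fields), copies through pointers, local variables, and `sizeof`/`__alignof__`. By-value parameters, return values, globals, and struct fields are the cases rejected with "invalid use of PPC MMA type". The `example_*` names and the `EXAMPLE_HAS_DMR2048` macro are hypothetical, and the fallback struct is only a stand-in so the sketch builds on other targets.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the dmsha3hash builtin is used as a proxy feature test
   for __dmr2048; the PR does not define a dedicated check. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_mma_dmsha3hash)
#    define EXAMPLE_HAS_DMR2048 1
#  endif
#endif

#ifdef EXAMPLE_HAS_DMR2048
typedef __dmr2048 dmrp_t; /* accepted: typedef of the type */
#else
/* Hypothetical stand-in: 256 bytes, 256-byte aligned, matching the
   sizeof/alignof values the CodeGen test expects for __dmr2048. */
typedef struct { _Alignas(256) unsigned char bytes[256]; } dmrp_t;
#endif

/* Accepted: pointers as globals and as struct fields. */
dmrp_t *example_global_ptr;
struct example_holder {
  int tag;
  dmrp_t *pair; /* a by-value 'dmrp_t pair;' field is rejected */
};

/* Accepted: pointer parameters (const or not) and copies through them. */
void example_copy(dmrp_t *dst, const dmrp_t *src) {
  assert(dst != NULL && src != NULL);
  *dst = *src;
}

/* Accepted: returning a pointer and doing pointer arithmetic. */
const dmrp_t *example_next(const dmrp_t *p) { return p + 1; }

/* Accepted: a local variable of the type, and sizeof on it. */
size_t example_size(const dmrp_t *src) {
  dmrp_t local = *src;
  return sizeof(local);
}
```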

"mma,isa-future-instructions")
UNALIASED_CUSTOM_BUILTIN(mma_dmsha3hash, "vW2048*Ii", true,
"mma,isa-future-instructions")
UNALIASED_CUSTOM_BUILTIN(mma_dmxxshapad, "vW1024*VIiIiIi", true,
Contributor

It would be good to keep these where the other "CUSTOM_BUILTIN(mma*" entries are defined, after line 1106.

@@ -17,6 +17,8 @@
// are correctly defined. We also added checks on a couple of other targets to
// ensure the types are target-dependent.

// CHECK: TypedefDecl {{.*}} implicit __dmr2048 '__dmr2048'
// CHECK: `-BuiltinType {{.*}} '__dmr2048'
Contributor

nit: indent is off.

@maryammo maryammo requested a review from lei137 September 26, 2025 17:18
Contributor

@lei137 lei137 left a comment

LGTM
Thx

@maryammo maryammo force-pushed the dmr2048/crypto_builtins branch from 92fffaa to a2b57e9 Compare September 29, 2025 23:19
@maryammo
Contributor Author

After rebasing to ToT to resolve the conflicts in
clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
clang/test/CodeGen/PowerPC/ppc-dmf-mma-builtin-err.c
I ran into a build failure because the patch hits the limit of predefined type IDs in Clang's serialization, so I am increasing the constant NUM_PREDEF_TYPE_IDS to 514 in clang/include/clang/Serialization/ASTBitCodes.h in the next commit.
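
For readers unfamiliar with this kind of failure, here is a toy model of the general pattern: a fixed number of IDs is reserved for predefined types, and a compile-time check fails once a new builtin type pushes the last ID past that reservation. This is not Clang's code; every `TOY_*` and `toy_*` name is hypothetical, and the numbers are invented for the example.

```c
#include <assert.h>

/* Toy list of predefined type IDs; a new builtin type adds one entry. */
enum toy_predef_type_id {
  TOY_ID_NULL = 0,
  TOY_ID_VOID,
  TOY_ID_INT,
  TOY_ID_VECTOR_PAIR,
  TOY_ID_VECTOR_QUAD,
  TOY_ID_DMR1024,
  TOY_ID_DMR2048, /* newly added type takes one more ID */
  TOY_ID_LAST = TOY_ID_DMR2048
};

/* Reserved range for predefined IDs. With a value of 6 the check below
   would have passed before TOY_ID_DMR2048 existed and failed after it
   was added, so the reservation is raised to 7. */
#define TOY_NUM_PREDEF_TYPE_IDS 7

_Static_assert(TOY_ID_LAST < TOY_NUM_PREDEF_TYPE_IDS,
               "too many predefined type IDs; raise the limit");

/* IDs for all other types start right after the reserved range. */
unsigned toy_nth_user_type_id(unsigned n) {
  unsigned id = TOY_NUM_PREDEF_TYPE_IDS + n;
  assert(id >= TOY_NUM_PREDEF_TYPE_IDS); /* guards against wrap-around */
  return id;
}
```

Raising the constant is the small fix; in this model the trade-off is that every ID after the reserved range shifts by one.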

@llvmbot llvmbot added the clang:modules C++20 modules and Clang Header Modules label Sep 29, 2025
@maryammo maryammo merged commit ff14953 into llvm:main Sep 30, 2025
10 checks passed
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025

Define the __dmr2048 type to represent the DMR pair introduced by the
Dense Math Facility on PowerPC, and add three Clang builtins
corresponding to DMF cryptography:

__builtin_mma_dmsha2hash
__builtin_mma_dmsha3hash
__builtin_mma_dmxxshapad

The __dmr2048 type is required for the dmsha3hash crypto builtin, and,
as with other PPC MMA and DMR types, its use is strongly restricted.

Labels

backend:PowerPC clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category lldb

3 participants