Skip to content

Conversation

@rcorcs
Copy link
Contributor

@rcorcs rcorcs commented Apr 25, 2025

The code generation of createLoadImmediate for AArch64 was always emitting 4 instructions, regardless of the immediate value being loaded into the 64-bit register. This patch makes sure that only the necessary number of instructions are used depending on the value of the immediate being loaded into a register (ranging from 1 to 4 instructions).
The unit tests created help us to verify this new capability.

@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2025

@llvm/pr-subscribers-bolt

Author: Rodrigo Rocha (rcorcs)

Changes

The code generation of createLoadImmediate for AArch64 was always emitting 4 instructions, regardless of the immediate value being loaded into the 64-bit register. This patch makes sure that only the necessary number of instructions are used depending on the value of the immediate being loaded into a register (ranging from 1 to 4 instructions).
The unit tests created help us to verify this new capability.


Full diff: https://github.com/llvm/llvm-project/pull/137413.diff

2 Files Affected:

  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+20-8)
  • (modified) bolt/unittests/Core/MCPlusBuilder.cpp (+84)
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index e00d6a18b0f6c..0aa9504f50a15 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -2173,14 +2173,26 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 
   InstructionListType createLoadImmediate(const MCPhysReg Dest,
                                           uint64_t Imm) const override {
-    InstructionListType Insts(4);
-    int Shift = 48;
-    for (int I = 0; I < 4; I++, Shift -= 16) {
-      Insts[I].setOpcode(AArch64::MOVKXi);
-      Insts[I].addOperand(MCOperand::createReg(Dest));
-      Insts[I].addOperand(MCOperand::createReg(Dest));
-      Insts[I].addOperand(MCOperand::createImm((Imm >> Shift) & 0xFFFF));
-      Insts[I].addOperand(MCOperand::createImm(Shift));
+    InstructionListType Insts;
+    for (int I = 0, Shift = 0; I < 4; I++, Shift += 16) {
+      uint16_t HalfWord = (Imm >> Shift) & 0xFFFF;
+      if (!HalfWord)
+        continue;
+      MCInst Inst;
+      if (Insts.size() == 0) {
+        Inst.setOpcode(AArch64::MOVZXi);
+        Inst.addOperand(MCOperand::createReg(Dest));
+        Inst.addOperand(MCOperand::createImm(HalfWord));
+        Inst.addOperand(MCOperand::createImm(Shift));
+        Insts.push_back(Inst);
+      } else {
+        Inst.setOpcode(AArch64::MOVKXi);
+        Inst.addOperand(MCOperand::createReg(Dest));
+        Inst.addOperand(MCOperand::createReg(Dest));
+        Inst.addOperand(MCOperand::createImm(HalfWord));
+        Inst.addOperand(MCOperand::createImm(Shift));
+        Insts.push_back(Inst);
+      }
     }
     return Insts;
   }
diff --git a/bolt/unittests/Core/MCPlusBuilder.cpp b/bolt/unittests/Core/MCPlusBuilder.cpp
index 7016dec0e3574..ac0529cb09a7b 100644
--- a/bolt/unittests/Core/MCPlusBuilder.cpp
+++ b/bolt/unittests/Core/MCPlusBuilder.cpp
@@ -167,6 +167,90 @@ TEST_P(MCPlusBuilderTester, AArch64_CmpJNE) {
   ASSERT_EQ(Label, BB->getLabel());
 }
 
+TEST_P(MCPlusBuilderTester, AArch64_LoadImm32) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  InstructionListType Instrs = BC->MIB->createLoadImmediate(AArch64::X0, 2);
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+
+  ASSERT_EQ(BB->size(), 1);
+  auto II = BB->begin();
+  // mov x0, #2
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVZXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getImm(), 2);
+  ASSERT_EQ(II->getOperand(2).getImm(), 0);
+}
+
+TEST_P(MCPlusBuilderTester, AArch64_LoadImm64) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  int64_t Imm = ((uint64_t)4) << 48 | ((uint64_t)3) << 32 | 2 << 16 | 1;
+  InstructionListType Instrs = BC->MIB->createLoadImmediate(AArch64::X0, Imm);
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+
+  ASSERT_EQ(BB->size(), 4);
+  auto II = BB->begin();
+  // mov x0, #1
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVZXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getImm(), 1);
+  ASSERT_EQ(II->getOperand(2).getImm(), 0);
+  II++;
+  // movk x0, #2, lsl #16
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVKXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 2);
+  ASSERT_EQ(II->getOperand(3).getImm(), 16);
+  II++;
+  // movk x0, #3, lsl #32
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVKXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 3);
+  ASSERT_EQ(II->getOperand(3).getImm(), 32);
+  II++;
+  // movk x0, #4, lsl #48
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVKXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 4);
+  ASSERT_EQ(II->getOperand(3).getImm(), 48);
+}
+
+TEST_P(MCPlusBuilderTester, AArch64_LoadImm64Partial) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  int64_t Imm = ((uint64_t)4) << 48 | 2 << 16;
+  InstructionListType Instrs = BC->MIB->createLoadImmediate(AArch64::X0, Imm);
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+
+  ASSERT_EQ(BB->size(), 2);
+  auto II = BB->begin();
+  // mov x0, #2, lsl #16
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVZXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getImm(), 2);
+  ASSERT_EQ(II->getOperand(2).getImm(), 16);
+  II++;
+  // movk x0, #4, lsl #48
+  ASSERT_EQ(II->getOpcode(), AArch64::MOVKXi);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 4);
+  ASSERT_EQ(II->getOperand(3).getImm(), 48);
+}
+
 TEST_P(MCPlusBuilderTester, testAccessedRegsImplicitDef) {
   if (GetParam() != Triple::aarch64)
     GTEST_SKIP();

@maksfb
Copy link
Contributor

maksfb commented May 5, 2025

Hi Rodrigo, thanks for improving the codegen. While looking at LLVM AArch64 backend, I've noticed that there's AArch64_IMM::expandMOVImm() that handles more complex cases of immediate generation. It might be an overkill at this point and there's a level of post-processing required for converting SmallVectorImpl<ImmInsnModel> into a sequence of MCInst, but perhaps it's not that hard and worth the effort in the long run. The conversion into MachineInstr sequence is implemented in AArch64ExpandPseudo::expandMOVImm().

@github-actions
Copy link

github-actions bot commented May 15, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@rcorcs
Copy link
Contributor Author

rcorcs commented May 15, 2025

hi @maksfb , thank you for your suggestion. I've modified the implementation of AArch64MCPlusBuilder to make use of AArch64_IMM::expandMOVImm().
I think it makes sense to reuse the same implementation that is already used by LLVM's AArch64 backend.

@maksfb
Copy link
Contributor

maksfb commented May 20, 2025

Hi Rodrigo,

Please check the test failures.

Thanks for the new implementation! Optional suggestion: you can use MCInstBuilder to build instructions in a way similar to AArch64ExpandPseudo::expandMOVImm().

@rcorcs rcorcs requested a review from paschalis-mpeis as a code owner June 7, 2025 23:45
Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Rodrigo,

Thanks so much for your work on improving the AArch64 backend.
I'm currently running into some linking errors when trying out the patch:

ld.lld: error: undefined symbol: llvm::AArch64_IMM::expandMOVImm(unsigned long, unsigned int, llvm::SmallVectorImpl<llvm::AArch64_IMM::ImmInsnModel>&)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants