[LoongArch] Fix GOT usage for non-dso_local function calls in large code model #117099

@wangleiat wangleiat commented Nov 21, 2024

This commit fixes an issue in the large code model where non-dso_local
function calls did not use the GOT as expected in PIC mode. Instead,
direct PC-relative access was incorrectly applied, leading to linker
errors when building shared libraries.

An ExternalSymbol operand has no GlobalValue to query, so pseudo-instruction
expansion cannot determine whether the callee is dso_local. The expansion now
checks the operand's target flags (MO_CALL_PLT) to decide whether the GOT
should be used.
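
As a rough illustration of why the old predicate missed libcalls, here is a minimal, self-contained C++ model of the two checks. `CallOperand`, `OperandKind`, and `TargetFlag` are hypothetical stand-ins for LLVM's `MachineOperand` and the `LoongArchII` flags, not the real API, and the sketch assumes the `MO_CALL_PLT` flag has already been attached to the operand earlier in the pipeline (for example during call lowering).

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for LLVM types; not the real API.
enum class OperandKind { GlobalAddress, ExternalSymbol };
enum class TargetFlag { MO_NOFLAG, MO_CALL, MO_CALL_PLT };

struct CallOperand {
  OperandKind Kind;
  bool IsDSOLocal;   // only meaningful for GlobalAddress operands
  TargetFlag Flags;  // assumed to be set before pseudo expansion runs
};

// Pre-patch check, modeled on:
//   Func.isGlobal() && !Func.getGlobal()->isDSOLocal()
bool useGOTBeforePatch(const CallOperand &Func) {
  return Func.Kind == OperandKind::GlobalAddress && !Func.IsDSOLocal;
}

// Post-patch check, modeled on:
//   Func.getTargetFlags() == LoongArchII::MO_CALL_PLT
bool useGOTAfterPatch(const CallOperand &Func) {
  return Func.Flags == TargetFlag::MO_CALL_PLT;
}

// A libcall such as memset reaches expansion as an external symbol.
void demoExternalSymbol() {
  CallOperand MemsetCall{OperandKind::ExternalSymbol, false,
                         TargetFlag::MO_CALL_PLT};
  assert(!useGOTBeforePatch(MemsetCall)); // old check: PC-relative access
  assert(useGOTAfterPatch(MemsetCall));   // new check: goes through the GOT
}
```

In this model the external-symbol operand never satisfies the old check because it is not a global, which matches the `memset` and `__tls_get_addr` sequences that change in the tests.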

Created using spr 1.3.5-bogner
llvmbot commented Nov 21, 2024

@llvm/pr-subscribers-backend-loongarch

Author: wanglei (wangleiat)

Full diff: https://github.com/llvm/llvm-project/pull/117099.diff

6 Files Affected:

  • (modified) llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp (+1-1)
  • (modified) llvm/test/CodeGen/LoongArch/code-models.ll (+5-5)
  • (modified) llvm/test/CodeGen/LoongArch/expand-adjacency.ll (+4-4)
  • (modified) llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll (+10-10)
  • (modified) llvm/test/CodeGen/LoongArch/psabi-restricted-scheduling.ll (+20-20)
  • (modified) llvm/test/CodeGen/LoongArch/tls-models.ll (+10-10)
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
index b7b8987e4084b6..30742c79653b52 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
@@ -564,7 +564,7 @@ bool LoongArchPreRAExpandPseudo::expandFunctionCALL(
             ? MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass)
             : LoongArch::R1;
 
-    bool UseGOT = Func.isGlobal() && !Func.getGlobal()->isDSOLocal();
+    bool UseGOT = Func.getTargetFlags() == LoongArchII::MO_CALL_PLT;
     unsigned MO = UseGOT ? LoongArchII::MO_GOT_PC_HI : LoongArchII::MO_PCREL_LO;
     unsigned LAOpcode = UseGOT ? LoongArch::LDX_D : LoongArch::ADD_D;
     expandLargeAddressLoad(MBB, MBBI, NextMBBI, LAOpcode, MO, Func, AddrReg,
diff --git a/llvm/test/CodeGen/LoongArch/code-models.ll b/llvm/test/CodeGen/LoongArch/code-models.ll
index 14bd0f4df47109..c0120688623343 100644
--- a/llvm/test/CodeGen/LoongArch/code-models.ll
+++ b/llvm/test/CodeGen/LoongArch/code-models.ll
@@ -82,11 +82,11 @@ define void @call_external_sym(ptr %dst) {
 ; LARGE-NEXT:    .cfi_offset 1, -8
 ; LARGE-NEXT:    ori $a2, $zero, 1000
 ; LARGE-NEXT:    move $a1, $zero
-; LARGE-NEXT:    pcalau12i $a3, %pc_hi20(memset)
-; LARGE-NEXT:    addi.d $ra, $zero, %pc_lo12(memset)
-; LARGE-NEXT:    lu32i.d $ra, %pc64_lo20(memset)
-; LARGE-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(memset)
-; LARGE-NEXT:    add.d $ra, $ra, $a3
+; LARGE-NEXT:    pcalau12i $a3, %got_pc_hi20(memset)
+; LARGE-NEXT:    addi.d $ra, $zero, %got_pc_lo12(memset)
+; LARGE-NEXT:    lu32i.d $ra, %got64_pc_lo20(memset)
+; LARGE-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(memset)
+; LARGE-NEXT:    ldx.d $ra, $ra, $a3
 ; LARGE-NEXT:    jirl $ra, $ra, 0
 ; LARGE-NEXT:    ld.d $ra, $sp, 8 # 8-byte Folded Reload
 ; LARGE-NEXT:    addi.d $sp, $sp, 16
diff --git a/llvm/test/CodeGen/LoongArch/expand-adjacency.ll b/llvm/test/CodeGen/LoongArch/expand-adjacency.ll
index 154d2121a6321e..b00cf2c519a151 100644
--- a/llvm/test/CodeGen/LoongArch/expand-adjacency.ll
+++ b/llvm/test/CodeGen/LoongArch/expand-adjacency.ll
@@ -14,10 +14,10 @@ declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)
 
 define void @call_external_sym(ptr %dst) {
 ; LARGE-LABEL: call_external_sym:
-; LARGE:         pcalau12i [[REG1:\$[a-z0-9]+]], %pc_hi20(memset)
-; LARGE-NEXT:    addi.d [[REG2:\$[a-z0-9]+]], $zero, %pc_lo12(memset)
-; LARGE-NEXT:    lu32i.d [[REG2]], %pc64_lo20(memset)
-; LARGE-NEXT:    lu52i.d [[REG2]], [[REG2]], %pc64_hi12(memset)
+; LARGE:         pcalau12i [[REG1:\$[a-z0-9]+]], %got_pc_hi20(memset)
+; LARGE-NEXT:    addi.d [[REG2:\$[a-z0-9]+]], $zero, %got_pc_lo12(memset)
+; LARGE-NEXT:    lu32i.d [[REG2]], %got64_pc_lo20(memset)
+; LARGE-NEXT:    lu52i.d [[REG2]], [[REG2]], %got64_pc_hi12(memset)
 entry:
   call void @llvm.memset.p0.i64(ptr %dst, i8 0, i64 1000, i1 false)
   ret void
diff --git a/llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll b/llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll
index ba72ef5bd7ba4b..fc0c7ad1686ee8 100644
--- a/llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll
+++ b/llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll
@@ -279,11 +279,11 @@ define void @test_la_tls_ld(i32 signext %n) {
 ; LA64LARGE-NEXT:  .LBB3_1: # %loop
 ; LA64LARGE-NEXT:    # =>This Inner Loop Header: Depth=1
 ; LA64LARGE-NEXT:    move $a0, $s0
-; LA64LARGE-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LA64LARGE-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LA64LARGE-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LA64LARGE-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LA64LARGE-NEXT:    add.d $ra, $ra, $a1
+; LA64LARGE-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LA64LARGE-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LA64LARGE-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LA64LARGE-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LA64LARGE-NEXT:    ldx.d $ra, $ra, $a1
 ; LA64LARGE-NEXT:    jirl $ra, $ra, 0
 ; LA64LARGE-NEXT:    ld.w $zero, $a0, 0
 ; LA64LARGE-NEXT:    addi.w $s1, $s1, 1
@@ -445,11 +445,11 @@ define void @test_la_tls_gd(i32 signext %n) nounwind {
 ; LA64LARGE-NEXT:  .LBB5_1: # %loop
 ; LA64LARGE-NEXT:    # =>This Inner Loop Header: Depth=1
 ; LA64LARGE-NEXT:    move $a0, $s0
-; LA64LARGE-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LA64LARGE-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LA64LARGE-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LA64LARGE-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LA64LARGE-NEXT:    add.d $ra, $ra, $a1
+; LA64LARGE-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LA64LARGE-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LA64LARGE-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LA64LARGE-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LA64LARGE-NEXT:    ldx.d $ra, $ra, $a1
 ; LA64LARGE-NEXT:    jirl $ra, $ra, 0
 ; LA64LARGE-NEXT:    ld.w $zero, $a0, 0
 ; LA64LARGE-NEXT:    addi.w $s1, $s1, 1
diff --git a/llvm/test/CodeGen/LoongArch/psabi-restricted-scheduling.ll b/llvm/test/CodeGen/LoongArch/psabi-restricted-scheduling.ll
index a7873f466bee3f..c7de3dcf2ecfd2 100644
--- a/llvm/test/CodeGen/LoongArch/psabi-restricted-scheduling.ll
+++ b/llvm/test/CodeGen/LoongArch/psabi-restricted-scheduling.ll
@@ -102,11 +102,11 @@ define void @foo() nounwind {
 ; LARGE_NO_SCH-NEXT:    lu32i.d $a1, %got64_pc_lo20(gd)
 ; LARGE_NO_SCH-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(gd)
 ; LARGE_NO_SCH-NEXT:    add.d $a0, $a1, $a0
-; LARGE_NO_SCH-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    add.d $ra, $ra, $a1
+; LARGE_NO_SCH-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    ldx.d $ra, $ra, $a1
 ; LARGE_NO_SCH-NEXT:    jirl $ra, $ra, 0
 ; LARGE_NO_SCH-NEXT:    ld.d $zero, $a0, 0
 ; LARGE_NO_SCH-NEXT:    pcalau12i $a0, %ld_pc_hi20(ld)
@@ -114,11 +114,11 @@ define void @foo() nounwind {
 ; LARGE_NO_SCH-NEXT:    lu32i.d $a1, %got64_pc_lo20(ld)
 ; LARGE_NO_SCH-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(ld)
 ; LARGE_NO_SCH-NEXT:    add.d $a0, $a1, $a0
-; LARGE_NO_SCH-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LARGE_NO_SCH-NEXT:    add.d $ra, $ra, $a1
+; LARGE_NO_SCH-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LARGE_NO_SCH-NEXT:    ldx.d $ra, $ra, $a1
 ; LARGE_NO_SCH-NEXT:    jirl $ra, $ra, 0
 ; LARGE_NO_SCH-NEXT:    ld.d $zero, $a0, 0
 ; LARGE_NO_SCH-NEXT:    pcalau12i $a0, %ie_pc_hi20(ie)
@@ -158,11 +158,11 @@ define void @foo() nounwind {
 ; LARGE_SCH-NEXT:    lu32i.d $a1, %got64_pc_lo20(gd)
 ; LARGE_SCH-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(gd)
 ; LARGE_SCH-NEXT:    add.d $a0, $a1, $a0
-; LARGE_SCH-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LARGE_SCH-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LARGE_SCH-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LARGE_SCH-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LARGE_SCH-NEXT:    add.d $ra, $ra, $a1
+; LARGE_SCH-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LARGE_SCH-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LARGE_SCH-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LARGE_SCH-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LARGE_SCH-NEXT:    ldx.d $ra, $ra, $a1
 ; LARGE_SCH-NEXT:    jirl $ra, $ra, 0
 ; LARGE_SCH-NEXT:    ld.d $zero, $a0, 0
 ; LARGE_SCH-NEXT:    pcalau12i $a0, %ld_pc_hi20(ld)
@@ -170,11 +170,11 @@ define void @foo() nounwind {
 ; LARGE_SCH-NEXT:    lu32i.d $a1, %got64_pc_lo20(ld)
 ; LARGE_SCH-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(ld)
 ; LARGE_SCH-NEXT:    add.d $a0, $a1, $a0
-; LARGE_SCH-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LARGE_SCH-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LARGE_SCH-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LARGE_SCH-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LARGE_SCH-NEXT:    add.d $ra, $ra, $a1
+; LARGE_SCH-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LARGE_SCH-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LARGE_SCH-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LARGE_SCH-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LARGE_SCH-NEXT:    ldx.d $ra, $ra, $a1
 ; LARGE_SCH-NEXT:    jirl $ra, $ra, 0
 ; LARGE_SCH-NEXT:    ld.d $zero, $a0, 0
 ; LARGE_SCH-NEXT:    pcalau12i $a0, %ie_pc_hi20(ie)
diff --git a/llvm/test/CodeGen/LoongArch/tls-models.ll b/llvm/test/CodeGen/LoongArch/tls-models.ll
index 4ac6201fdd9d4c..dbd7bf6a81269c 100644
--- a/llvm/test/CodeGen/LoongArch/tls-models.ll
+++ b/llvm/test/CodeGen/LoongArch/tls-models.ll
@@ -55,11 +55,11 @@ define ptr @f1() nounwind {
 ; LA64LARGEPIC-NEXT:    lu32i.d $a1, %got64_pc_lo20(unspecified)
 ; LA64LARGEPIC-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(unspecified)
 ; LA64LARGEPIC-NEXT:    add.d $a0, $a1, $a0
-; LA64LARGEPIC-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    add.d $ra, $ra, $a1
+; LA64LARGEPIC-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    ldx.d $ra, $ra, $a1
 ; LA64LARGEPIC-NEXT:    jirl $ra, $ra, 0
 ; LA64LARGEPIC-NEXT:    ld.d $ra, $sp, 8 # 8-byte Folded Reload
 ; LA64LARGEPIC-NEXT:    addi.d $sp, $sp, 16
@@ -168,11 +168,11 @@ define ptr @f2() nounwind {
 ; LA64LARGEPIC-NEXT:    lu32i.d $a1, %got64_pc_lo20(ld)
 ; LA64LARGEPIC-NEXT:    lu52i.d $a1, $a1, %got64_pc_hi12(ld)
 ; LA64LARGEPIC-NEXT:    add.d $a0, $a1, $a0
-; LA64LARGEPIC-NEXT:    pcalau12i $a1, %pc_hi20(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    addi.d $ra, $zero, %pc_lo12(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    lu32i.d $ra, %pc64_lo20(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    lu52i.d $ra, $ra, %pc64_hi12(__tls_get_addr)
-; LA64LARGEPIC-NEXT:    add.d $ra, $ra, $a1
+; LA64LARGEPIC-NEXT:    pcalau12i $a1, %got_pc_hi20(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    addi.d $ra, $zero, %got_pc_lo12(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    lu32i.d $ra, %got64_pc_lo20(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    lu52i.d $ra, $ra, %got64_pc_hi12(__tls_get_addr)
+; LA64LARGEPIC-NEXT:    ldx.d $ra, $ra, $a1
 ; LA64LARGEPIC-NEXT:    jirl $ra, $ra, 0
 ; LA64LARGEPIC-NEXT:    ld.d $ra, $sp, 8 # 8-byte Folded Reload
 ; LA64LARGEPIC-NEXT:    addi.d $sp, $sp, 16
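
The test updates above all follow one pattern: the four relocation operators switch from the `%pc*` family to the `%got*` family, and the final `add.d` becomes `ldx.d`. The following C++ sketch reproduces that pattern as plain strings. `expandLargeCallAddress` is a hypothetical helper written for illustration only; it is not part of LLVM and does not model register allocation or scheduling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the five-instruction large-code-model address
// sequence for a call target, as seen in the LARGE check lines.
// Tmp is the scratch register that holds the pcalau12i result.
std::vector<std::string> expandLargeCallAddress(const std::string &Sym,
                                                bool UseGOT,
                                                const std::string &Tmp) {
  const std::string Hi20 = UseGOT ? "%got_pc_hi20" : "%pc_hi20";
  const std::string Lo12 = UseGOT ? "%got_pc_lo12" : "%pc_lo12";
  const std::string Lo20 = UseGOT ? "%got64_pc_lo20" : "%pc64_lo20";
  const std::string Hi12 = UseGOT ? "%got64_pc_hi12" : "%pc64_hi12";
  // With the GOT, the last step loads the function address from the GOT
  // slot (ldx.d); without it, the address is computed directly (add.d).
  const std::string Last = UseGOT ? "ldx.d" : "add.d";

  std::vector<std::string> Seq;
  Seq.push_back("pcalau12i " + Tmp + ", " + Hi20 + "(" + Sym + ")");
  Seq.push_back("addi.d $ra, $zero, " + Lo12 + "(" + Sym + ")");
  Seq.push_back("lu32i.d $ra, " + Lo20 + "(" + Sym + ")");
  Seq.push_back("lu52i.d $ra, $ra, " + Hi12 + "(" + Sym + ")");
  Seq.push_back(Last + " $ra, $ra, " + Tmp);
  return Seq;
}

// Compares the GOT form against the updated code-models.ll expectations.
void demoMemsetSequence() {
  std::vector<std::string> Seq = expandLargeCallAddress("memset", true, "$a3");
  assert(Seq.front() == "pcalau12i $a3, %got_pc_hi20(memset)");
  assert(Seq.back() == "ldx.d $ra, $ra, $a3");
}
```

Calling the helper with `UseGOT` set to false yields the pre-patch `%pc_hi20` ... `add.d` form instead.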

@wangleiat wangleiat requested a review from SixWeining November 21, 2024 02:17
@heiher heiher left a comment

LGTM. Thanks.

@wangleiat wangleiat merged commit 6377ae4 into main Nov 21, 2024
10 checks passed
@wangleiat wangleiat deleted the users/wangleiat/spr/loongarch-fix-got-usage-for-non-dso_local-function-calls-in-large-code-model branch November 21, 2024 08:52
wangleiat added a commit to wangleiat/llvm-project that referenced this pull request Nov 21, 2024
… code model

Cherry-picked from llvm#117099 to fix linker errors when building
shared libraries with the large code model.
tru pushed a commit to wangleiat/llvm-project that referenced this pull request Nov 25, 2024
… code model

Cherry-picked from llvm#117099 to fix linker errors when building
shared libraries with the large code model.