Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Darwin/ARM64: Linker corrupts generated code #39820

Closed
Keno opened this issue Feb 25, 2021 · 10 comments · Fixed by #39891
Closed

Darwin/ARM64: Linker corrupts generated code #39820

Keno opened this issue Feb 25, 2021 · 10 comments · Fixed by #39891
Labels
system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips system:mac Affects only macOS upstream The issue is with an upstream dependency, e.g. LLVM

Comments

@Keno
Copy link
Member

Keno commented Feb 25, 2021

LLVM emits the following:

Lloh1699:
        adrp    x8, __MergedGlobals.1@PAGE
Lloh1700:
        add     x9, x8, __MergedGlobals.1@PAGEOFF
Ltmp3214:
        .loc    2 87 0
        add     x8, x22, #1
Ltmp3215:
        .loc    14 197 0
        ldr     x10, [x21, #8]
Ltmp3216:
        .loc    2 444 0
Lloh1701:
        ldr     x9, [x9]

LOH is what's called Linker-optimization-hint which is an aarch64-specific thing where the linker patches out some instructions (an explanation of how that works is here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64CollectLOH.cpp). However, after linking the instruction sequence is:

    0x108017f4c <+1004>: adrp   x8, 23520
    0x108017f50 <+1008>: nop    
    0x108017f54 <+1012>: add    x8, x22, #0x1             ; =0x1 
    0x108017f58 <+1016>: ldr    x10, [x21, #0x8]
->  0x108017f5c <+1020>: ldr    x9, [x8, #0x740]

looks to me like what happened here is that the relocation offset overflowed the size of the available immediate operand of the subsequent load, causing it to overflow into the source register field, causing crashes.

@Keno Keno added system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips system:mac Affects only macOS upstream The issue is with an upstream dependency, e.g. LLVM labels Feb 25, 2021
@Keno Keno mentioned this issue Feb 25, 2021
31 tasks
@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

For posterity, the linker in question here is ld64-607.2. I'm gonna check if this is fixed in a newer linker version.

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

Actually, the linker is fine here. The issue is the intervening assignment to x8, so the compiler shouldn't have asked for an loh.

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

Looks like the fix in https://reviews.llvm.org/D80834 needs to be extended to the adrp/add/ldr case also.

@aemerson
Copy link

Do you have an IR reproducer? I tried to re-construct the issue by using this MIR:

--- |
  @sym = local_unnamed_addr global i32 zeroinitializer, align 8

  define i32 @main() {
    ret i32 0
  }

...
---
name:            main
alignment:       4
tracksRegLiveness: true
liveins:
  - { reg: '$x22', virtual-reg: '' }
  - { reg: '$x21', virtual-reg: '' }
body:             |
  bb.0:
    liveins: $x21, $x22
    renamable $x8 = ADRP target-flags(aarch64-page) @sym
    renamable $x9 = ADDXri killed renamable $x8, target-flags(aarch64-pageoff, aarch64-nc) @sym, 0
    renamable $x8 = ADDXri killed renamable $x22, 1, 0
    $x10 = LDRXui $x21, 8
    $x9 = LDRXui $x9, 0
    RET undef $lr

...

but linking this I get a binary that has the last load becoming a literal load, which seems correct:

	nop
	nop
	add	x8, x22, #0x1
	ldr	x10, [x21, #0x40]       ; Latency: 4
	ldr	x9, #0x50               ; Latency: 4
	ret

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

Can you try making the address of @sym more than a page away from the pc of the load instruction? In the original example above, the linker had to retain the adrp, but could fold the offset into the subsequent load. I'll generate you an end-to-end reproducer in the meantime.

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

@aemerson Here's a full end-to-end example for you: https://gist.github.com/Keno/97c670bb659866ed1925081624fa1fb6. The .s is generated from the .ll in the gist using the command at the top (this is on LLVM11, but the issue reproduced on trunk also, though I didn't check this particular test case). You can see the problematic sequence here: https://gist.github.com/Keno/97c670bb659866ed1925081624fa1fb6#file-extract-loh-s-L476-L488

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

(Note this is only extracting the one function. If that turns out to not be big enough to reproduce this fully, I can send you the whole .bc file. It's about 100MB though.

@aemerson
Copy link

Ok, adding some very large padding in the assembly seems to have reproduced it. I now see:

_main:
	adrp	x8, 293 ; 0x10012c000
	nop
	add	x8, x22, #0x1
	ldr	x10, [x21, #0x40]       ; Latency: 4
	ldr	x9, [x8, #0xf80]        ; Latency: 4
	ret

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

Yep, that's the issue. The x8 will have been clobbered by the intervening add x8, x22, #0x1, so that'll crash at runtime.

@aemerson
Copy link

Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
aemerson added a commit to llvm/llvm-project that referenced this issue Mar 1, 2021
…bber of the

def of the adrp before the ldr.

Apparently this pass used to have liveness analysis but it was removed for
scompile time reasons. This workaround prevents the LOH from being emitted
unless the ADD and LDR are adjacent.

Fixes JuliaLang/julia#39820

Differential Revision: https://reviews.llvm.org/D97571
Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
Keno added a commit that referenced this issue Mar 2, 2021
Keno added a commit that referenced this issue Mar 2, 2021
morehouse pushed a commit to morehouse/llvm-project that referenced this issue Mar 4, 2021
…bber of the

def of the adrp before the ldr.

Apparently this pass used to have liveness analysis but it was removed for
scompile time reasons. This workaround prevents the LOH from being emitted
unless the ADD and LDR are adjacent.

Fixes JuliaLang/julia#39820

Differential Revision: https://reviews.llvm.org/D97571
vchuravy pushed a commit that referenced this issue Mar 30, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
KristofferC pushed a commit that referenced this issue Apr 4, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
KristofferC pushed a commit that referenced this issue Apr 4, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
vchuravy pushed a commit to JuliaLang/llvm-project that referenced this issue Oct 1, 2021
…e def of the adrp before the ldr.

Apparently this pass used to have liveness analysis but it was removed for scompile time reasons. This workaround prevents the LOH from being emitted unless the ADD and LDR are adjacent.

Fixes JuliaLang/julia#39820

Reviewed By: loladiro, qcolombet

Differential Revision: https://reviews.llvm.org/D97571
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
…bber of the

def of the adrp before the ldr.

Apparently this pass used to have liveness analysis but it was removed for
scompile time reasons. This workaround prevents the LOH from being emitted
unless the ADD and LDR are adjacent.

Fixes JuliaLang/julia#39820

Differential Revision: https://reviews.llvm.org/D97571
staticfloat pushed a commit that referenced this issue Dec 23, 2022
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips system:mac Affects only macOS upstream The issue is with an upstream dependency, e.g. LLVM
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants