Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DARWIN/ARM64: Bad codegen for sign extend #39823

Closed
Keno opened this issue Feb 25, 2021 · 6 comments · Fixed by #39891
Closed

DARWIN/ARM64: Bad codegen for sign extend #39823

Keno opened this issue Feb 25, 2021 · 6 comments · Fixed by #39891
Labels
system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips system:mac Affects only macOS

Comments

@Keno
Copy link
Member

Keno commented Feb 25, 2021

julia> function f(n)
       (n_positive, neg) = Base.split_sign(n); ccall(:jl_breakpoint, Cvoid, ())
       Base.dec(n_positive, 1, neg)
       end

LLVM IR:

;  @ REPL[9]:2 within `f'
; ┌ @ intfuncs.jl:743 within `split_sign'
; │┌ @ int.jl:172 within `abs'
; ││┌ @ int.jl:132 within `flipsign'
     %7 = ashr i32 %0, 31
     %8 = add i32 %0, %7
     %9 = xor i32 %8, %7
; │└└
; │┌ @ promotion.jl:360 within `<'
; ││┌ @ promotion.jl:291 within `promote'
; │││┌ @ promotion.jl:268 within `_promote'
; ││││┌ @ number.jl:7 within `convert'
; │││││┌ @ boot.jl:757 within `Int64'
; ││││││┌ @ boot.jl:676 within `toInt64'
         %10 = sext i32 %0 to i64

Assembly:

; │┌ @ intfuncs.jl:743 within `split_sign'
; ││┌ @ int.jl:172 within `abs'
; │││┌ @ int.jl:132 within `flipsign'
	asr	w10, w9, #31
	add	w11, w9, w10
	eor	w0, w11, w10
; ││└└
; ││┌ @ promotion.jl:360 within `<'
; │││┌ @ promotion.jl:291 within `promote'
; ││││┌ @ promotion.jl:268 within `_promote'
; │││││┌ @ number.jl:7 within `convert'
; ││││││┌ @ boot.jl:757 within `Int64'
; │││││││┌ @ boot.jl:676 within `toInt64'
	mov	x8, x9
; │││└└└└└
; │││ @ promotion.jl:360 within `<' @ int.jl:83
	cmp	x8, #0                          ; =0
	cset	w10, lt

w9 is not sign extended in the x9 register

@Keno Keno added system:mac Affects only macOS system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips labels Feb 25, 2021
@Keno Keno mentioned this issue Feb 25, 2021
31 tasks
@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

I guess another piece I should have mentioned is that it's spilling the register:

	str	w0, [sp, #8]                    ; 4-byte Folded Spill
	blr	x8
	ldr	w9, [sp, #8]                    ; 4-byte Folded Reload

If it had used the original x0, there wouldn't be an issue because the value comes in sign extended.

@Keno
Copy link
Member Author

Keno commented Feb 25, 2021

Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
Keno added a commit to JuliaPackaging/Yggdrasil that referenced this issue Mar 1, 2021
Includes patches for
JuliaLang/julia#39823
JuliaLang/julia#39820
JuliaLang/julia#39818
as well as an issue causing an assertion in
debug mode due to address spaces.
Keno added a commit that referenced this issue Mar 2, 2021
Keno added a commit that referenced this issue Mar 2, 2021
@aemerson
Copy link

aemerson commented Mar 4, 2021

I commented on the LLVM bug report, but wanted to ask here as well: what's stopping adoption of the default instruction selector at -O0 (GlobalISel) over using FastISel?

@StefanKarpinski
Copy link
Member

My guess would be speed, but perhaps @Keno has a more specific reason.

@Keno
Copy link
Member Author

Keno commented Mar 23, 2021

No, there isn't any good reason other than that nobody has tried it and measured the impact.

@aemerson
Copy link

In that case I'd recommend that Julia just use the default selector, since GlobalISel is actively maintained.

vchuravy pushed a commit that referenced this issue Mar 30, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
KristofferC pushed a commit that referenced this issue Apr 4, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
KristofferC pushed a commit that referenced this issue Apr 4, 2021
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
staticfloat pushed a commit that referenced this issue Dec 23, 2022
Fixes #39818
Fixes #39820
Fixes #39823

(cherry picked from commit 0988fcf)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips system:mac Affects only macOS
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants