forked from JuliaLang/julia
Mapreducedim init adienes testing #1
Draft
adienes wants to merge 21 commits into master from mapreducedim-init-adienes-testing
+638
−321
the custom allocators are needed for Sparse Arrays
adienes pushed a commit that referenced this pull request on Feb 12, 2025
At some point LLVM on MacOS started doing frame pointer optimization by default. We should ask for a frame pointer on every function, on all platforms.

Prior to this change, on `1.11.3+0.aarch64.apple.darwin14`:

```
julia> @code_native ((x,y) -> Core.Intrinsics.add_float(x,y))(1.0,2.0)
        .section        __TEXT,__text,regular,pure_instructions
        .build_version macos, 15, 0
        .globl  "_julia_#1_678"                 ; -- Begin function julia_#1_678
        .p2align        2
"_julia_#1_678":                        ; @"julia_#1_678"
; Function Signature: var"#1"(Float64, Float64)
; ┌ @ REPL[1]:1 within `#1`
; %bb.0:                                ; %top
; │ @ REPL[1] within `#1`
        ;DEBUG_VALUE: #1:x <- $d0
        ;DEBUG_VALUE: #1:x <- $d0
        ;DEBUG_VALUE: #1:y <- $d1
        ;DEBUG_VALUE: #1:y <- $d1
; │ @ REPL[1]:1 within `#1`
        fadd    d0, d0, d1
        ret
; └
                                        ; -- End function
        .section        __DATA,__const
        .p2align        3, 0x0                          ; @"+Core.Float64#680"
"l_+Core.Float64#680":
        .quad   "l_+Core.Float64#680.jit"

.set "l_+Core.Float64#680.jit", 5490712608
.subsections_via_symbols
```

Prior to this change, on `1.11.3+0.aarch64.linux.gnu`:

```
julia> @code_native ((x,y) -> Core.Intrinsics.add_float(x,y))(1.0,2.0)
        .text
        .file   "#1"
        .globl  "julia_#1_656"                  // -- Begin function julia_#1_656
        .p2align        2
        .type   "julia_#1_656",@function
"julia_#1_656":                         // @"julia_#1_656"
; Function Signature: var"#1"(Float64, Float64)
; ┌ @ REPL[1]:1 within `#1`
// %bb.0:                               // %top
; │ @ REPL[1] within `#1`
        //DEBUG_VALUE: #1:x <- $d0
        //DEBUG_VALUE: #1:x <- $d0
        //DEBUG_VALUE: #1:y <- $d1
        //DEBUG_VALUE: #1:y <- $d1
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
; │ @ REPL[1]:1 within `#1`
        fadd    d0, d0, d1
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
.Lfunc_end0:
        .size   "julia_#1_656", .Lfunc_end0-"julia_#1_656"
; └
                                        // -- End function
        .type   ".L+Core.Float64#658",@object   // @"+Core.Float64#658"
        .section        .rodata,"a",@progbits
        .p2align        3, 0x0
".L+Core.Float64#658":
        .xword  ".L+Core.Float64#658.jit"
        .size   ".L+Core.Float64#658", 8

.set ".L+Core.Float64#658.jit", 278205186835760
        .size   ".L+Core.Float64#658.jit", 8
        .section        ".note.GNU-stack","",@progbits
```
adienes pushed a commit that referenced this pull request on Feb 26, 2025
…aLang#55600)

As an application of JuliaLang#55545, this commit avoids the insertion of `:throw_undef_if_not` nodes when the defined-ness of a slot is guaranteed by abstract interpretation.

```julia
julia> function isdefined_nothrow(c, x)
           local val
           if c
               val = x
           end
           if @isdefined val
               return val
           end
           return zero(Int)
       end;

julia> @code_typed isdefined_nothrow(true, 42)
```

```diff
diff --git a/old b/new
index c4980a5c9c..3d1d6d30f0 100644
--- a/old
+++ b/new
@@ -4,7 +4,6 @@ CodeInfo(
 3 ┄ %3 = φ (#2 => x, #1 => #undef)::Int64
 │   %4 = φ (#2 => true, #1 => false)::Bool
 └──      goto #5 if not %4
-4 ─      $(Expr(:throw_undef_if_not, :val, :(%4)))::Any
-└──      return %3
+4 ─      return %3
 5 ─      return 0
 ) => Int64
```
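
For readers less familiar with the check being elided, the following is a minimal sketch of the runtime behaviour involved. The function names `unguarded` and `guarded` are hypothetical and only serve as illustration. Reading a local that may be unassigned raises `UndefVarError`, while a read guarded by `@isdefined` cannot, which is the case where the commit message says the `:throw_undef_if_not` node is no longer inserted.

```julia
# Hypothetical helper: `val` is assigned on only one branch,
# so the unguarded read can fail at run time.
function unguarded(c, x)
    local val
    if c
        val = x
    end
    return val  # throws UndefVarError when `c` is false
end

# Same shape as `isdefined_nothrow` in the commit message:
# the read of `val` only happens after `@isdefined` confirmed it.
function guarded(c, x)
    local val
    if c
        val = x
    end
    if @isdefined val
        return val
    end
    return zero(Int)
end

guarded(true, 42)   # takes the `return val` path
guarded(false, 42)  # falls through to `zero(Int)`
```

Comparing `@code_typed guarded(true, 42)` before and after the change should show the difference quoted in the diff above, although the exact SSA numbering may vary between Julia versions.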
adienes pushed a commit that referenced this pull request on Feb 26, 2025
Prior to this, especially on macOS, the gc-safepoint here would cause the process to segfault as we had already freed the current_task state. Rearrange this code so that the GC interactions (except for the atomic store to current_task) are all handled before entering GC safe, and then signaling the thread is deleted (via setting current_task = NULL, published by jl_unlock_profile_wr to other threads) is last.

```
ERROR: Exception handler triggered on unmanaged thread.
Process 53827 stopped
* thread #5, stop reason = EXC_BAD_ACCESS (code=2, address=0x100018008)
    frame #0: 0x0000000100b74344 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_set(ptls=0x000000011f8b3200, state='\x02', old_state=<unavailable>) at julia_threads.h:272:9 [opt]
   269      assert(old_state != JL_GC_CONCURRENT_COLLECTOR_THREAD);
   270      jl_atomic_store_release(&ptls->gc_state, state);
   271      if (state == JL_GC_STATE_UNSAFE || old_state == JL_GC_STATE_UNSAFE)
-> 272          jl_gc_safepoint_(ptls);
   273      return old_state;
   274  }
   275  STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls,
Target 0: (julia) stopped.
(lldb) up
frame #1: 0x0000000100b74320 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_save_and_set(ptls=0x000000011f8b3200, state='\x02') at julia_threads.h:278:12 [opt]
   275  STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls,
   276                                                int8_t state)
   277  {
-> 278      return jl_gc_state_set(ptls, state, jl_atomic_load_relaxed(&ptls->gc_state));
   279  }
   280  #ifdef __clang_gcanalyzer__
   281  // these might not be a safepoint (if they are no-op safe=>safe transitions), but we have to assume it could be (statically)
(lldb)
frame #2: 0x0000000100b7431c libjulia-internal.1.12.0.dylib`jl_delete_thread(value=0x000000011f8b3200) at threading.c:537:11 [opt]
   534      ptls->root_task = NULL;
   535      jl_free_thread_gc_state(ptls);
   536      // then park in safe-region
-> 537      (void)jl_gc_safe_enter(ptls);
   538  }
```

(test incorporated into JuliaLang#55793)
adienes pushed a commit that referenced this pull request on Feb 26, 2025
Rebase and extension of @alexfanqi's initial work on porting Julia to RISC-V. Requires LLVM 19.

Tested on a VisionFive2, built with:

```make
MARCH := rv64gc_zba_zbb
MCPU := sifive-u74

USE_BINARYBUILDER:=0

DEPS_GIT = llvm
override LLVM_VER=19.1.1
override LLVM_BRANCH=julia-release/19.x
override LLVM_SHA1=julia-release/19.x
```

```julia-repl
❯ ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.1374 (2024-10-14)
 _/ |\__'_|_|_|\__'_|  |  riscv/25092a3982* (fork: 1 commits, 0 days)
|__/                   |

julia> versioninfo(; verbose=true)
Julia Version 1.12.0-DEV.1374
Commit 25092a3* (2024-10-14 09:57 UTC)
Platform Info:
  OS: Linux (riscv64-unknown-linux-gnu)
  uname: Linux 6.11.3-1-riscv64 #1 SMP Debian 6.11.3-1 (2024-10-10) riscv64 unknown
  CPU: unknown:
              speed         user         nice          sys         idle          irq
       #1  1500 MHz        922 s          0 s        265 s     160953 s          0 s
       #2  1500 MHz        457 s          0 s        280 s     161521 s          0 s
       #3  1500 MHz        452 s          0 s        270 s     160911 s          0 s
       #4  1500 MHz        638 s         15 s        301 s     161340 s          0 s
  Memory: 7.760246276855469 GB (7474.08203125 MB free)
  Uptime: 16260.13 sec
  Load Avg:  0.25  0.23  0.1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.1 (ORCJIT, sifive-u74)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  HOME = /home/tim
  PATH = /home/tim/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
  TERM = xterm-256color

julia> ccall(:jl_dump_host_cpu, Nothing, ())
CPU: sifive-u74
Features: +zbb,+d,+i,+f,+c,+a,+zba,+m,-zvbc,-zksed,-zvfhmin,-zbkc,-zkne,-zksh,-zfh,-zfhmin,-zknh,-v,-zihintpause,-zicboz,-zbs,-zvknha,-zvksed,-zfa,-ztso,-zbc,-zvknhb,-zihintntl,-zknd,-zvbb,-zbkx,-zkt,-zvkt,-zicond,-zvksh,-zvfh,-zvkg,-zvkb,-zbkb,-zvkned

julia> @code_native debuginfo=:none 1+2.
        .text
        .attribute      4, 16
        .attribute      5, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zmmul1p0_zba1p0_zbb1p0"
        .file   "+"
        .globl  "julia_+_3003"
        .p2align        1
        .type   "julia_+_3003",@function
"julia_+_3003":
        addi    sp, sp, -16
        sd      ra, 8(sp)
        sd      s0, 0(sp)
        addi    s0, sp, 16
        fcvt.d.l        fa5, a0
        ld      ra, 8(sp)
        ld      s0, 0(sp)
        fadd.d  fa0, fa5, fa0
        addi    sp, sp, 16
        ret
.Lfunc_end0:
        .size   "julia_+_3003", .Lfunc_end0-"julia_+_3003"

        .type   ".L+Core.Float64#3005",@object
        .section        .data.rel.ro,"aw",@progbits
        .p2align        3, 0x0
".L+Core.Float64#3005":
        .quad   ".L+Core.Float64#3005.jit"
        .size   ".L+Core.Float64#3005", 8

.set ".L+Core.Float64#3005.jit", 272467692544
        .size   ".L+Core.Float64#3005.jit", 8
        .section        ".note.GNU-stack","",@progbits
```

Lots of bugs guaranteed, but with this we at least have a functional build and REPL for further development by whoever is interested.

Also requires Linux 6.4+, since the fallback processor detection used here relies on LLVM's `sys::getHostCPUFeatures`, which for RISC-V is implemented using hwprobe introduced in 6.4. We could probably add a fallback that parses `/proc/cpuinfo`, either by building a CPU database much like how we've done for AArch64, or by parsing the actual ISA string contained there. That would probably also be a good place to add support for profiles, which are supposedly the way forward to package RISC-V binaries. That can happen in follow-up PRs though. For now, on older kernels, use the `-C` arg to Julia to specify an ISA.

Co-authored-by: Alex Fan <alex.fan.q@gmail.com>
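
The `/proc/cpuinfo` fallback suggested in the commit message could look roughly like the sketch below. This is not part of the commit or of Julia: `parse_riscv_isa` and `isa_from_cpuinfo` are hypothetical names, and the code assumes an unversioned, lowercase ISA string such as `rv64imafdc_zicsr_zba_zbb` on an `isa : ...` line. Versioned strings (for example `rv64i2p1_m2p0`) and the expansion of the `g` shorthand are deliberately not handled.

```julia
# Hypothetical helper: split an unversioned RISC-V ISA string into the
# base width (32 or 64) and a list of extension names.
function parse_riscv_isa(isa_str::AbstractString)
    m = match(r"^rv(32|64)([a-z]*)((?:_[a-z0-9]+)*)$", lowercase(strip(isa_str)))
    m === nothing && throw(ArgumentError("unrecognized ISA string: $isa_str"))
    xlen = parse(Int, m.captures[1])
    # Single-letter extensions directly follow the base width.
    exts = String[string(c) for c in m.captures[2]]
    # Multi-letter extensions are separated by underscores.
    append!(exts, split(m.captures[3], '_'; keepempty=false))
    return xlen, exts
end

# Hypothetical helper: return the first `isa` field found in
# cpuinfo-style text, or `nothing` if there is none.
function isa_from_cpuinfo(text::AbstractString)
    for line in split(text, '\n')
        m = match(r"^isa\s*:\s*(\S+)", line)
        m === nothing || return String(m.captures[1])
    end
    return nothing
end

# Example input; the field layout is an assumption for illustration.
sample = "processor\t: 0\nhart\t\t: 1\nisa\t\t: rv64imafdc_zicsr_zba_zbb\n"
xlen, exts = parse_riscv_isa(isa_from_cpuinfo(sample))
```

A real fallback would still need to map the extension names onto LLVM feature strings, which this sketch leaves out.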
adienes pushed a commit that referenced this pull request on Feb 26, 2025
…uliaLang#56300)

The pipeline-prints test currently fails when running on an aarch64-macos device:

```
/Users/tim/Julia/src/julia/test/llvmpasses/pipeline-prints.ll:309:23: error: AFTERVECTORIZATION: expected string not found in input
; AFTERVECTORIZATION: vector.body
                      ^
<stdin>:2:40: note: scanning from here
; *** IR Dump Before AfterVectorizationMarkerPass on julia_f_199 ***
                                       ^
<stdin>:47:27: note: possible intended match here
; *** IR Dump Before AfterVectorizationMarkerPass on jfptr_f_200 ***
                          ^

Input file: <stdin>
Check file: /Users/tim/Julia/src/julia/test/llvmpasses/pipeline-prints.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             1: opt: WARNING: failed to create target machine for 'x86_64-unknown-linux-gnu': unable to get target for 'x86_64-unknown-linux-gnu', see --version and --triple.
             2: ; *** IR Dump Before AfterVectorizationMarkerPass on julia_f_199 ***
check:309'0                                            X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
             3: define i64 @julia_f_199(ptr addrspace(10) noundef nonnull align 16 dereferenceable(40) %0) #0 !dbg !4 {
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             4: top:
check:309'0     ~~~~~
             5:   %1 = call ptr @julia.get_pgcstack()
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             6:   %ptls_field = getelementptr inbounds ptr, ptr %1, i64 2
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             7:   %ptls_load45 = load ptr, ptr %ptls_field, align 8, !tbaa !8
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             .
             .
             .
            42:
check:309'0     ~
            43: L41: ; preds = %L41.loopexit, %L17, %top
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            44:   %value_phi10 = phi i64 [ 0, %top ], [ %7, %L17 ], [ %.lcssa, %L41.loopexit ]
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            45:   ret i64 %value_phi10, !dbg !52
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            46: }
check:309'0     ~~
            47: ; *** IR Dump Before AfterVectorizationMarkerPass on jfptr_f_200 ***
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:309'1                               ?                                           possible intended match
            48: ; Function Attrs: noinline optnone
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            49: define nonnull ptr addrspace(10) @jfptr_f_200(ptr addrspace(10) %0, ptr noalias nocapture noundef readonly %1, i32 %2) #1 {
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            50: top:
check:309'0     ~~~~~
            51:   %3 = call ptr @julia.get_pgcstack()
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            52:   %4 = getelementptr inbounds ptr addrspace(10), ptr %1, i32 0
check:309'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             .
             .
             .
>>>>>>

--

********************
Failed Tests (1):
  Julia :: pipeline-prints.ll
```

The problem is that these tests assume x86_64. On this machine that target isn't available, so `opt` presumably falls back to the native target, which has different vectorization characteristics:

```
❯ ./usr/tools/opt --load-pass-plugin=libjulia-codegen.dylib -passes='julia' --print-before=AfterVectorization -o /dev/null ../../test/llvmpasses/pipeline-prints.ll
./usr/tools/opt: WARNING: failed to create target machine for 'x86_64-unknown-linux-gnu': unable to get target for 'x86_64-unknown-linux-gnu', see --version and --triple.
```

There are other tests that assume this (e.g. the `fma` cpufeatures one), but they don't fail, so I've left them as is.
No description provided.