Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LateGCLowering] Fix skipped Select lifting #35387

Merged
merged 1 commit into from
Apr 8, 2020
Merged

[LateGCLowering] Fix skipped Select lifting #35387

merged 1 commit into from
Apr 8, 2020

Conversation

Keno
Copy link
Member

@Keno Keno commented Apr 7, 2020

Fixes #35341. This function was skipping the lift of the select if one of the bases was null. In this situation, the caller would assign it Number -1, which means globally rooted and is obviously wrong here. I'm not entirely sure why that code was added, but perhaps @vtjnash could comment what the intention was and perhaps we can maintain whatever optimization that was intended to perform (even if in practice all it did was prevent rooting of select results).

@Keno Keno requested a review from vtjnash April 7, 2020 03:10
@JeffBezanson JeffBezanson added bugfix This change fixes an existing bug multithreading Base.Threads and related functionality compiler:codegen Generation of LLVM IR and native code labels Apr 7, 2020
@chriselrod
Copy link
Contributor

I checked out this PR, and still get frequent hangs. @tkf , can you try?

@Keno
Copy link
Member Author

Keno commented Apr 7, 2020

There may be more than one issue. I'll take rr traces to debug ;).

@tkf
Copy link
Member

tkf commented Apr 7, 2020

I've been running four instances of ASAN_OPTIONS=detect_leaks=0:allow_user_segv_handler=1:abort_on_error=1 ASAN_SYMBOLIZER_PATH=$tooldir/llvm-symbolizer usr/bin/julia-debug -e 'using BenchmarkTools, ThreadsX, Random; for seed in 1:1000; @btime ThreadsX.sort($(rand(MersenneTwister(@show(seed)), 0:0.01:1, 1_000_000))); end' for a while now (more than 600 iterations) and there is no failures. So I think this PR fixes the memory bug related to #35341.

@chriselrod Thanks a lot for verifying the original problem. I guess the next step is to apply this PR to pre-#32599 code and then check if the hang or segfault happens. If not, we then know some parts of #32599 introduced the problem... I'll try it later.

@chriselrod
Copy link
Contributor

Because you seem already set up, I'm hoping it'd not be too much trouble for you to run another example.

Here is some code that seems to reproduce the problem fairly reliably on my machine, without dependencies outside base or the standard library, other than BenchmarkTools (although a foreach would probably work too):

using Base.Threads: @spawn, @sync, nthreads
using LinearAlgebra, BenchmarkTools

function tproduct_fm(x)
    @fastmath begin
        X3 = cbrt.(x)
        X9 = cbrt.(X3).^2
    end
    lx = length(x)
    H = Matrix{eltype(x)}(undef, lx, lx)
    @sync for x_index_chunk in Iterators.partition(eachindex(x), lx ÷ 2nthreads()) 
        @spawn begin
            for i  x_index_chunk
                x3 = X3[i]
                x9 = X9[i]
                js = firstindex(x):i
                       @fastmath for j in js
                           A = x3 + X3[j]
                           B = x3 * X3[j]
                           C = x9 + X9[j]
                           f = A^2 * C
                           g = -(B*A)
                           H[j, i] = f * g
                       end
            end
        end
    end
    return Symmetric(H)
end
x = rand(1000);
@benchmark tproduct_fm($x)

Hopefully all the issues will get fixed soon, now that the two of you seem well set up to find them.

tkf, what does it take to get everything set up to create the rr traces?

@Keno Keno merged commit 7a4ea21 into master Apr 8, 2020
@Keno Keno deleted the kf/35341 branch April 8, 2020 18:55
@tkf
Copy link
Member

tkf commented Apr 9, 2020

@Keno According to #35341 (comment), this bug is due to #33389 which seems to be included in 1.4 release. Does it make sense to backport this to 1.4?

@Keno
Copy link
Member Author

Keno commented Apr 9, 2020

yes

@tkf
Copy link
Member

tkf commented Apr 9, 2020

@Keno I have a few questions:

@chriselrod If the answer to the last two questions are no, I think you can just install rr and run rr record --chaos ./julia -e 'using BenchmarkTools, ThreadsX, Random; for seed in 1:100; @btime ThreadsX.sort($(rand(MersenneTwister(@show(seed)), 0:0.01:1, 1_000_000))); end' (or run the code/script you just mentioned).

Also, if @Keno needs julia-debug, I think the first step for this direction is to make sure we can get the deadlock with julia-debug. (I'm naively assuming that julia-debug has some overhead that may alter how tasks are scheduled.)

@Keno
Copy link
Member Author

Keno commented Apr 9, 2020

  • Do you need rr trace with julia-debug? If we can't reproduce the bug with julia-debug, does it still make sense to rr record on plain julia?

Either is fine. julia-debug is marginally easier for me to debug.

Does it make sense to use the default LLVM 9 if we are not using ASAN + rr? Is it still better to use LLVM 10? IIUC that was for rr+ASAN. #35341 (comment)

Whatever reproduces the bug is fine.

Does it still make sense to use rr master? Especially to include your patch rr-debugger/rr#2488. I don't know if it was ASAN-specific.

Yes, please.

I'll have an updated rr jll shortly, which will simply the installation.

@tkf
Copy link
Member

tkf commented Apr 9, 2020

I'll have an updated rr jll shortly

Wow. I didn't know that I can install rr via jll. This is great.

KristofferC pushed a commit that referenced this pull request Apr 10, 2020
@KristofferC KristofferC mentioned this pull request Apr 10, 2020
27 tasks
ztultrebor pushed a commit to ztultrebor/julia that referenced this pull request Apr 17, 2020
staticfloat pushed a commit that referenced this pull request Apr 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug compiler:codegen Generation of LLVM IR and native code multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rare deadlock and segmentation fault in a parallel quicksort implementation (with an MWE)
5 participants