threading: fixup scheduler statepoints for GC #32238
Conversation
@yuyichao is this right/sufficient?
Which one is the fix? Lock-wise, there seem to be fewer safepoints in the loop?

There's now a statepoint (well, two) in every get_task_next call, instead of just one at the top of the loop. The missing one, per the issue comment, is the one just before the blocking uv_run call. The other part of the fix is ensuring that we don't run the Julia code callback from inside the region marked gc-safe.
src/partr.c
Outdated

```diff
@@ -481,16 +484,18 @@ JL_DLLEXPORT jl_task_t *jl_task_get_next(jl_value_t *getsticky)
    }
}
// the other threads will just wait for on signal to resume
int8_t gc_state = jl_gc_safe_enter(ptls);
uv_mutex_lock(&ptls->sleep_lock);
while (jl_atomic_load(&sleep_check_state) == sleeping) {
    task = get_next_task(getsticky);
```
This call also can't happen while holding the sleep lock (because it contains GC allocations and/or safepoints), or we would be unable to wake up all the other threads to get them to help with the GC effort.
OK, in this case, assuming get_next_task can call Julia code or trigger GC, then the uv_mutex_lock(&ptls->sleep_lock); must be done in a GC-safe region. Alternatively, all holders of the lock must not trigger any GC.
It makes sense to me that we ought to be able to check for available tasks without triggering GC. Of course, that's hard to guarantee from julia code.
We basically did not have a GC safepoint in some paths that could lead to a thread blocking. This should fix the issue noticed in #32217 (comment). Commits can be squashed.