Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

staticdata: Close data race after backedge insertion #57229

Merged
merged 2 commits into from
Feb 6, 2025
Merged

Conversation

Keno
Copy link
Member

@Keno Keno commented Feb 1, 2025

Addresses review comment in #57212 (comment). The key is that the hand-off of responsibility for verification between the loading code and the ordinary backedge mechanism happens under the world counter lock to ensure synchronization.

@Keno Keno requested a review from vtjnash February 1, 2025 00:15
@Keno Keno force-pushed the kf/staticdatarace branch 2 times, most recently from bbea4fc to fc65ba4 Compare February 1, 2025 00:46
@oscardssmith oscardssmith added multithreading Base.Threads and related functionality bugfix This change fixes an existing bug labels Feb 1, 2025
Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems great. Maybe @aviatesk would have some time to make this same correction to Compiler.finish! also now? Since that similarly should also hold this same lock (by calling jl_promote_ci_to_current) instead of its current implementation strategy of relying on there being only a small window of time for this race that occurs.

@aviatesk
Copy link
Member

aviatesk commented Feb 3, 2025

This seems great. Maybe @aviatesk would have some time to make this same correction to Compiler.finish! also now? Since that similarly should also hold this same lock (by calling jl_promote_ci_to_current) instead of its current implementation strategy of relying on there being only a small window of time for this race that occurs.

I will work on it.

aviatesk added a commit that referenced this pull request Feb 3, 2025
Similar to #57229, this commit ensures that
`Compiler.finish!` properly synchronizes the operations to set
`max_world` for cached `CodeInstance`s by holding the world counter
lock. Previously, `Compiler.finish!` relied on a narrow timing window to
avoid race conditions, which was not a robust approach in a concurrent
execution environment.

This change ensures that `Compiler.finish!` holds the appropriate lock
(via `jl_promote_ci_to_current`).
aviatesk added a commit that referenced this pull request Feb 4, 2025
Similar to #57229, this commit ensures that
`Compiler.finish!` properly synchronizes the operations to set
`max_world` for cached `CodeInstance`s by holding the world counter
lock. Previously, `Compiler.finish!` relied on a narrow timing window to
avoid race conditions, which was not a robust approach in a concurrent
execution environment.

This change ensures that `Compiler.finish!` holds the appropriate lock
(via `jl_promote_ci_to_current`).
@Keno Keno added the backport 1.12 Change should be backported to release-1.12 label Feb 4, 2025
@KristofferC KristofferC mentioned this pull request Feb 4, 2025
32 tasks
Addresses review comment in #57212 (comment).
The key is that the hand-off of responsibility for verification
between the loading code and the ordinary backedge mechanism happens
under the world counter lock to ensure synchronization.
@Keno Keno force-pushed the kf/staticdatarace branch from fc65ba4 to 16f48a0 Compare February 6, 2025 02:40
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
@vtjnash vtjnash merged commit 34aceb5 into master Feb 6, 2025
7 checks passed
@vtjnash vtjnash deleted the kf/staticdatarace branch February 6, 2025 19:31
aviatesk added a commit that referenced this pull request Feb 7, 2025
Similar to #57229, this commit ensures that
`Compiler.finish!` properly synchronizes the operations to set
`max_world` for cached `CodeInstance`s by holding the world counter
lock. Previously, `Compiler.finish!` relied on a narrow timing window to
avoid race conditions, which was not a robust approach in a concurrent
execution environment.

This change ensures that `Compiler.finish!` holds the appropriate lock
(via `jl_promote_ci_to_current`).
aviatesk added a commit that referenced this pull request Feb 8, 2025
Similar to #57229, this commit ensures that
`Compiler.finish!` properly synchronizes the operations to set
`max_world` for cached `CodeInstance`s by holding the world counter
lock. Previously, `Compiler.finish!` relied on a narrow timing window to
avoid race conditions, which was not a robust approach in a concurrent
execution environment.

This change ensures that `Compiler.finish!` holds the appropriate lock
(via `jl_promote_ci_to_current`).
KristofferC pushed a commit that referenced this pull request Feb 11, 2025
Addresses review comment in
#57212 (comment).
The key is that the hand-off of responsibility for verification between
the loading code and the ordinary backedge mechanism happens under the
world counter lock to ensure synchronization.

(cherry picked from commit 34aceb5)
KristofferC pushed a commit that referenced this pull request Feb 11, 2025
Similar to #57229, this commit ensures that
`Compiler.finish!` properly synchronizes the operations to set
`max_world` for cached `CodeInstance`s by holding the world counter
lock. Previously, `Compiler.finish!` relied on a narrow timing window to
avoid race conditions, which was not a robust approach in a concurrent
execution environment.

This change ensures that `Compiler.finish!` holds the appropriate lock
(via `jl_promote_ci_to_current`).

(cherry picked from commit 4ebb50b)
KristofferC added a commit that referenced this pull request Feb 13, 2025
Backported PRs:
- [x] #57142 <!-- Add reference to time_ns in time -->
- [x] #57241 <!-- Handle `waitpid` race condition when `SIGCHLD` is set
to `SIG_IGN` -->
- [x] #57249 <!-- restore non-freebsd-unix fix for profiling -->
- [x] #57211 <!-- Ensure read/readavailable for BufferStream are
threadsafe -->
- [x] #57262 <!-- edit NEWS for v1.12 -->
- [x] #57226 <!-- cfunction: reimplement, as originally planned, for
reliable performance -->
- [x] #57253 <!-- bpart: Fully switch to partitioned semantics -->
- [x] #57273 <!-- fix "Right arrow autocompletes at line end"
implementation -->
- [x] #57280 <!-- dep: Update JuliaSyntax -->
- [x] #57229 <!-- staticdata: Close data race after backedge insertion
-->
- [x] #57298 <!-- Updating binding version to fix MMTk CI -->
- [x] #57248 <!-- improve concurrency safety for `Compiler.finish!` -->
- [x] #57312 <!-- Profile.print: de-focus sleeping frames as gray -->
- [x] #57289 <!-- Make `OncePerX` subtype `Function` -->
- [x] #57310 <!-- Make ptls allocations at least 128 byte aligned -->
- [x] #57311 <!-- Add a warning for auto-import of types -->
- [x] #57338 <!-- fix typo in Float32 random number generation -->
- [x] #57293 <!-- Fix getfield_tfunc when order or boundscheck is Vararg
-->
- [x] #57349 <!-- docs: fix-up world-age handling for META access -->
- [x] #57344 <!-- Add missing type asserts when taking the queue out of
the task struct -->
- [x] #57348 <!-- 🤖 [master] Bump the SparseArrays stdlib from 212981b
to 72c7cac -->
- [x] #55040 <!-- Allow macrocall as function sig -->
- [x] #57299 <!-- Add missing latestworld after parameterized type alias
-->
@KristofferC KristofferC removed the backport 1.12 Change should be backported to release-1.12 label Feb 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants