Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ore: make halt! use _exit to avoid exit handler races #27705

Merged
merged 2 commits into from
Jun 18, 2024

Conversation

teskje
Copy link
Contributor

@teskje teskje commented Jun 17, 2024

This PR fixes a soundness hole introduced by multiple threads calling halt! at the same time, thereby invoking the thread-unsafe libc exit function concurrently, which would lead to use-after-frees and resulting segfaults. The fix is to call _exit instead, which is thread-safe.

We can consider reverting this once rust-lang/rust#126600 is resolved.

Motivation

  • This PR fixes a recognized bug.

Fixes MaterializeInc/database-issues#6528

Checklist

@teskje teskje changed the title ci: re-enable skipped release qualification test ore: make halt! use _exit to avoid exit handler races Jun 17, 2024
@teskje
Copy link
Contributor Author

teskje commented Jun 17, 2024

"Longer Zippy ClusterReplicas" still fails, but not with a segfault anymore. Instead it seems to run into a thread limit, which likely unrelated.

///
/// This function exists to avoid that all callers of `halt!` have to explicitly depend on `libc`.
pub fn halt() -> ! {
unsafe { libc::_exit(2) };
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO(@teskje) some documentation about the use of _exit here

Copy link
Contributor

@benesch benesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo your TODO to add a comment about _exit. Really nice find on this one! 🙇🏽

@teskje teskje marked this pull request as ready for review June 17, 2024 16:17
@teskje teskje requested review from aljoscha, danhhz, jkosh44 and a team as code owners June 17, 2024 16:17
@teskje
Copy link
Contributor Author

teskje commented Jun 17, 2024

@MaterializeInc/testing The second PR re-enables the "Longer Zippy ClusterReplicas" test in the Release Qualification pipeline. Given that it failed once with an unrelated error, should I drop that commit? If I do the "references to already closed issues" check might start triggering on this.

@jkosh44 jkosh44 removed request for a team and jkosh44 June 17, 2024 17:12
@nrainer-materialize
Copy link
Contributor

@MaterializeInc/testing The second PR re-enables the "Longer Zippy ClusterReplicas" test in the Release Qualification pipeline. Given that it failed once with an unrelated error, should I drop that commit? If I do the "references to already closed issues" check might start triggering on this.

Go ahead, feel free to re-enable that check. We can disable it again if we see that further causes make it fail.

@teskje teskje merged commit ac649aa into MaterializeInc:main Jun 18, 2024
192 of 197 checks passed
@teskje teskje deleted the fix-halt-race branch June 18, 2024 08:18
@teskje teskje mentioned this pull request Jun 18, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants