Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unix PAL POSIX Mutex lock should check its return codes for soundness #120147

Closed
cbiffle opened this issue Jan 19, 2024 · 2 comments · Fixed by #120238
Closed

Unix PAL POSIX Mutex lock should check its return codes for soundness #120147

cbiffle opened this issue Jan 19, 2024 · 2 comments · Fixed by #120238
Labels
A-thread Area: `std::thread` C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-unix Operating system: Unix-like T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@cbiffle
Copy link
Contributor

cbiffle commented Jan 19, 2024

Hi! We've run into a subtle Mutex bug on a POSIX platform, and it's led me to propose a small change to the pthread_mutex-based PAL code.

Shout-out to @papertigers for noticing this and coming up with a reduction test case.

Details and background

The current lock routine, for instance, for the generic non-Futex Unix PAL is here:

let r = libc::pthread_mutex_lock(raw(self));
debug_assert_eq!(r, 0);

Since we're in std, that debug_assert_eq will be compiled out in almost all cases. This presents a soundness problem on non-Futex POSIX platforms where a non-error-checked pthread_mutex_lock call may fail. Since the PAL is specifying a PTHREAD_MUTEX_NORMAL (a non-error-checked mutex), you might expect lock to be unable to fail.

However, POSIX platforms in the wild have interpreted pthread_mutex_lock's semantics more loosely, and provide deadlock detection on lock for non-checked mutexes. This specifically affects versions of Solaris dating back to before the open source release (the point at which my ability to follow history ends), and various Solaris-descended operating systems, including Illumos (where this was noticed).

This means that on Solaris, Illumos, and kin, the current Rust Mutex implementation is unsound, because it can be used to generate aliasing &muts:

let m1 = Mutex::new();

let mut lock1 = m1.lock().unwrap();
// On a strict POSIX compliant platform this next line will hang.
// On Solaris-derived kernels it will instead succeed.
let mut lock2 = m1.lock().unwrap();

*lock1 = 42;
*lock2 = 0;

// We're well into UB land here, making these statements
// difficult to predict:
println!("{}", *lock1);
println!("{}", *lock2);

To be clear -- my interpretation of the POSIX standard says that these platforms are violating it. We have filed a bug reflecting this against Illumos, so that with any luck some future versions will behave differently. But there are a number of systems in the wild that may never see that fix, and it'd be good (IMO) to have Mutex be correct there too. I look forward to the world where I only have to write code for correct operating systems, but we're not there yet. :-)

Proposed change

Because the correctness of the POSIX lock operation is critical for soundness in Rust, I propose that the PAL check the return code from pthread_mutex_lock, under the banner of "wouldn't you want to know if this assumption were violated?"

Concretely, I'd like to switch that check to a full-fledged assert rather than a debug_assert. I'm writing this bug to open a discussion and see what folks would want to see before this happens, or if anyone has a good alternative. For instance, would folks feel strongly about this change having a platform-specific cfg to enable it, assuming that only a named subset of POSIX platforms will ever use this interpretation of the standard? (I'd be more comfortable opting specific POSIX platforms out of the check, personally, or making it unconditional -- but let's talk!)

As far as I can tell, the posix PAL is used primarily, in terms of numbers, by Darwin. Linux and the common BSDs do not rely on it. Darwin's mutex does not behave the same way as the Solaris-descended ones, so the soundness issue does not affect Darwin. As a result I could imagine opting Darwin out of the test, if folks are concerned about a performance hit.

I'm not sure if the current platform support has any kind of shared set of "POSIX interpretation quirks" that generates features I could use instead of target_os; if that's a thing, point me at it.

Thanks in advance!

@cbiffle cbiffle added the C-bug Category: This is a bug. label Jan 19, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 19, 2024
@ChrisDenton
Copy link
Member

If it causes unsoundness then I think the best thing to do is to have an allow list of known good systems and assume the worst for everything else. The list of "bad" targets is potentially unbounded but we know which targets we've checked and are ok.

@cbiffle
Copy link
Contributor Author

cbiffle commented Jan 19, 2024

That'd be my preference as well, @ChrisDenton.

@Noratrieb Noratrieb added I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-libs Relevant to the library team, which will review and decide on the PR/issue. O-unix Operating system: Unix-like A-thread Area: `std::thread` and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jan 20, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jan 20, 2024
@apiraino apiraino removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jan 23, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 28, 2024
…, r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 28, 2024
…, r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 29, 2024
…r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 29, 2024
…r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 5, 2024
…r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 8, 2024
…r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 9, 2024
…r=Mark-Simulacrum

Always check the result of `pthread_mutex_lock`

Fixes rust-lang#120147.

Instead of manually adding a list of "good" platforms, I've simply made the check unconditional. pthread's mutex is already quite slow on most platforms, so one single well-predictable branch shouldn't hurt performance too much.
@bors bors closed this as completed in 972452c Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-thread Area: `std::thread` C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-unix Operating system: Unix-like T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants