std::os::unix::process::CommandExt::before_exec should be unsafe #39575
Could you clarify the unsafety that can happen here? We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.
It's probably best to treat anything that is undefined according to POSIX as a memory safety violation. Once POSIX says that you cannot assume anything about the process's behavior, you need to assume that memory safety is violated as well.

I'll try to give two examples of why this facility is unsafe in practice. I have to argue based on the glibc implementation, but we can hopefully assume that it is a valid implementation, matching the POSIX requirements in play here.

First, assume that you have a PRNG for use in cryptography written in Rust. This PRNG uses locks or thread-local variables to protect its internal state. It does not have fork protection because there is no way to implement that in Rust. This PRNG will return the same sequence of bytes in the forked child (and thus inside the `before_exec` callback) as in the parent process, which silently breaks its cryptographic guarantees.

For the second problem, we look at the implementation of recursive mutexes in glibc, which track the owning thread by its TID. The lock attempt will spuriously succeed because a thread in the forked child can end up with the same TID as the thread that owned the mutex when `fork` happened, so the acquisition looks recursive even though the protected data was snapshotted in a half-updated state.

I think the second example at least is quite compelling.
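To make the first example concrete, here is a minimal sketch (my illustration, not code from the thread; the `STATE` and `next_byte` names are invented): a lock-protected generator with no fork awareness hands the forked child an exact copy of its state, so the child and the parent produce the same "random" output.

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::Mutex;

// Stand-in for a CSPRNG whose state is guarded by a lock but has no fork protection.
static STATE: Mutex<u64> = Mutex::new(0x243F_6A88_85A3_08D3);

fn next_byte() -> u8 {
    let mut s = STATE.lock().unwrap();
    // xorshift step; a real CSPRNG would be just as oblivious to fork().
    *s ^= *s << 13;
    *s ^= *s >> 7;
    *s ^= *s << 17;
    (*s & 0xff) as u8
}

fn main() {
    Command::new("true")
        .before_exec(|| {
            // Runs in the forked child: its copy of STATE is identical to the
            // parent's, so this prints the byte the parent will produce next.
            println!("child byte:  {}", next_byte());
            Ok(())
        })
        .status()
        .unwrap();
    println!("parent byte: {}", next_byte());
}
```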
That'd do it!
This is not marked as unsafe today and to do so would be backwards incompatible; however, a quick search on GitHub didn't seem to turn up any code that uses it.
The libs team discussed this a few weeks ago at the last triage and the conclusion was that we'd prefer to see concrete evidence (e.g. a proof-of-concept) showing the memory unsafety here. At that point we'll consider remediations, but for now we're fine just adding some documentation that "care should be taken".
In general, anything involving files may not behave as expected. For example, a concrete memory safety issue would be a memory-mapped allocator that assumes the memory-mapped file can't be accessed by multiple processes (e.g., because the file was created in a way that no other process can open it). Other shenanigans are possible as well.
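As one minimal sketch of the "files may not behave as expected" point (my illustration, not from the thread, and showing surprising behavior rather than memory unsafety): after `fork`, the child shares open file descriptions with the parent, so a seek performed inside the `before_exec` closure silently moves the parent's read position.

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};
use std::os::unix::process::CommandExt;
use std::process::Command;

fn main() -> std::io::Result<()> {
    // Assumes /etc/hosts exists and is non-empty (true on typical Unix systems).
    let mut f = File::open("/etc/hosts")?;
    // dup'ed handle: shares the same underlying file description (and offset) as `f`.
    let mut handle = f.try_clone()?;

    Command::new("true")
        .before_exec(move || {
            // Runs in the forked child, but the offset lives in the shared
            // file description, so the parent's `f` is affected too.
            handle.seek(SeekFrom::End(0))?;
            Ok(())
        })
        .status()?;

    let mut buf = String::new();
    f.read_to_string(&mut buf)?;
    // Likely prints 0: the child moved the shared offset to end-of-file.
    println!("parent read {} bytes", buf.len());
    Ok(())
}
```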
Not to denigrate theoretical concerns (which is to say, just because one hasn't yet demonstrated that this violates memory safety doesn't mean that we're going to reject this out of hand, and it doesn't mean that it won't eventually become a severe bug somehow), but I second Alex's desire to see a working proof-of-concept in code.
Some prior art
Feels similar to me.
I honestly don't have time to write a convincing POC, but this breaks all wayland implementations (assuming memory corruption counts as a safety bug) because wayland clients and servers communicate through memory-mapped files. Any attempt (accidental or otherwise) to update a wayland buffer from within a `before_exec` callback can corrupt memory shared with the other side.

Those aren't quite the same. Unlike those cases, this affects a fundamental Rust concept: move/copy semantics. Without `fork`, safe code can assume that a value of a non-Copy type exists in exactly one place; because `before_exec` runs in a forked child, every such value is suddenly duplicated.
@alexcrichton I'd like to see all soundness issues with an associated priority tag. Does your comment above imply that this is considered P-low? Alternatively, if you don't consider this a soundness issue at all, would you like to remove the I-unsound label?
Ah yes, the libs team decided this was basically P-medium, so I will tag as such.
Even if there were no way to trigger memory unsafety today, at any time a change to libpthreads or libc or the operating system in the future could cause memory unsafety where there was previously none. Accordingly, I don't think blocking the change on a PoC makes sense, and further I think requiring somebody to make a PoC before the change is made would be a waste of those people's time. Any time we trigger undefined behavior, memory unsafety should just be assumed.
BTW, has anyone considered that on macOS, if an application is using libdispatch, it must not call any libdispatch code after fork? I don't have a PoC right now, but I think it would not be difficult to create one.
I concur with the arguments here that this is a different beast than the prior art mentioned above.
With some OS help one can always cause memory unsafety:

```rust
#![feature(getpid)]

fn hacky_ptr_write<T>(ptr: &T, value: u32) {
    std::process::Command::new("gdb")
        .arg("-batch").arg("-quiet")
        .arg("-pid").arg(format!("{}", std::process::id()))
        .arg("-ex")
        .arg(format!("set {{unsigned long}}{:p}={}", ptr, value))
        .output()
        .unwrap();
}

fn main() {
    let q = &mut Box::new(55);
    hacky_ptr_write(&q, 0);
    *q = Box::new(44); // Segmentation fault
}
```

But this does not mean that spawning processes should be considered unsafe.
The problem here is that the type-based assumption that a non-Copy type can't be copied is broken (even though the copies end up in different processes). Any code relying on this assumption is potentially incorrect, and any unsafe code relying on it is potentially memory unsafe. However, given the "different process" constraint, this bug necessarily involves the operating system. Yes, you can usually shoot yourself in the foot using the OS†. However, you generally have to explicitly request this behavior. In this case, you can take two apparently safe and independent APIs (e.g., wayland and `before_exec`) and end up with memory unsafety without ever asking for it.

†It's actually harder than you might think as long as you aren't root. Most modern Linux distros, at least, don't allow non-root processes to debug (or access/modify the memory of) non-child processes.
I expect there to be some … Does the question boil down to "Should every …?" During …?
Yes. It memory maps a shared buffer.
Not unless you were to move the shared mmap functionality from wayland into std. It's not really about where the code lives but what are valid assumptions and what are not (although any assumptions made by std are assumed to be valid).
Pretty much, yes. If we do say "this API is fine", we should also provide a fork function (the same way we made `mem::forget` a safe function).
From an outside perspective, the value is copied.
Just what is typically needed. Consider a C program:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    int fd = open("myfile.txt", O_RDONLY);
    void* mem = malloc(300);
    if (fork()) {
        close(fd); // first "drop" of fd
        free(mem); // first "drop" of mem
    } else {
        close(fd); // second "drop" of fd
        free(mem); // second "drop" of mem
    }
}
```

There are two closes of the same fd and two frees of the same allocation, yet there is no double-close or double-free: `fork` duplicated the file descriptor table and the heap, so each process releases its own copy.

What exactly happens in Wayland if an evil peer (or even an evil Wayland server) deliberately corrupts our buffer? Shouldn't a Rust program avoid trusting memory-mapped buffers that are writable from outside anyway? I expect storing pointers inside a memory-mapped file is a bad idea in any case.
What function?
There's also this comment from the implementation of `exec`:

```rust
// Currently we try hard to ensure that the call to `.exec()` doesn't
// actually allocate any memory. While many platforms try to ensure that
// memory allocation works after a fork in a multithreaded process, it's
// been observed to be buggy and somewhat unreliable, so we do our best to
// just not do it at all!
```

If this is true, it seems pretty clear that this function should be considered unsafe. If it's not true, then there's a whole lot of unnecessary and complex unsafe code in the standard library.
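For illustration, here is a sketch (mine, not from the thread) of the hazard that the quoted comment is guarding against: if another thread holds the allocator lock at the moment of `fork`, the child inherits a locked allocator with no thread left to unlock it. Whether this actually hangs depends on the platform's allocator, which is exactly the "buggy and somewhat unreliable" behavior the comment describes; the program below demonstrates a possible hang, not a deterministic one.

```rust
use std::hint::black_box;
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::thread;

fn main() {
    // Keep several threads allocating constantly so the allocator lock is often held.
    for _ in 0..4 {
        let _ = thread::spawn(|| loop {
            black_box(Vec::<u8>::with_capacity(4096));
        });
    }
    Command::new("true")
        .before_exec(|| {
            // Runs between fork and exec in the child; this allocation may block
            // forever on allocator state inherited from the parent, depending on
            // the platform's libc.
            let _b = Box::new([0u8; 1024]);
            Ok(())
        })
        .status()
        .unwrap();
    println!("did not hang this time");
}
```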
So crater found 32 root regressions. Looks like if we decide we want to take a stance on tracking ownership of external resources, we'd have to go the gradual-deprecation route.
Thanks for gathering the data @RalfJung! This is a nominated issue for the next T-libs meeting, where we can try to discuss and settle on a new name. Do others have suggestions for what to call this method instead? Some ideas I might have are: …
Ralf couldn't participate on the 29th.
My preference is to deprecate the method temporarily with:

```rust
#[deprecated =
    "This method will become unsafe because $reason.\
     Check whether what you are doing is safe and then wrap the call in `unsafe { ... }`."]
```

And then we can let, say, 3-6 months pass while we file fixing PRs; after that we make the method an `unsafe fn`. I don't mind having an extra method, but I would like the existing one to become `unsafe` eventually as well.
This was discussed briefly with the libs team during today's triage, and there was no objection to moving forward with a deprecation. The leading contender for a name was `pre_exec`. Moving this issue forward largely just needs someone to champion the PR and implementation to push it through!
Not my call to make, but I really do not like the idea of just appending a counter to method names; that made a lot of code in Python very ugly to begin with. If you want to find some better names, here are similar APIs in other languages: the function with the same behavior in Python is called `preexec_fn`.
If editions could be used to solve this in the future, that would be a huge benefit. (I realise this is difficult given that libstd has to only be compiled once, but perhaps there is a way to attach some metadata to the function signature so that the compiler can do the mapping.) E.g. an attribute like:

```rust
#[rename("before_exec", edition = 2018)]
fn before_exec2(...)
```
@Diggsey ostensibly you could do that with "edition visibility".
This makes sense for someone coming from today's Rust, but in a future where …
We can have "soft unsafe" operations which only produce a (deny-by-default) lint stating that one should wrap the call in an `unsafe` block. Implementing such a scheme in the unsafe checker is quite simple. I am volunteering to write a PR if such a change is desired (or if we just want to see how such a change would look).
Like what happens when taking a reference to a field of a `repr(packed)` struct?
I think there's definitely enough valid pushback to not call it `before_exec2`.

@oli-obk I think I'd personally prefer to see a different function, because there's also the matter of creating a function pointer to this method: today that is safe and creates a safe function pointer, but afterwards it would ideally need to be safe and create an unsafe function pointer, whereas in practice it would have to unsafely create a safe function pointer.
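A small illustration of the function-pointer point (my sketch, not std's actual signatures): whether a function is `unsafe` is part of the type of any function pointer taken to it, so flipping an existing safe function to `unsafe` changes the type that such pointers must have.

```rust
fn safe_hook() {}
unsafe fn unsafe_hook() {}

fn main() {
    // A safe fn coerces to a safe function pointer...
    let a: fn() = safe_hook;
    // ...but an unsafe fn only coerces to an unsafe function pointer:
    // let b: fn() = unsafe_hook;   // ERROR: expected normal fn, found unsafe fn
    let c: unsafe fn() = unsafe_hook;

    a();
    // Callers of the unsafe pointer must now write an `unsafe` block.
    unsafe { c() };
}
```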
The language team agreed that `before_exec` should be unsafe, and leaves the details of a transition plan to the libs team.
Speaking only for myself: I like …
I do feel like there is a meta question lurking here of unsafe composition and how we should manage it. For now, it seems good to err on the side of caution where possible and avoid thorny questions, but, as has been amply demonstrated here, it's sort of hard to tell what the limits ought to be on what unsafe code can and cannot do when it comes to e.g. external resources. I wonder if at some point we're going to want to try and allow unsafely implemented libraries to declare more precisely the kinds of things they rely on other libraries not to do.
deprecate before_exec in favor of unsafe pre_exec

Fixes rust-lang#39575. As per the [lang team decision](rust-lang#39575 (comment)):

> The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

Cc @alexcrichton @rust-lang/libs, how would you like to proceed?
Here is a simple example showing memory unsafety in the presence of `before_exec`:

```rust
#![feature(thread_id_value)]
use std::sync::atomic::{AtomicU64, AtomicPtr};
use std::sync::atomic::Ordering::SeqCst;
use std::{thread, ptr};

pub fn f() {
    static I: AtomicU64 = AtomicU64::new(0);
    static D: AtomicPtr<i32> = AtomicPtr::new(ptr::null_mut());

    let data = Box::leak(Box::new(0));
    let tid = thread::current().id().as_u64().get();

    match I.load(SeqCst) {
        0 => {
            I.store(tid, SeqCst);
            /*
             * Assumption for safety: No call to `f` can have the same `tid` while we're
             * in the critical section that spans this comment.
             */
            D.store(data, SeqCst);
        },
        n if n == tid => {
            /* D has been set to a valid pointer because of the assumption above */
            let _v = unsafe { *D.load(SeqCst) };
        },
        _ => { },
    }
}
```
The `before_exec` method should be marked unsafe because, after a fork from a multi-threaded process, POSIX only allows async-signal-safe functions to be called. Otherwise, the result is undefined behavior.
The only way to enforce this in Rust is to mark the function as unsafe.
So to be clear, this should not compile:
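The original snippet is not preserved here, but a hedged reconstruction of the kind of call presumably meant looks like this: the closure runs between `fork` and `exec`, yet today it may contain arbitrary safe Rust (including code that allocates or takes locks, which POSIX does not permit after a fork of a multithreaded process) without any `unsafe` block.

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

fn main() {
    Command::new("ls")
        .before_exec(|| {
            // Arbitrary safe Rust is accepted here today; even this println!
            // takes the stdout lock in the forked child.
            println!("about to exec");
            Ok(())
        })
        .status()
        .unwrap();
}
```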