-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Convention for constructing lifetime-bound values from raw pointers #556
Conversation
fn inner_str(&self) -> &[u8] { | ||
unsafe { | ||
let p = ffi::outer_get_inner_str(&self.raw); | ||
let s = std::slice::from_raw_buf(p, libc::strlen(p)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the indentation on this line is a tiny bit off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
Unfortunately Here's an example from real code in impl<T: Send + Sync> Deref<T> for Thunk<T> {
fn deref(&self) -> &T {
self.force();
match *self.inner {
// We can use `copy_lifetime` here because the `val` borrow
// doesn't prevent us using `self`.
Evaluated(ref val) => unsafe { mem::copy_lifetime(self, val) },
// We just forced this thunk.
_ => unsafe { debug_unreachable!() }
}
}
}
impl<T: Send + Sync> DerefMut<T> for Thunk<T> {
fn deref_mut(&mut self) -> &mut T {
self.force();
match *&mut*self.inner {
// We can't use copy_mut_lifetime here because self is already
// borrowed as &mut by val.
//
// Changing it to only require an `&` reference wouldn't change this.
Evaluated(ref mut val) => unsafe { mem::transmute(val) },
// We just forced this thunk.
_ => unsafe { debug_unreachable!() }
}
}
} Changing the copy-from parameter to a non-mutable reference doesn't help these cases. |
@reem True, but:
Here's an example from grust (which has served as a testbed for this proposal): pub unsafe fn from_raw_mut<'a, T, Sized? U>(ptr: *mut <T as Wrapper>::Raw,
_life_anchor: &'a U)
-> &'a mut T
where T: Wrapper
{
// could be mem::copy_mut_lifetime(_life_anchor, &*(ptr as *mut T))
mem::transmute(&*(ptr as *mut T))
} |
Instead of obtaining the lifetime from a raw pointer, provide it explicitly in the manner of rust-lang/rfcs#556.
@mzabaluev another alternative you may wish to mention is to remove the "lifetime anchor" altogether. I've had mixed experiences in libraries trying to require a lifetime anchor here and there. Sometimes it works out great but other times it ends up just being a pain most of the time. Along the same lines, do you think it's worth having the anchors (in some fashion, not necessarily the current fashion) over not having any anchors at all? |
@alexcrichton I should put that in. But pretty much the only context where such functions - no input lifetime parameters, but a lifetime on the return value - can be used without explicit lifetime annotation is return values, isn't it? That may be more pain than the need to use an anchor; in effect, for that to be usable you'd often need a wrapper function to supply the anchor anyway. In my experience, the anchor in many cases will be Can you point at some negative examples? Also, why it was decided to pass the raw pointers by reference in the first place? |
At the behest of Alex Crichton.
The signature I was thinking of was: fn c_str_to_bytes<'a>(ptr: *const c_char) -> &'a [u8] { ... } Where in this case
One example I have personally looks like: struct Foo<'a> { ptr: *mut T, lt: marker::ContravariantLifetime<'a> }
impl<'a> Foo<'a> {
fn bar(&self) -> &'a int {
unsafe {
let ptr = &(*self.ptr).bar;
// how to create promote `ptr` to the `'a` lifetime?
}
}
} Basically the lifetime is injected via some marker and there's not actual reference with the lifetime I want to return a value from. I do realize though that this is probably rare.
Initially these functions looked like the example I have above where the lifetime wasn't tied to anything at all and we were thinking that to help curb at least some of the unsafety that we would tie it to something. We didn't necessarily forsee some of the pitfalls you've outlined in this RFC, however. |
Then that lifetime would have to be explicitly annotated, unless it can be inferred from e.g. a return value type when the call is in such context. I have described that as an alternative, with the drawbacks outlined. In your example, could self be annotated with lifetime impl<'a> Foo<'a> {
fn bar(&'a self) -> &'a int {
unsafe {
let ptr = &(*self.ptr).bar;
std::mem::copy_lifetime(self, &*ptr)
}
}
} |
It does not actually require explicit annotation, unlink a generic the compiler will not throw an error if it cannot infer a lifetime, it will just default to I have personally found that in 99% of cases that the actual lifetime is being driven by a return value (as the unsafety is normally contained to a small method), which is the source of me questioning the need for an anchor at all.
Not quite as this means that |
@alexcrichton: Indeed, I underestimated Rust's type inference. The problem with unanchored return lifetime is that it has even worse potential for "convenient evil" than the current convention. Especially if names for such functions are also shorter.
In your example, a shorter lifetime on There may be less need for custom reference types if #709 is implemented: if |
I've been thinking about this recently and I'm not actually entirely convinced any more that it's so much more dangerous. In my example I gave above, I'm likely to end up using In other examples where you do have an anchor it may still be convenient to use I suppose what I've been thinking recently is that inference of the actual lifetime would largely be driven by signatures which should be carefully scrutinized regardless. I am a bit up in the air though, I haven't quite made up my mind! |
@alexcrichton The question is essentially about how much safety rope do we want to offer with these unsafe methods, above |
Ok, so to recap a bit, I think we have three routes proposed so far:
So with these three options, I'd like to take a look at what these functions are actually used for:
I believe that all three options above will satisfy the basic needs of these two pieces of functionality, and the only question is how they communicate the output lifetime. Given this, I want to try to write out the cons of each approach:
None of these seem to be a clear winner to me, and they all seem to have their footguns. I think that this may largely be a question of what footguns we want to buy into in some sense. I think I'm personally leaning towards option (3) because it provides the least amount of reason to use a raw |
@alexcrichton I'm becoming convinced, too, that option(3) is a valid approach, if different on the balance of usability vs safety. Changing from that to the status quo reduced usability, but fell short in providing useful safety semantics. The documentation can warn about possible pitfalls and suggest practices to avoid them. That solution invalidates the essential parts of this RFC. I can recycle the (so far) non-controversial bits into another proposal: revert to by-value input signatures and fix |
RFC PR rust-lang#556 is likely to get discarded.
I think that actually all changes still fall under the purview of this RFC, going by the title:
We're still developing the lifetime parameter conventions! Feel free to just re-word the RFC in-place, I'm totally fine with that! (including the |
I am strongly in favour of option 3 at this point in time. raw ptrs already deref to &'static, so if you're using raw ptrs you already have to deal with this problem. Having some APIs randomly try to force you to think about it is pointless book-keeping for someone who has opted into the problems of raw manipulation. In my experience, the only problem with lifetimes and raw types are when you store raw ptrs and forget to include a marker, and this should be solved by lints and #738. |
Rewritten the RFC to reflect the emerging consensus.
@alexcrichton I have rewritten the RFC, thanks a lot for your input. Now, assuming this is accepted, would you or anyone in the core team have time to implement it any time soon? This feels like something I could try to do myself in a branch, but as it probably involves much manual work, I wouldn't want the effort to be duplicated. |
Thanks @mzabaluev! I know I'd at least personally be up for implementing this if you're short on time, but you'd certainly have first dibs :) |
One thing I've never really been clear on: Is the signature
really any more expressive than
? As far as I can tell, the former basically always spits out a static lifetime, which usually just unifies to something reasonable at a later function boundary, or when put in something that expects a particular lifetime. I suppose one could manually specify |
@gankro The lifetime parameter may help catch errors when specified explicitly. Also semantically, I prefer the function signature saying "the lifetime is anything you want it to be" rather than "we say it's static, but you gotta remember it can be shorter than that". |
The emerging consensus in Rust [RFC PR #556][RFC] is to allow freely inferred lifetimes. [RFC]: rust-lang/rfcs#556
I've talked this over with the rest of the team, and we've got pretty broad support for this change, so I'm going to merge this now. Thanks again for writing this all up @mzabaluev! |
New functions, `slice::from_raw_parts` and `slice::from_raw_parts_mut`, are added to implement the lifetime convention as agreed in rust-lang/rfcs#556. The functions `slice::from_raw_buf` and `slice::from_raw_mut_buf` are left deprecated for the time being. Holding back on changing the signature of `std::ffi::c_str_to_bytes` as consensus in rust-lang/rfcs#592 is building to replace it with a composition of other functions. Contribution to rust-lang#21923.
New functions, `slice::from_raw_parts` and `slice::from_raw_parts_mut`, are added to implement the lifetime convention as agreed in rust-lang/rfcs#556. The functions `slice::from_raw_buf` and `slice::from_raw_mut_buf` are left deprecated for the time being. Holding back on changing the signature of `std::ffi::c_str_to_bytes` as consensus in rust-lang/rfcs#592 is building to replace it with a composition of other functions. Contribution to #21923.
Establish a convention throughout the core libraries for unsafe functions
constructing references out of raw pointers. The goal is to improve usability
while promoting awareness of possible pitfalls with inferred lifetimes.
Remove unnecessary mutability in
copy_mut_lifetime
while we are at it.Rendered
NOTE: much of the discussion below concerns earlier revisions where lifetime anchor parameters were proposed.