Add note that Vec::as_mut_ptr() does not materialize a reference to the internal buffer #113859
r? @thomcc (rustbot has picked a reviewer for you, use r? to override)
I think this needs some work on wording (CC @rust-lang/opsem) and needs approval by libs-api (for which I'll nominate it). I believe we have tests in the repo that already effectively guarantee this (they'd fail under Miri if this were violated), but on the other hand some decision was made not to add equivalent methods to String in #97483 (although perhaps it seems that might have been premature?)
That's a good question. We are still quite far away from deciding on an aliasing model so we have to state things somewhat indirectly. We could either talk about not having intermediate references as you did, or we could describe the effects of this: when calling |
I think the most troubling example in this space is:
    fn main() {
        let mut s = &mut [0, 1];
        unsafe {
            std::ptr::copy_nonoverlapping(s.as_ptr(), s.as_mut_ptr().add(1), 1);
        }
    }
This program is UB under Stacked Borrows and well-defined in Tree Borrows. I think that's a big win for Tree Borrows. The refactoring to make this work in Stacked Borrows that I encourage is:
    fn main() {
        let mut s = &mut [0, 1];
        let ptr = s.as_mut_ptr();
        unsafe {
            std::ptr::copy_nonoverlapping(ptr, ptr.add(1), 1);
        }
    }
I think this is really unfortunate because the type signature of
Ralf pointed out in this example: #97483 (comment) that Tree Borrows doesn't fully get us out of the "You must hoist calls to
What I would like to tell people is that inserting calls to these methods does not have any aliasing implications. Currently that is true of
So I would be happiest if we had consistency between
In short, I'm concerned that telling anyone about this property of
I do not have any suggestions for how to adjust the docs to alleviate this concern in light of our current need to state things indirectly.
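The hoisting refactor described above can be wrapped in a small helper. This is only a sketch: `copy_first_to_second` is a hypothetical name, not something from the PR, and it assumes the slice has at least two elements.

```rust
/// Hypothetical helper (not part of the PR): copies element 0 of `s`
/// into element 1 using a single hoisted raw pointer.
fn copy_first_to_second(s: &mut [i32]) {
    assert!(s.len() >= 2, "need at least two elements");
    // Call `as_mut_ptr` exactly once, so that the source and the
    // destination pointers are both derived from the same raw pointer.
    let ptr = s.as_mut_ptr();
    unsafe {
        // `ptr` and `ptr.add(1)` address distinct, non-overlapping elements.
        std::ptr::copy_nonoverlapping(ptr, ptr.add(1), 1);
    }
}

fn main() {
    let mut data = [0, 1];
    copy_first_to_second(&mut data);
    assert_eq!(data, [0, 0]);
}
```

Because only one pointer is ever derived from the mutable borrow, this form does not depend on which aliasing model ends up being adopted.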
Notably these methods were added to Vec in #61114, and the motivation was to avoid materializing a reference to the internal buffer. It makes sense that if these methods exist solely for this reason, to document that.
I'm not really convinced by this, mostly because it is already the case that you have to understand a frightening amount about aliasing to avoid footguns in unsafe code (and it's sadly unlikely that this will get much better in the future).
there should be 😄
I agree, we're already in the situation of "if you're using raw pointers you need to understand this borrowing model that is still changing and know caveats about closed and open opsem discussions", having a semi-cryptic note about materializing references does not make things worse for people who are not fully up to speed, but for people who do understand these models it gives them clarity that a piece of code is Actually Correct. (and as I mentioned in the issue, our surface area of useful unsafe things to do that are Definitely Sound is woefully small)
@saethlin so concretely, you are saying we should make this not UB?
    fn main() {
        let s = &mut [0,1,2,3];
        let ptr = s.as_mut_ptr();
        unsafe { ptr.write(97) };
        let ptr2 = s.as_mut_ptr();
        unsafe { ptr2.write(0) };
        unsafe { ptr.write(97) };
    }
I agree that would be nice. But it's not clear how to do that without allowing way more programs than we want to allow... I'll open a thread on Zulip.
The key difference between this example (using arrays/slices) and the ones with Vec/String is that for slices,
For the |
I think this is getting off topic for the discussion at hand. At the end of the day, these methods exist for this purpose, so we might as well document that they serve that purpose.
This was discussed in the libs-api meeting. The conclusion was positive, but that we need much better wording (perhaps a link to our existing documentation on provenance, and/or examples of what is allowed vs not). It will also need a libs-api FCP.
Thanks! The wording I chose here was mostly as an example; do people have ideas as to what the wording should be? (There's already some discussion here)
As a start, it could talk more about what this actually means for users:
/// This method guarantees that when it is called multiple times without
/// the buffer being reallocated in the meantime, the returned pointer will
/// always be exactly the same, even for the purpose of the aliasing model.
/// That means the following is legal:
/// ```rust
/// unsafe {
///     let mut v = vec![0];
///     let ptr1 = v.as_mut_ptr();
///     ptr1.write(1);
///     let ptr2 = v.as_mut_ptr();
///     ptr2.write(2);
///     // Notably, the write to `ptr2` did *not* invalidate `ptr1`:
///     ptr1.write(3);
/// }
/// ```
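To make the suggested wording concrete, here is a minimal runnable sketch of the pattern it describes. It assumes the guarantee proposed in this PR holds; `interleaved_writes` is a hypothetical name used only for illustration.

```rust
/// Hypothetical helper: writes through two pointers obtained from
/// separate `as_mut_ptr` calls, then returns the vector.
fn interleaved_writes() -> Vec<i32> {
    let mut v = vec![0];
    let ptr1 = v.as_mut_ptr();
    let ptr2 = v.as_mut_ptr();
    // The buffer was not reallocated between the calls, so both
    // pointers have the same address.
    assert_eq!(ptr1, ptr2);
    unsafe {
        ptr1.write(1);
        ptr2.write(2);
        // Assumption under discussion: obtaining and writing through
        // `ptr2` did not invalidate `ptr1`.
        ptr1.write(3);
    }
    v
}

fn main() {
    assert_eq!(interleaved_writes(), vec![3]);
}
```

Running a snippet like this under Miri is one way to check it against the aliasing models Miri currently implements.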
…he internal buffer
845f633 to 778fdf2
library/alloc/src/vec/mod.rs
Outdated
/// let mut v = vec![0];
/// let ptr = v.as_ptr();
/// let x = ptr.read();
/// v[0] = 5;
I tried to write a similar example for `as_ptr` and wanted to avoid using `as_mut_ptr()`, but I think this is also unsound, yes? @RalfJung
I would say it is up to t-libs if they want to guarantee this -- basically this means that the pointer returned here was not derived via a shared reference. By analogy with `addr_of!`, I think we want to allow this code. (Is it worth mentioning that analogy?)
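A conservative sketch of what such an `as_ptr` guarantee would permit, assuming libs-api adopts it: a pointer from `as_ptr` is read, a different element is written through `as_mut_ptr`, and the first pointer is read again. `read_around_write` is a hypothetical name, and the write deliberately targets another element than the one being read.

```rust
/// Hypothetical helper: reads element 0 before and after a raw write
/// to element 2, returning both reads and the final vector.
fn read_around_write() -> (i32, i32, Vec<i32>) {
    let mut v = vec![0, 1, 2];
    unsafe {
        let ptr1 = v.as_ptr();
        let before = ptr1.read();
        // Write to a different element through a mutable raw pointer.
        let ptr2 = v.as_mut_ptr().add(2);
        ptr2.write(20);
        // Assumption under discussion: `ptr1` is still valid for reads.
        let after = ptr1.read();
        (before, after, v)
    }
}

fn main() {
    let (before, after, v) = read_around_write();
    assert_eq!((before, after), (0, 0));
    assert_eq!(v, vec![0, 1, 20]);
}
```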
986595a to 21906f9
@RalfJung I think this is sufficiently clear without committing too much or explaining a thesis' worth of semantics, thoughts?
I suppose this needs T-libs-api FCP and not T-libs, so I should reassign it. r? @m-ou-se
@rust-lang/libs-api: We discussed this pull request in the library API meeting on July 25 and had no objections, other than improving the original wording with examples, which has been done.
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
🔔 This is now entering its final comment period, as per the review above. 🔔
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This will be merged soon.
Thanks!
@bors r+
☀️ Test successful - checks-actions
Finished benchmarking commit (cedbe5c): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 631.918s -> 630.078s (-0.29%)
See discussion on thomcc/rust-typed-arena#62 and t-opsem
This method already does the correct thing here, but it is worth guaranteeing that it does so it can be used more freely in unsafe code without having to worry about potential Stacked/Tree Borrows violations. This moves one more unsafe usage pattern from the "very likely sound but technically not fully defined" box into "definitely sound", and currently our surface area of the latter is woefully small.
I'm not sure how best to word this, opening this PR as a way to start discussion.
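As an illustration of the kind of unsafe usage pattern the description refers to, here is a minimal sketch that initializes a vector through repeated `as_mut_ptr` calls and then sets its length. The function name `squares` is hypothetical, and the repeated pointer fetches inside the loop rely on the guarantee this PR proposes to document.

```rust
/// Hypothetical example: builds `[0, 1, 4, 9, ...]` by writing into the
/// spare capacity of a `Vec` through raw pointers.
fn squares(n: usize) -> Vec<u64> {
    let mut v: Vec<u64> = Vec::with_capacity(n);
    for i in 0..n {
        // Fetching the pointer again on each iteration is assumed not to
        // invalidate earlier pointers, since no reference to the buffer
        // is materialized and the buffer is never reallocated here.
        let p = v.as_mut_ptr();
        unsafe {
            // `i < n <= capacity`, so the write stays inside the allocation.
            p.add(i).write((i as u64) * (i as u64));
        }
    }
    // All `n` elements were initialized by the loop above.
    unsafe { v.set_len(n) };
    v
}

fn main() {
    assert_eq!(squares(4), vec![0, 1, 4, 9]);
}
```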