-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New RFC: Collection Transmute #2756
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
- Feature Name: collection-transmute | ||
- Start Date: 2019-09-05 | ||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Add a `transmute(self)` method to `Vec`, `VecDeque` and `BinaryHeap`. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The mentioned types cannot safely be transmuted by `mem::transmute`. Adding a | ||
method for that purpose will hopefully discourage users to try anyway. | ||
|
||
E.g. the following code is UB: | ||
|
||
```rust | ||
let x = vec![0u32; 2]; | ||
let y = unsafe { std::mem::transmute::<_, Vec<[u8; 4]>>(x) }; | ||
``` | ||
|
||
This is explained in the docs for [`Vec`], but with an unsound solution. The | ||
way to do this correctly turns out surprisingly tricky: | ||
|
||
```rust | ||
let x = vec![0u32; 2]; | ||
let y = unsafe { | ||
let y: &mut Vec<_> = &mut *ManuallyDrop::new(x); | ||
Vec::from_raw_parts(y.as_mut_ptr() as *mut [u8; 4], | ||
y.len(), | ||
y.capacity()) | ||
}; | ||
``` | ||
|
||
Though the code is not too large, there are a good number of things to get | ||
wrong – this solution was iteratively created by soundness-knowledgeable | ||
Rustaceans, with multiple wrong attempts. So this method seems like a good | ||
candidate for inclusion into `std`. | ||
|
||
This also applies to `VecDeque` and `BinaryHeap`, which are implemented in | ||
terms of `Vec`. | ||
|
||
[`Vec`]: https://doc.rust-lang.org/std/vec/struct.Vec.html | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
The types `std::vec::Vec`, `std::collections::VecDeque` and | ||
`std::collections::BinaryHeap` all get a new unsafe `transmute` method that | ||
takes `self` by value and returns a new `Vec`/`VecDeque`/`BinaryHeap` with a | ||
caller-chosen item type. | ||
|
||
The API might look like this (exemplified for `Vec`): | ||
|
||
```rust | ||
impl<T> Vec<T> { | ||
/// Transmute this `Vec` to a different item type | ||
/// | ||
/// # Safety | ||
/// | ||
/// Calling this function requires the target item type be compatible with | ||
/// `Self::Item` (see [`mem::transmute`]). | ||
/// | ||
/// # Examples | ||
/// | ||
/// transmute a `Vec` of 32-bit integers to their byte representations: | ||
/// | ||
/// ``` | ||
/// let x = vec![0u32; 5]; | ||
/// let y = unsafe { x.transmute::<[u8; 4]>() }; | ||
/// assert_eq!(5, y.len()); | ||
/// assert_eq!([0, 0, 0, 0], y[0]); | ||
/// ``` | ||
/// | ||
/// [`mem::transmute`]: ../../std/mem/fn.transmute.html | ||
unsafe fn transmute<I>(self) -> Vec<I> { | ||
.. | ||
} | ||
``` | ||
|
||
This would mean our example above would become: | ||
|
||
```rust | ||
let x = vec![0i32; 2]; | ||
let y = unsafe { x.transmute::<[u8; 4]>() }; | ||
``` | ||
|
||
The documentation of `mem::transmute` should link to the new methods. | ||
|
||
A clippy lint can catch offending calls to `mem::transmute` and suggest using | ||
the inherent `transmute` method where applicable. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
The implementation for `Vec` copies the above solution. The methods for | ||
`VecDeque` and `BinaryHeap` use the `Vec` method on their data. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Adding a new method to `std` increases code size and needs to be maintained. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While this increases the amount of code in std::collections, these generic methods should not incur a code-size penalty if they are not used. |
||
However, this seems to be a minor inconvenience when compared to the safety | ||
benefit. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
As explained above, the method is useful, yet non-obvious. There were multiple | ||
wrong implementation attempts before that were not chosen for being unsound. | ||
|
||
We could do nothing, but this means users wanting to transmute the items of a | ||
`Vec` will be left without an obvious solution which will lead to unsound code | ||
if they get it wrong. | ||
|
||
We could document the correct solution instead of putting it into `std`. This | ||
would lead to worse ergonomics. | ||
|
||
We could create a trait to hold the `transmute` method. This would allow more | ||
generic usage, but might lead to worse ergonomics due to type inference | ||
uncertainty. | ||
|
||
It would even be possible to provide a default implementation using | ||
`mem::transmute`, but having a default implementation that might be unsound for | ||
some types is a footgun waiting to happen. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
`mem::transmute` offers the same functionality for many other types. We have | ||
added similar methods to different types where useful, see the various | ||
iterator-like methods in `std`. @Shnatsel and @danielhenrymantilla came up | ||
with the solution together in a | ||
[clippy issue](https://github.com/rust-lang/rust-clippy/issues/4484). | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- are there other types that would benefit from such a method? What about | ||
`HashSet`, `HashMap` and `LinkedList`? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the argument against doing this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @KrishnaSannasi
We define transmute between two types to be semantically equivalent to a bitwise move. While reading the RFC it was unclear to me whether the semantics of the proposed methods are a bitwise move of the collection types, or of the elements themselves. For the types covered by the RFC, even if the semantics are to perform a bitwise move of the element types, we can optimize that and make the operations O(1) in time and space, so there isn't really a difference in practice. Whether we can do this for HashMap and HashSet would depend on what the semantics are. For example, if the semantics are to perform a bitwise transmute of each element, then we can just implement these transmutes as a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would expect these to be O(1), we could add other methods like |
||
- would it make sense to add implementations to types where `mem::transmute` is | ||
acceptable, to steer people away from the latter? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure what you mean by this. Could you elaborate what you mean by "types where There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mean types where it isn't insta-UB like |
||
- this RFC does not deal with collection types in crates such as | ||
[`SmallVec`](https://docs.rs/smallvec), though it is likely an implementation | ||
in `std` might motivate the maintainers to include similar methods. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW I don't think this is an unresolved question. It is kind of up to those crates to keep up with the libstd collection API. It is normal that they will lag a bit behind, but for the APIs proposed here keeping up is only one PR away. |
||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
The author cannot think of anything not already outlined above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to just say something like:
instead of using words like "compatible" which might not have a clear meaning for everybody.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @rust-lang/wg-unsafe-code-guidelines