Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more docs - mostly warnings - to std::mem::transmute #34609

Merged
merged 15 commits into from
Jul 27, 2016
186 changes: 182 additions & 4 deletions src/libcore/intrinsics.rs
Original file line number Diff line number Diff line change
Expand Up @@ -278,21 +278,199 @@ extern "rust-intrinsic" {
/// Moves a value out of scope without running drop glue.
pub fn forget<T>(_: T) -> ();

/// Unsafely transforms a value of one type into a value of another type.
/// Reinterprets the bits of a value of one type as another type. Both types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first paragraph needs to be a single sentence as it's used as a short summary on the module page and search results.

/// must have the same size. Neither the original, nor the result, may be an
/// [invalid value]
/// (https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link should be ../../nomicon/meet-safe-and-unsafe.html so it works in the offline docs.

///
/// Both types must have the same size.
/// `transmute::<T, U>(t)` is semantically equivalent to the following:
///
/// ```
/// use std::{mem, ptr};
/// // assuming that T and U are the same size
/// unsafe fn transmute<T, U>(t: T) -> U {
/// let mut u: U = mem::uninitialized();
/// ptr::copy_nonoverlapping(&t as *const T as *const u8,
/// &mut u as *mut U as *mut u8,
/// mem::size_of::<T>());
/// mem::forget(t);
/// u
/// }
/// ```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this would be more useful to most users if explained intuitively, rather than by giving code.

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one "explain intuitively" this concept. For comparison, "Unsafely transforms" is what the previous docs had. Many people think it does a reinterpret_cast. I think it's necessary to have a real code example, in order to tell one what's actually happening. (I'm not just being thick, if you have a suggestion, please make it. I'm actually not sure how to explain it intuitively).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like "transmute returns a bitwise copy of the input. Before returning, the input's memory is freed without calling any destructor or otherwise tidying up"

///
/// `transmute` is incredibly unsafe. There are a vast number of ways to
/// cause undefined behavior with this function. `transmute` should be
/// the absolute last resort.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I object to the tone here, I think it will serve only to annoy readers, rather than warn them off. Rather than an implicit argument from authority, it would be better to say why transmute is unsafe and how it can cause UB and why that is bad.

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps. However, we already have the nomicon, and that is the point of it. We want to keep away people however possible. And, if you know what you're doing, you're not reading this blurb.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to have a sentence here which briefly describes why transmute is unsafe, and hopefully inspires the reader to go to the nomicon to find out more. We can't expect that every reader will also read the nomicon.

We want to keep away people however possible

I don't agree. We want to give users enough information to make an informed choice and to properly understand the trade-offs they are making with that choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess... I've just seen so many people use it poorly, making their Rust code less safe.

///
/// The [nomicon](https://doc.rust-lang.org/nomicon/transmutes.html) has
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ollie27 thank you for these :)

/// more complete documentation. Read it before using `transmute`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Read it before using transmute" is unnecessary and patronising

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's unnecessary. You should read these docs before using transmute, transmute is one of the most Undefined Behavior-causing, misused parts of Rust. It's an advanced part of Rust, which should almost never be (mis)used by beginners.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but the fact that there is a reference to it here in the docs implies that you should read it, you don't need to spell it out too. I think it will make readers less likely to actually read the reference because they will be irritated.

///
/// # Alternatives
///
/// There are very few good cases for `transmute`. Most can be achieved
/// through other means. Some more or less common uses, and a better way,
/// are as follows:
///
/// Turning a pointer into a `usize`:
/// ```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put a blank line between the triple backticks and the descriptions? This applies throughout the file.

/// let ptr = &0;
/// let ptr_num_transmute = mem::transmute::<&i32, usize>(ptr);
/// // Use `as` casts instead
/// let ptr_num_cast = ptr as *const i32 as usize;
/// ```
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the chaining of as to "hop" between types. as is not transitive; it requires explicit conversions between each step. (or something, to give intuition as what in the world is going on here :-D )

/// Turning a `*mut T` into an `&mut T`:
/// ```
/// let ptr: *mut i32 = &mut 0;
/// let ref_transmuted = mem::transmute::<*mut i32, &mut i32>(ptr);
/// // Use reborrows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Use a reborrow"

/// let ref_casted = &mut *ptr;
/// ```
///
/// Turning an `&mut T` into an `&mut U`:
/// ```
/// let ptr = &mut 0;
/// let val_transmuted = mem::transmute::<&mut i32, &mut u32>(ptr);
/// // Now let's put together `as` and reborrowing
/// let val_casts = &mut *(ptr as *mut i32 as *mut u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me why this is preferable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transmute is too powerful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not stated anywhere in these docs. I also am still not clear on exactly what that means, or why transmute's ability to do other things is a problem when someone wants to turn a &mut i32 to an &mut u32.

&mut *(ptr as *mut i32 as *mut u32) seems super symbol-soupy to me, whereas transmute::<&mut i32, &mut u32>(ptr) makes it very clear what's going on. Nothing in these docs give me any indication whatsoever as to why I should go with the soup, just that I should.

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't. It doesn't make it clear at all that it's a bitcast between &mut i32 and &mut u32. A semantic memcpy is happening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is clear to everyone that knows what transmute is. If you don't know what transmute is it may not be clear, just as it may not be clear what the casting implementation did if you didn't know what raw pointers were in Rust, or that &mut * does not actually involve a read of the underlying value and a copy into a new storage location.

/// ```
///
/// Turning an `&str` into an `&[u8]`:
/// ```
/// // this is not a good way to do this.
/// let slice = unsafe { mem::transmute::<&str, &[u8]>("Rust") };
/// assert_eq!(slice, [82, 117, 115, 116]);
/// // You could use `str::as_bytes`
/// let slice = "Rust".as_bytes();
/// assert_eq!(slice, [82, 117, 115, 116]);
/// // Or, just use a byte string, if you have control over the string
/// // literal
/// assert_eq!(b"Rust", [82, 117, 116, 116]);
/// ```
///
/// Turning a `Vec<&T>` into a `Vec<Option<&T>>`:
/// ```
/// let store = [0, 1, 2, 3];
/// let v_orig = store.iter().collect::<Vec<&i32>>();
/// // Using transmute: this is Undefined Behavior, and a bad idea
/// // However, it is no-copy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a period

/// let v_transmuted = mem::transmute::<Vec<&i32>, Vec<Option<&i32>>>(
/// v_orig.clone());
/// // This is the suggested, safe way
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here, and the line below

/// // It does copy the entire Vector, though, into a new array
/// let v_collected = v_orig.clone()
/// .into_iter()
/// .map(|r| Some(r))
/// .collect::<Vec<Option<&i32>>>();
/// // The no-copy, unsafe way, still using transmute, but not UB
/// // This is equivalent to the original, but safer, and reuses the
/// // same Vec internals. Therefore the new inner type must have the
/// // exact same size, and the same or lesser alignment, as the old
/// // type. The same caveats exist for this method as transmute, for
/// // the original inner type (`&i32`) to the converted inner type
/// // (`Option<&i32>`), so read the nomicon page linked above.
/// let v_from_raw = Vec::from_raw_parts(v_orig.as_mut_ptr(),
/// v_orig.len(),
/// v_orig.capacity());
/// mem::forget(v_orig);
/// ```
///
/// Implemententing `split_at_mut`:
/// ```
/// use std::{slice, mem};
/// // There are multiple ways to do this; and there are multiple problems
/// // with the following, transmute, way
/// fn split_at_mut_transmute<T>(slice: &mut [T], index: usize)
/// -> (&mut [T], &mut [T]) {
/// let len = slice.len();
/// assert!(index < len);
/// unsafe {
/// let slice2 = mem::transmute::<&mut [T], &mut [T]>(slice);
/// // first: transmute is not typesafe; all it checks is that T and
/// // U are of the same size. Second, right here, you have two
/// // mutable references pointing to the same memory
/// (&mut slice[0..index], &mut slice2[index..len])
/// }
/// }
/// // This gets rid of the typesafety problems; `&mut *` will *only* give
/// // you an &mut T from an &mut T or *mut T
/// fn split_at_mut_casts<T>(slice: &mut [T], index: usize)
/// -> (&mut [T], &mut [T]) {
/// let len = slice.len();
/// assert!(index < len);
/// unsafe {
/// let slice2 = &mut *(slice as *mut [T]);
/// // however, you still have two mutable references pointing to
/// // the same memory
/// (&mut slice[0..index], &mut slice2[index..len])
/// }
/// }
/// // This is how the standard library does it. This is the best method, if
/// // you need to do something like this
/// fn split_at_stdlib<T>(slice: &mut [T], index: usize)
/// -> (&mut [T], &mut [T]) {
/// let len = self.len();
/// let ptr = self.as_mut_ptr();
/// unsafe {
/// assert!(mid <= len);
/// // This now has three mutable references pointing at the same
/// // memory. `slice`, the rvalue ret.0, and the rvalue ret.1.
/// // However, `slice` is never used after `let ptr = ...`, and so
/// // one can treat it as "dead", and therefore, you only have two
/// // real mutable slices.
/// (slice::from_raw_parts_mut(ptr, mid),
/// slice::from_raw_parts_mut(ptr.offset(mid as isize), len - mid))
/// }
/// }
/// ```
///
/// # Examples
///
/// There are valid uses of transmute, though they are few and far between.
///
/// Getting the bitpattern of a floating point type:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a valid use case and not up above under the "just type pun with raw pointers" examples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the above aren't type punning with raw pointers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is *(&1.0 as *const f32 as *const u32) a Bad Thing, or just an alternative to this?

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a Bad Thing. What I'm talking about isn't type punning through raw pointers, it's doing:

let x: u32 = std::mem::transmute::<i32, u32>(-1);

instead of

let x: u32 = -1i32 as u32;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we're not dealing with pointers here, but again, the docs do not explain anywhere as to why that's as important of a distinction as it apparently is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transmute is incredibly unsafe. There are a vast number of ways to
cause undefined behavior with this function. transmute should be
the absolute last resort.

I'm not sure what else you want. I also link to the nomicon page on it. transmute is generally agreed to be a Bad Idea. These aren't meant to be the be-all end-all docs; these are meant to stop you in your tracks, and show you what you should be using instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all they're meant to do is stop you in your tracks we don't need 400 lines of examples. An <h1><blink><marquee>DON'T USE THIS UNLESS YOU KNOW WHAT YOU'RE DOING</marquee></blink></h1> would serve that purpose.

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfackler See #34609 (comment). These aren't meant to say "Don't use transmute ever". They're meant to say "transmute is dangerous, almost never use it".

/// ```
/// let bitpattern = std::mem::transmute::<f32, u32>(1.0);
/// assert_eq!(bitpattern, 0x3F800000);
/// ```
///
/// Turning a pointer into a function pointer (this isn't guaranteed to
/// work in Rust, although, for example, Linux does make this guarantee):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not guaranteed, why are we recommending it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's useful on some platforms (like POSIX and Windows :P)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should describe it correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you mean?

/// ```
/// fn foo() -> i32 {
/// 0
/// }
/// let pointer = foo as *const ();
/// let function = std::mem::transmute::<*const (), fn() -> i32>(pointer)
/// assert_eq!(function(), 0);
/// ```
///
/// Extending a lifetime, or shortening an invariant an invariant lifetime;
/// this is advanced, very unsafe rust:
/// ```
/// use std::mem;
///
/// let array: &[u8] = unsafe { mem::transmute("Rust") };
/// assert_eq!(array, [82, 117, 115, 116]);
/// struct R<'a>(&'a i32);
/// unsafe fn extend_lifetime<'b>(r: R<'b>) -> R<'static> {
/// mem::transmute::<R<'b>, R<'static>>(ptr);
/// }
///
/// unsafe fn shorten_invariant<'b, 'c>(r: &'b mut R<'static>)
/// -> &'b R<'c> {
/// let ref_to_original =
/// mem::transmute::<&'b mut R<'static>, &'b mut R<'c>>(
/// ref_to_extended);
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn transmute<T, U>(e: T) -> U;

/// Gives the address for the return value of the enclosing function.
///
/// Using this intrinsic in a function that does not use an out pointer
/// will trigger a compiler error.
pub fn return_address() -> *const u8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this change intentional?

Copy link
Contributor Author

@strega-nil strega-nil Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ermm... no, sorry


/// Returns `true` if the actual type given as `T` requires drop
/// glue; returns `false` if the actual type provided for `T`
/// implements `Copy`.
Expand Down