Add functions to safely transmute float to int #39271

est31 · 2017-01-24T17:23:36Z

The safe subset of Rust tries to be as powerful as possible. While it is very powerful already, it's currently impossible to safely transmute integers to floats. While crates exist that provide a safe interface, most prominently the ieee754 crate (which also inspired the naming of the added functions), they themselves only use the unsafe mem::transmute function to accomplish this task.

Also, including an entire crate for just two lines of unsafe code seems quite wasteful.

That's why this PR adds functions to safely transmute integers to floats and vice versa, currently gated by the newly added float_bits_conv feature.

The functions added are no niche case. Not only ieee754 currently implements float to int transmutation via unsafe code, but also the very popular byteorder crate. This functionality of byteorder is in turn used by higher-level crates. I only give two examples out of many: cbor and bincode.

One alternative would be to manually use functions like pow or multiplication by 1 to get a similar result, but they only work in the int -> float direction, and are not bit exact, and much slower (also, most likely the optimizer will never optimize it to a transmute because the conversion is not bit exact while the transmute is).
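To make the difference between a bit-level conversion and a numeric cast concrete, here is a minimal sketch. It assumes the functions in the form they later took in the standard library (`f32::to_bits` and `f32::from_bits`, with `from_bits` returning a plain `f32`); the return type was still under discussion in this thread, and the helper names `float_to_bits` and `bits_to_float` are illustrative only.

```rust
// Reinterpret the bits of a float as an integer (no numeric conversion).
fn float_to_bits(x: f32) -> u32 {
    x.to_bits()
}

// Reinterpret an integer bit pattern as a float.
fn bits_to_float(bits: u32) -> f32 {
    f32::from_bits(bits)
}

fn main() {
    let x = 1.0f32;
    // 1.0f32 is sign 0, biased exponent 127, mantissa 0.
    assert_eq!(float_to_bits(x), 0x3F80_0000);
    // A numeric cast yields the integer value instead of the bit pattern.
    assert_eq!(x as u32, 1);
    // The round trip through the bit pattern is exact.
    assert_eq!(bits_to_float(float_to_bits(x)), x);
}
```

Both helpers are thin wrappers, which is the point: no arithmetic is involved, so the conversion is bit exact in both directions for ordinary values.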

Tracking issue: #40470

rust-highfive · 2017-01-24T17:23:49Z

r? @BurntSushi

(rust_highfive has picked a reviewer for you, use r? to override)

BurntSushi · 2017-01-24T17:29:09Z

@alexcrichton I think we briefly talked about this at one point and you mentioned there were possible safety issues related to NaN? Could you elaborate on that here?

alexcrichton · 2017-01-24T17:40:48Z

My only worry here would be related to "signaling nans" where some bit patterns of nan may cause hardware traps in some floating point operations (I believe). IIRC we don't generate them normally, so I don't think we've brushed up against them yet (I may be wrong).

That being said I think it has to do with weird processor mode features so I'd be fine having a feature like this.

retep998 · 2017-01-24T18:54:03Z

Signaling NaNs are only a concern when going from int to float. So I'd be totally down with safe bitcasting from float to int, but I'm totally opposed to safe bitcasting from int to float.

BurntSushi · 2017-01-24T18:55:44Z

@retep998 Can you explain the issue in more detail for people unfamiliar with signaling NaNs and under what circumstances they violate safety?

retep998 · 2017-01-24T19:17:46Z

A signalling NaN is a floating point number with a certain bit pattern that causes the CPU to trap when attempting to perform any operations with it, which means it'll raise an interrupt or signal or exception (similar to what division by zero does). As a result, LLVM defines operations on signalling NaNs to be undefined behavior, so if you can create a signalling NaN in safe code then you can create undefined behavior which definitely violates Rust's rules. Being able to bitcast an int to a float allows you to get any bit pattern of float you desire, including signalling NaNs, hence it should not be allowed in safe code.

est31 · 2017-01-24T19:19:54Z

Hmm, what about making from_bits panic, or return Err(()) if it finds an sNaN?
Apparently it's fairly easy to check for them: https://www.doc.ic.ac.uk/~eedwards/compsys/float/nan.html
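A rough sketch of what such a check could look like on the integer side, before any float is created. It assumes the IEEE 754-2008 convention used by x86 and ARM, where the most significant mantissa bit is the "quiet" bit; other platforms may encode this differently. The function name is hypothetical and not part of this PR.

```rust
// Hypothetical helper: classify a 32-bit pattern without touching a float register.
// Assumes the IEEE 754-2008 encoding (quiet bit = top mantissa bit, set for quiet NaNs).
fn is_signaling_nan_bits(bits: u32) -> bool {
    const EXP_MASK: u32 = 0x7F80_0000; // all exponent bits
    const MANT_MASK: u32 = 0x007F_FFFF; // all mantissa bits
    const QUIET_BIT: u32 = 0x0040_0000; // top mantissa bit

    // A NaN has an all-ones exponent and a nonzero mantissa.
    let is_nan = (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0;
    // It is signaling when the quiet bit is clear.
    is_nan && (bits & QUIET_BIT) == 0
}

fn main() {
    assert!(is_signaling_nan_bits(0x7F80_0001)); // signaling NaN
    assert!(!is_signaling_nan_bits(0x7FC0_0000)); // quiet NaN
    assert!(!is_signaling_nan_bits(0x7F80_0000)); // +infinity
    assert!(!is_signaling_nan_bits(0x3F80_0000)); // 1.0
}
```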

petrochenkov · 2017-01-24T19:21:00Z

@BurntSushi
Signaling NaNs can be obtained by transmuting a piece of memory, but they are never produced by the CPU itself (on x86/ARM at least).
When a signaling NaN is used in an arithmetic operation, a CPU exception named "Floating Point Invalid Operation" is raised (if it's enabled).
(This normally results in jumping to the exception (aka signal) handler.)
Invalid Operation exceptions can be enabled or disabled by the MXCSR.IM bit on x86 (with SSE), the x87_FPU_Control_Word.IM bit (with x87), or FPCR.IOE on ARM.
It seems like these exceptions are all usually disabled by default in practice; however, all these registers are accessible from "user mode".

In principle code like this

#![feature(asm)]

fn main() { unsafe {
    // Enable "Invalid Operation" exceptions (SSE is used)
    // MXCSR.IM = 0
    let mut mxcsr = 0u32;
    asm!("stmxcsr $0" : "=*m" (&mxcsr) ::: "volatile");
    println!("default  mxcsr {:#016b}", mxcsr);
    mxcsr &= !(1 << 7);
    println!("modified mxcsr {:#016b}", mxcsr);
    asm!("ldmxcsr $0" :: "m" (mxcsr) :: "volatile");

    let a: f64 = std::mem::transmute(0xFFF0000000000001u64);
    let b = a + 1.0;
    println!("b {}", b);
}}

should cause this "Invalid Operation" exception ~~, but something seems to be subtly wrong and the exception doesn't happen~~. (I never really worked with this in practice.)

under what circumstances they violate safety?

If segfault is bad enough to be considered unsafety, then sigfpe is bad too? ... I guess?

BurntSushi · 2017-01-24T19:26:07Z

Well, I don't think segfaults directly imply unsafety, they are just highly correlated, no? For example, if using a signaling NaN wasn't undefined behavior and always caused your program to abort, then I wouldn't consider that a violation of safety. (Similar to, say, a SIGILL.) But @retep998 says signaling NaNs can cause undefined behavior, and that's certainly unsafe. I can't find a reference for that though. Relevant links:

BurntSushi · 2017-01-24T19:26:59Z

Hmm, what about making from_bits panic, or return Err(()) if it finds a sNAN?

Making it panic seems like a fine solution to me (if signaling NaN's are indeed unsafe).

est31 · 2017-01-24T19:52:38Z

I think returning Err is better because the most likely use case is in deserialisation code, and there panicking on well-crafted input is not a very good thing to do. You can still unwrap or expect if you want to panic in such a case, but if the code panicked instead of returning an Err, you'd have to duplicate checks.

frewsxcv · 2017-01-24T20:26:03Z

src/libstd/f32.rs

+    ///
+    /// Note that this function is distinct from casting.
+    ///
+    /// ```


Nit: The examples should be under an # Examples heading

alexcrichton · 2017-01-25T00:40:09Z

Looks good to me!

petrochenkov · 2017-02-02T21:33:42Z

What are the primary uses of floats transmuted from integers, by the way?

I can imagine two scenarios:

- Serialization/Deserialization. In this case signaling NaNs should probably be preserved like anything else.
- Creation of a float from a bit-pattern, for example a NaN with custom payload/syndrome. In this case signaling NaNs should probably be preserved as well.
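Both scenarios can be sketched with the bit-conversion functions in their later standard-library form (`to_bits` / `from_bits`). The helper names and the big-endian wire format are assumptions made for illustration. The sketch uses a quiet NaN payload, since whether a signaling pattern survives on every platform is exactly the open question here.

```rust
// Hypothetical helpers for a big-endian wire format.
fn serialize_f32(x: f32) -> [u8; 4] {
    x.to_bits().to_be_bytes()
}

fn deserialize_f32(bytes: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_be_bytes(bytes))
}

// Build a quiet NaN carrying `payload` in the low 22 mantissa bits.
fn nan_with_payload(payload: u32) -> f32 {
    const QUIET_NAN: u32 = 0x7FC0_0000;
    const PAYLOAD_MASK: u32 = 0x003F_FFFF;
    f32::from_bits(QUIET_NAN | (payload & PAYLOAD_MASK))
}

fn main() {
    // Scenario 1: serialization round trip of an ordinary value.
    assert_eq!(serialize_f32(1.0), [0x3F, 0x80, 0x00, 0x00]);
    assert_eq!(deserialize_f32(serialize_f32(-2.5)), -2.5);

    // Scenario 2: a NaN built from a custom bit pattern is still a NaN.
    let tagged = nan_with_payload(0x1234);
    assert!(tagged.is_nan());
    println!("{:02x?}", serialize_f32(tagged));
}
```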

BurntSushi · 2017-02-03T12:38:35Z

To restate my concerns here:

- I haven't yet seen something concrete that says signalling NaNs are UB. This seems like a critical issue, since it determines whether these functions are safe or not. I think we should get a concrete answer on that before moving forward. (Even if the concrete answer is "there is no concrete answer.")
- If it is UB and we return an error, then I don't like returning () as the error type because it is inflexible. I think it should at least be an opaque new type that impls std::error::Error.
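For illustration, an opaque error type along those lines might look like the sketch below. Everything here is hypothetical: the names `FromBitsError` and `checked_from_bits` are invented for this example, and the signaling NaN test assumes the IEEE 754-2008 encoding (top mantissa bit clear means signaling).

```rust
use std::error::Error;
use std::fmt;

// Hypothetical opaque error: the private field prevents construction elsewhere
// and leaves room to add detail later without breaking callers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FromBitsError {
    _private: (),
}

impl fmt::Display for FromBitsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("bit pattern is a signaling NaN")
    }
}

impl Error for FromBitsError {}

// Hypothetical checked conversion: rejects signaling NaNs, accepts everything else.
pub fn checked_from_bits(bits: u32) -> Result<f32, FromBitsError> {
    const EXP_MASK: u32 = 0x7F80_0000;
    const MANT_MASK: u32 = 0x007F_FFFF;
    const QUIET_BIT: u32 = 0x0040_0000;

    let is_nan = (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0;
    if is_nan && (bits & QUIET_BIT) == 0 {
        Err(FromBitsError { _private: () })
    } else {
        Ok(f32::from_bits(bits))
    }
}

fn main() {
    assert_eq!(checked_from_bits(0x3F80_0000), Ok(1.0));
    assert!(checked_from_bits(0x7F80_0001).is_err());
    // Callers that prefer a panic can still opt in explicitly.
    let x = checked_from_bits(0x4000_0000).expect("not a signaling NaN");
    assert_eq!(x, 2.0);
}
```

Deserialization code can propagate the error with `?`, while callers who know their input is trusted can `unwrap` or `expect`.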

valarauca · 2017-02-03T13:52:21Z

Duplicating my comment from reddit b/c I don't expect people to read reddit

C11 standard says the behavior of hard/soft NaN is defined by ISO/IEC 60559. [1]

The ISO/IEC 60559 states [2]

Whether a signaling NaN input causes a domain error is implementation-defined.

So I believe panicking on +NaN, -NaN, +Inf, and -Inf is fine behavior. It avoids a lot of corner cases and makes the ecosystem generally safer.

[1] ISO/IEC 9899/201X Standard: Link Section: 5.2.4.2.2 Bullet Point 3, annotation 22

[2] ISO/IEC 60559 Standard: Link Section 12 first-ish paragraph.

BurntSushi · 2017-02-03T13:54:31Z

@valarauca Can Rust diverge from C11 and make it well defined? (Is that even a coherent question?)

valarauca · 2017-02-03T13:55:22Z

@BurntSushi I think Rust diverging from C to make undefined behavior well defined is one of the core missions of Rust (?)

My recommendation is panicking is fine.

BurntSushi · 2017-02-03T13:58:46Z

Sure, I just don't know the details between "make it well defined" and "make sure this is consistent with how we're using LLVM."

petrochenkov · 2017-02-03T14:47:24Z

I've updated my signaling NaN example in #39271 (comment). Now it outputs

default  mxcsr 0b01111110000000
modified mxcsr 0b01111100000000
Floating point exception (core dumped)

on Linux.

A possible solution:

Make from_bits safe and make it return f32/f64 (always success).
Instead make enabling FP exceptions unsafe.
- On platforms with FP exceptions disabled by default everything is good: enabling them requires either inline asm (unsafe) or calling some functions through FFI (unsafe).
- On platforms with FP exceptions enabled by default they can be disabled at program startup. It looks like there are no such platforms among ones currently supported by Rust. Citing cppreference:
  
  the state of floating-point trapping facility at the time of program startup ... is false on most modern systems. An exception would be a DEC Alpha program, where it is true if compiled without -ieee.

If FP exceptions are disabled, then signaling NaNs are converted into quiet NaNs by arithmetic operations, so this is completely safe. The behavior is controlled by IEEE 754-2008, not implementations.
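A small sketch of the default case (FP exceptions left disabled), using the later standard-library `f64::from_bits` in place of the transmute. It only asserts that the result is a NaN and prints the bits for inspection; the exact result bits are deliberately not asserted. `add_one` is a hypothetical helper, and `black_box` is used only to discourage compile-time folding.

```rust
use std::hint::black_box;

// Hypothetical helper: perform the addition at run time.
fn add_one(x: f64) -> f64 {
    // black_box discourages the compiler from folding the addition away.
    black_box(x) + 1.0
}

fn main() {
    // Same bit pattern as in the earlier example: exponent all ones, quiet bit clear.
    let snan = f64::from_bits(0xFFF0_0000_0000_0001);
    let b = add_one(snan);

    // With FP exceptions disabled, the operation completes and yields a NaN.
    assert!(b.is_nan());
    println!("input  bits {:#018x}", snan.to_bits());
    println!("result bits {:#018x}", b.to_bits());
}
```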

valarauca · 2017-02-03T15:19:14Z

I just don't know the details between "make it well defined" and "make sure this is consistent with how we're using LLVM."

Whatever is done will have to be encoded into the Rust standard library, and fixing various bugs will be a continuous effort.

LLVM doesn't handle NaN consistently. The IR intrinsics just end up stating "whatever the platform would do" in more or less those words. Rust will have to write its own implementation of this, and it will change based on the platform. This is what C/C++ do in their standard libraries. They've been trying to figure out the idiomatic way to do this for nearly a decade.

The real issue is platforms. On GPUs, ARMv7, ARMv8, and NEON, subnormals are flushed to zero (which isn't IEEE 754 or ISO/IEC 60559 conformant), so there really isn't any NaN handling to do: if you copy the value between registers you've already lost it. x86/x87/x64 does something different, and apparently so does PPC (I wouldn't be surprised if MIPS and POWER8 did too).

The easiest way to do this is to have something like:

fn is_nan(&self) -> bool {
    // NaN is the only value that compares unequal to itself
    *self != *self
}

This is what IEEE and ISO/IEC suggest.

NAN == NAN is always false, and x == NAN is false for every x. Since x != NAN is true for every x, the useful test is x != x, which is only ever true for NAN.
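The comparison rules can be checked directly. In this minimal sketch (the helper name is hypothetical), comparing any value against a NaN constant with `!=` is true for every value, so the test that isolates NaN is comparing a value with itself. The standard library's `is_nan` gives the same answer.

```rust
// Hypothetical helper: NaN is the only value that compares unequal to itself.
fn is_nan_by_self_compare(x: f32) -> bool {
    x != x
}

fn main() {
    let nan = f32::NAN;

    // NaN is never equal to anything, including itself.
    assert!(!(nan == nan));
    // `x != NAN` is true for every x, so it cannot serve as a NaN test.
    assert!(1.0f32 != nan);

    // The self-comparison singles out NaN.
    assert!(is_nan_by_self_compare(nan));
    assert!(!is_nan_by_self_compare(1.0));
    assert!(!is_nan_by_self_compare(f32::INFINITY));
    assert_eq!(is_nan_by_self_compare(nan), nan.is_nan());
}
```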

This doesn't 100% work. If a NaN value is copied between registers on a call (on ARMv7, ARMv8, NEON, GPUs) it'll become zero. If -ffast-math is used via LLVM, this calculation may be optimized out. Also, FTZ or RTZ modes on x86/x87/x64 will behave like ARMv7, etc.

I guess the proper way is to catch both quiet and signaling NaNs. IEEE and ISO/IEC encode quiet/signaling NaNs the same way, but I don't know whether all platforms do. @petrochenkov's example seems to imply that x86/x87/x64 treats anything with a 0xFF or 0x7FF exponent as a NaN, which is non-standard...

So a quick (untested) example:

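use std::mem;
use std::ptr::read_volatile;

// Bit pattern of +infinity: exponent all ones, mantissa zero.
const F32_NAN: u32 = 0x7F800000u32;
// Mask that clears the sign bit.
const F32_NAN_MASK: u32 = 0x7FFFFFFFu32;
// Matches any NaN, quiet or signaling: exponent all ones, mantissa nonzero.
unsafe fn is_signal_nan_f32(x: *const f32) -> bool {
        let ptr: *const u32 = mem::transmute(x);
        let val: u32 = read_volatile(ptr);
        // Strictly greater than the infinity pattern means the mantissa is nonzero.
        (val & F32_NAN_MASK) > F32_NAN
}
const F64_NAN: u64 = 0x7FF0000000000000u64;
const F64_NAN_MASK: u64 = 0x7FFFFFFFFFFFFFFFu64;
unsafe fn is_signal_nan_f64(x: *const f64) -> bool {
        let ptr: *const u64 = mem::transmute(x);
        let val: u64 = read_volatile(ptr);
        (val & F64_NAN_MASK) > F64_NAN
}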

The type cast/volatile read is to avoid the compiler/platform doing any magic, even in -ffast-math mode. The bitmask in question will end up being platform specific (most platforms are IEEE 754 conformant, so this should work).

This of course can still fail on some platforms: if the f32/f64 is pushed to the stack from a register in order to do the volatile read, the platform could perform the magic then. Even if a direct cast is used

unsafe fn is_nan(x: f32) -> bool {
    let val: u32 = mem::transmute(x);
    ....
}

one can hit register-to-register copies and incur the same magic.

There is no idiomatic way to do this.

I think the best approach is for the stdlib to implement a solid attempt at NaN checking, and this function will grow and get more specialized as bug reports come in. Even a well-tested implementation will likely have a lot of holes in it, as different processor models/modes will have different semantics.

petrochenkov · 2017-02-03T15:23:50Z

[Comment is moved to https://github.com/rust-lang/rust/pull/39271#issuecomment-277264729]

alexcrichton · 2017-02-14T00:41:31Z

Discussed during libs triage today; our conclusions were:

- The current API seems fine, but we should return a concrete (and opaque) error type instead of ()
- Before stabilizing we should hammer out signaling NaN semantics, but that doesn't need to block this PR itself

tshepang · 2017-03-13T02:00:23Z

src/libstd/f32.rs

+    ///
+    /// ```
+    /// #![feature(float_bits_conv)]
+    /// assert!((1f32).to_bits() != 1f32 as u32); // to_bits() is not casting!


you can use assert_ne here

est31 · 2017-04-18T00:57:25Z

Tidy is fixed now. Had to rebase. r? @BurntSushi

petrochenkov · 2017-04-18T06:59:35Z

@bors r=BurntSushi

bors · 2017-04-18T06:59:36Z

📌 Commit 0c14815 has been approved by BurntSushi

bors · 2017-04-18T06:59:44Z

⌛ Testing commit 0c14815 with merge 1ceb5ad...

bors · 2017-04-18T07:21:05Z

💔 Test failed - status-appveyor

arielb1 · 2017-04-18T09:00:07Z

Spurious failures on AppVeyor due to ar.exe errors #40546

@bors retry

bors · 2017-04-18T09:00:16Z

⌛ Testing commit 0c14815 with merge f55afe5...

bors · 2017-04-18T09:00:24Z

⌛ Testing commit 0c14815 with merge 0fdf2eb...

bors · 2017-04-18T10:24:07Z

💔 Test failed - status-travis

TimNN · 2017-04-18T11:23:42Z

@bors retry

Spurious segfault in tests for musl #38618 (segfault during musl test)

bors · 2017-04-18T11:23:50Z

⌛ Testing commit 0c14815 with merge c398efc...

bors · 2017-04-18T13:47:42Z

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing c398efc to master...

@BurntSushi

Stabilize float_bits_conv for Rust 1.21 Stabilizes the `float_bits_conv` lib feature for the 1.20 release of Rust. I've initially implemented the feature in #39271 and later made PR #43025 to output quiet NaNs even on platforms with different encodings, which seems to have been the only unresolved issue of the API. Due to PR #43025 being only applied to master this stabilisation can't happen for Rust 1.19 through the usual "stabilisation on beta" system that is being done for library APIs. r? @BurntSushi closes #40470.

rust-highfive assigned BurntSushi Jan 24, 2017

BurntSushi added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jan 24, 2017

est31 force-pushed the add_float_bits branch from 6bf2273 to dc85965 Compare January 24, 2017 19:54

frewsxcv reviewed Jan 24, 2017

est31 mentioned this pull request Feb 11, 2017

f32, f64 missing from_le, from_be, to_le, to_be, swap_bytes implementation #39742

Closed

BurntSushi mentioned this pull request Mar 1, 2017

is read_f32/read_f64 unsafe? BurntSushi/byteorder#71

Closed

tshepang reviewed Mar 13, 2017

est31 mentioned this pull request Mar 13, 2017

Tracking issue for float_bits_conv feature #40470

Closed

Add float_bits_conv to unstable book

0c14815

est31 force-pushed the add_float_bits branch from 35b07d4 to 0c14815 Compare April 18, 2017 00:56

bors merged commit 0c14815 into rust-lang:master Apr 18, 2017

dongweigogo mentioned this pull request Apr 27, 2017

math: add Round golang/go#20100

Closed

brson mentioned this pull request May 6, 2017

Rust 1.18 regression - ieee754-0.2.1, from_bits #41793

Closed

arielb1 added the relnotes Marks issues that should be documented in the release notes of the next release. label May 18, 2017

NikVolf mentioned this pull request May 31, 2017

Avoid unsafe transmuting ints to floats paritytech/parity-wasm#18

Closed

rphmeier mentioned this pull request May 31, 2017

make unsafe usages more safe paritytech/parity-wasm#19

Merged

est31 mentioned this pull request Jul 4, 2017

Stabilize float_bits_conv for Rust 1.20 #43055

Merged

Enet4 mentioned this pull request Oct 8, 2017

Non-unsafe functions for PODs? nabijaczleweli/safe-transmute-rs#5

Closed

jrmuizel mentioned this pull request Nov 11, 2017

Deserializer masks out signaling nans servo/webrender#2027

Closed

matklad mentioned this pull request Nov 11, 2017

RFC: ArrayBuffer Views neon-bindings/rfcs#5

Merged

est31 mentioned this pull request Dec 22, 2017

Expose float from_bits and to_bits in libcore. #46931

Merged
