Miri float ops are less precise than what the standard library expects #4208

RalfJung · 2025-02-25T17:09:56Z

Tons of standard library float tests for ln, exp, and trigonometric functions fail with Miri's float non-determinism since they expect more precise results than what Miri delivers. I have temporarily disabled float non-determinism while we decide what to do.

A bunch of cases actually expect precise answers for certain cases, e.g.:
https://github.com/rust-lang/rust/blob/29166cd61711776e6d43239d6d18a0eafe6515b1/library/std/tests/floats/f64.rs#L468
https://github.com/rust-lang/rust/blob/29166cd61711776e6d43239d6d18a0eafe6515b1/library/std/tests/floats/f64.rs#L482-L483

Furthermore, there are quite a few cases where the standard library expects more precision than what Miri provides. assert_approx_eq! checks for a divergence of up to 1e-6, which is less than 10 ULP on a value of 1.0 represented as f32. Miri currently adds 16 ULP of error (on top of whatever error the host libm we use to implement these function has). Some of these tests chain multiple operations, such as 60.0f64.sinh().asinh(), so the error accumulates; some of these tests use values sufficiently larger than 1 that 16 ULP end up even exceeding 1e-5. Furthermore, some doctests actually check that the precision is within f32::EPSILON, which is 1 ULP on a value of 1.0; that is obviously never going to fly if Miri adds any kind of non-deterministic error.

I am open to reducing the error Miri applies, and happy for suggestions of what would make a good error scale. (I'd prefer it to be a power of 2, that makes the logic a lot simpler in Miri. ;) However, I also think some of these tests should change, specifically all tests that require either full equality or just 1 ULP error.

@tgross35 @Jules-Bertholet thoughts, opinions?

The text was updated successfully, but these errors were encountered:

Jules-Bertholet · 2025-02-25T17:24:13Z

IEEE 754 § 9.2.1 defines several special cases, including 0.0f64.exp(), where the result must be exact.

RalfJung · 2025-02-25T17:26:10Z

Does it also define 5.0f64.exp2()?

Jules-Bertholet · 2025-02-25T17:27:10Z

Does it also define 5.0f64.exp2()?

No.

RalfJung · 2025-02-25T17:28:45Z

So some of the assert_eq! are still making assumptions about unspecified precision, then.

Jules-Bertholet · 2025-02-25T17:49:38Z

Actually, scratch that. §9.2 says “a conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain”, which implies that technically, to be conforming an implementation can’t have any error ever. Of course, actual implementations fail this, which means they technically aren’t conforming.

I think that, beyond the special values explicitly called out in §9.2.1, some of this stuff (like 5.0f64.exp2() being exact) should be treated as a guarantee also, as no sane implementation is likely to break it.

RalfJung · 2025-02-25T18:00:46Z

§9.2 covers which operations?

Jules-Bertholet · 2025-02-25T18:04:43Z

All the optional mathematical operations. (The required math ops have a similar proviso)

RalfJung · 2025-02-25T18:06:59Z

Ah, lol. Does anyone actually do this right? Even @eduardosm's softfloat implementations had 1 ULP of error.

Well, the goal of this non-determinism is to expose bugs in code that incorrectly relies on precision assumptions. I'll happily take the advice of a domain expert for what is the best strategy for that. :)

Jules-Bertholet · 2025-02-25T18:12:12Z

The C standard is explicitly more forgiving, so maybe we should rely on it instead?

(Also, I’m not a domain expert, I just read the standard!)

tgross35 · 2025-02-26T02:59:41Z

Since we bind the C functions, we wind up following the same rules. Aiui this is extremely forgiving, sin could be a linear interpolation from a four-entry LUT. In practice, most system libm implementations are likely to be within ~2ULP for most ops, with a few exceptions (musl for example https://wiki.musl-libc.org/mathematical-library).

Representing this well in Miri is tricky because, while it's technically allowed, an implementation that can't produce $e^0=1$ would not be a very good one. For the tests in std that are failing, it feels like maybe we don't want the non-determinism? Since it seems like the goal would be to test the platform's real implementation, as opposed to ensuring that our tests pass on worst case platforms. I think the non-determinism makes more sense for downstream users that aren't just testing the routines themselves.

@beetrees may also have some ideas here.

Ah, lol. Does anyone actually do this right? Even @eduardosm's softfloat implementations had 1 ULP of error.

There are some implementations that provide the cr_* (correctly rounded) version of various routines, I am working on porting some of these from CORE-MATH to our libm. This isn't very common though.

RalfJung · 2025-02-26T06:44:07Z

FWIW results of 0 remain precise with the current scheme, since we add a relative error not an absolute one. (No idea what you meant by e^x = 0 though.)

RalfJung · 2025-02-26T06:44:51Z

Since it seems like the goal would be to test the platform's real implementation, as opposed to ensuring that our tests pass on worst case platforms.

Note that some of these are doctests and hence can be read as guarantees that the implementation will be within EPSILON of the expected answer.

tgross35 · 2025-02-26T07:31:26Z

FWIW results of 0 remain precise with the current scheme, since we add a relative error not an absolute one. (No idea what you meant by e^x = 0 though.)

Whoops, that should have been $e^0=1$.

Note that some of these are doctests and hence can be read as guarantees that the implementation will be within EPSILON of the expected answer.

I don't really like that we do this either, it builds the wrong intuition that EPSILON is always a reasonable value to compare float operations to. Assuming this comes from blending the definition of machine epsilon with the mathematical usage of $\epsilon$. It is nice that it looks clean, but it would probably be a good idea to change these tests anyway (being 1ULP > EPSILON for |x| > 2.0).

Is it possible to enable the randomness for most things but exclude tests in coretests / std/tests that are intended to validate system float ops?

RalfJung · 2025-02-26T08:47:46Z

Is it possible to enable the randomness for most things but exclude tests in coretests / std/tests that are intended to validate system float ops?

That's tricky. We could add a flag to disable that and just set that flag for the entire "run std tests in Miri" run, though.

But I want to first experiment with reducing the error Miri adds, e.g. to 4 ULP.

RalfJung · 2025-02-27T14:20:42Z

The C standard is explicitly more forgiving, so maybe we should rely on it instead?

Oh, that actually has some quite trivial points in it:

"For each single-argument function f in <math.h> whose mathematical counterpart is symmetric
(even), f(-x) is f(x) for all rounding modes and for all x in the (valid) domain of the function. For
each single-argument function f in <math.h> whose mathematical counterpart is antisymmetric
(odd), f(-x) is -f(x) for the ISO/IEC 60559 rounding modes roundTiesToEven, roundTiesToAway,
and roundTowardZero, and for all x in the (valid) domain of the function."

I think LLVM actually violates this, since the result is effectively non-deterministic (if LLVM evaluates libm calls at compile-time), and therefore cannot satisfy that requirement.

But then F.10.1-3 has a nice list of guarantees that are being made. I would add to that that functions with a clearly defined output range (such as [-1, 1] for sin, cos) must produce output in that range. And I'd suggest we reduce the error down to 4 ULP to avoid breaking so many of the std tests. Just having any non-determinism here should already help, I think.

@LorrensP-2158466 do you want to give this a try? Unfortunately I had to disable the code you added in your first PR as it made a bunch of test fail; turns out this is more complicated than I thought. I would say in a first PR, deal just with the ones defined in src/tools/miri/src/intrinsics/mod.rs.

LorrensP-2158466 · 2025-02-27T17:34:35Z

Yes, I'd like to give it a try, and just to be sure you want me to:

reduce the error to 4 ULP and change the approx_eq! macro to account for that
clamp output values to their defined ranges according to the standard

Am I missing something in this thread?

RalfJung · 2025-02-27T17:36:27Z

Yes, please also make sure the guarantees listed in F.10.1-3 of the C standard are met.

LorrensP-2158466 · 2025-02-27T17:38:09Z

Alright, will do!

tgross35 · 2025-02-27T18:34:53Z

My read of the standard was that F.10 paragraph 16 walks back the statement in paragraph 3:

Recommended practice

16 ISO/IEC 60559 specifies correct rounding for the operations in the F.3 table of operations recommended by ISO/IEC 60559, and thereby preserves useful mathematical properties such as symmetry, monotonicity, and periodicity. The corresponding functions with (potentially) reserved cr_-prefixed names (7.33.8) do the same. The C functions in the table, however, are not required to be correctly rounded, but implementations should still preserve as many of these useful mathematical properties as possible.

The "should" wording here seems to imply that the implementations are not required to uphold symmetry/monotonicity/periodicy. Likely not a problem, in any case.

RalfJung · 2025-02-27T18:47:39Z

What the heck, why would you have paragraph 16 directly contradict what paragraph 3 says?!?

Anyway, we'll just ignore both of them then. 🤷

Jules-Bertholet · 2025-02-27T18:57:19Z

I don’t see a contradiction. Paragraph 3 specifies a subset of IEEE 754 requirements that C also requires, paragraph 16 notes that this subset is a proper subset.

LorrensP-2158466 · 2025-02-27T19:36:30Z

How do you think I should best implement this? I'm currently looking at this piece of code:

  | "sinf32"
  | "cosf32"
  | "expf32"
  | "exp2f32"
  | "logf32"
  | "log10f32"
  | "log2f32"
  => {
      let [f] = check_intrinsic_arg_count(args)?;
      let f = this.read_scalar(f)?.to_f32()?;
      // Using host floats (but it's fine, these operations do not have
      // guaranteed precision).
      let host = f.to_host();
      let res = match intrinsic_name {
          "sinf32" => host.sin(),
          "cosf32" => host.cos(),
          "expf32" => host.exp(),
          "exp2f32" => host.exp2(),
          "logf32" => host.ln(),
          "log10f32" => host.log10(),
          "log2f32" => host.log2(),
          _ => bug!(),
      };
      let res = res.to_soft();
      // Apply a relative error of 16ULP to introduce some non-determinism
      // simulating imprecise implementations and optimizations.
      // FIXME: temporarily disabled as it breaks std tests.
      // let res = apply_random_float_error_ulp(
      //     this,
      //     res,
      //     4, // log2(16)
      // );
      let res = this.adjust_nan(res, &[f]);
      this.write_scalar(res, dest)?;
  }

For example sin and cos:
sin(x) and cos(x) are in [-1, 1] this can be done with .minimum(-1).maximum(1) ( or clamp(-1, 1)) which also propagates any NaN's and is thus correct. But sin(±0) = 0 and cos(±0) = 1, I'm not quite sure what the best way is to account for this, because 0 means 0 and not 4 ULP around 0. And that is not considering the other math functions.

Should I:

split these match arms into their categories (sin and cos, exps together, logs together)?
Use something that exists in miri and I don't know about?
Something else I'm not seeing?

RalfJung · 2025-02-28T10:40:03Z

You'll have to add a few helper functions, and some of the logic has to move inside the match intrinsic_name. You can't directly use clamp as that's a host float function and once the error has been applied, we shouldn't get back to host floats, so you'll have to implement a clamp on soft-floats.

To return specific values for specific inputs, we need a separate codepath that kicks in before f.to_host(). I would do it something like this:

let fixed_res = match intrinsic_name {
  "sinf32" if f == 0.0 => Some(0.0),
  "cosf32" if f == 0.0 => Some(1.0),
  ...
  _ => None
};
let res = fixed_res.unwrap_or_else(|| {
  let host = f.to_host();
  let res = match intrinsic_name { ... };
  let res = res.to_soft();
  let res = apply_random_float_error_ulp(...);
  // clamp
  match intrinsic_name {
    "sinf32" => res.clamp(-1.0, 1.0),
    ...
    _ => res,
  }
};
let res = this.adjust_nan(res, &[f]);
 this.write_scalar(res, dest)?;

except that the logic for fixed_res and clamp has to be done with soft-floats.

Note that "4 ULP relative error around 0" is still 0. But around 1 there could be some error.

LorrensP-2158466 · 2025-02-28T11:44:36Z

Alright, I was going in that direction but wasn't quite sure. Thanks for the great help!

@rustbot claim

LorrensP-2158466 · 2025-03-01T14:01:52Z

made a bunch of test fail; turns out this is more complicated than I thought. I would say in a first PR, deal just with the ones defined in src/tools/miri/src/intrinsics/mod.rs

Sorry, just to clarify: Do I have to make the changes in the rustc repo or this one?

RalfJung · 2025-03-03T11:40:21Z

It's probably better to do them in the rustc repo so that we can quick feedback whether the tests there still pass. But if you are having trouble setting that up, you can do them here if you want.

Over in the rustc repo, you can run Miri's own tests with ./x test miri -- float (where float is a filter, meaning you only run the tests that have float in their filename). You can run the relevant tests from the standard library with ./x miri core coretests -- f32 f64.

LorrensP-2158466 · 2025-03-05T11:07:01Z

So I implemented the logic and enabled the non-determinism again like so:

let res = fixed_float_value(intrinsic_name, f).unwrap_or_else(||{
    // Using host floats (but it's fine, these operations do not have
    // guaranteed precision).
    let host = f.to_host();
    let res = match intrinsic_name {
        "sinf32" => host.sin(),
        "cosf32" => host.cos(),
        "expf32" => host.exp(),
        "exp2f32" => host.exp2(),
        "logf32" => host.ln(),
        "log10f32" => host.log10(),
        "log2f32" => host.log2(),
        _ => bug!(),
    };
    let res = res.to_soft();

    // Apply a relative error of 4ULP to introduce some non-determinism
    // simulating imprecise implementations and optimizations.
    let res = apply_random_float_error_ulp(
        this,
        res,
        2, // log2(4)
    );

    match intrinsic_name {
        // sin and cos: [-1, 1]
        "sinf32" | "cosf32" => res.clamp(-1.0f32, 1.0f32), 
        // exp: [0, +INF]
        "expf32" | "exp2f32" => res.maximum(Single::ZERO), 
        _ => res,
    }
});
let res = this.adjust_nan(res, &[f]);
this.write_scalar(res, dest)?;

And I test it with this command: ./x miri --no-fail-fast core coretests std -- f32 f64, and to summarize:

integer decode tests fail because they do assert_eq!(2f32.powf(100.0).integer_decode(), (8388608, 77, 1)); but powf is non-deterministic so this will almost always fail. So, do you think I should remove this test and explain why?
The other tests fail where they expect a diff of 1e-6, but these fail even with a 4ULP relative error. Should I alter the approx_eq macro in the std tests to use the same one as in Miri?

I also added tests to the Miri float tests.
My Branch

beetrees · 2025-03-05T11:58:37Z

integer decode tests fail because they do assert_eq!(2f32.powf(100.0).integer_decode(), (8388608, 77, 1)); but powf is non-deterministic so this will almost always fail. So, do you think I should remove this test and explain why?

This test is just using powf as a convenience functions to generate the float to test - replacing 2f32.powf(100.0) with 1.2676506e30 and 2f64.powf(100.0) with 1.2676506002282294e30 should make the test past (probably leave a comment explaining that the constants are just 2¹⁰⁰).

The other tests fail where they expect a diff of 1e-6, but these fail even with a 4ULP relative error. Should I alter the approx_eq macro in the std tests to use the same one as in Miri?

Yes.

RalfJung · 2025-03-05T12:51:01Z

The other tests fail where they expect a diff of 1e-6, but these fail even with a 4ULP relative error. Should I alter the approx_eq macro in the std tests to use the same one as in Miri?

I would say just increase the error to 1e-5.

LorrensP-2158466 · 2025-03-05T12:51:39Z

Okey thanks!

But Just looking at the tests of std I noticed that using the the assert_approx_eq! macro of Miri requires a significant refactor. The 4 float testing modules (1 for each float type) rely heavily on the current macro which does EPSILON checking, it uses a lot of constants like:

TOL
TOL_IMPR or TOL_PREC
TOL_0, TOL_P2 (TOL_*)

Different tests use different constants and I don't know how I should change them.

LorrensP-2158466 · 2025-03-05T12:54:04Z

I would say just increase the error to 1e-5.

I thought about it but it still fails:

---- f32::test_acosh stdout ----

thread 'f32::test_acosh' panicked at library/std/tests/floats/f32.rs:642:5:
60.0 is not approximately equal to 60.000015 (threshold 1e-5, difference 1.5258789e-5)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
note: in Miri, you may have to set `MIRIFLAGS=-Zmiri-env-forward=RUST_BACKTRACE` for the environment variable to have an effect

---- f32::test_exp stdout ----

thread 'f32::test_exp' panicked at library/std/tests/floats/f32.rs:485:5:
148.41316 is not approximately equal to 148.4132 (threshold 1e-5, difference 3.0517578e-5)

RalfJung · 2025-03-05T12:55:16Z

I thought about it but it still fails:

Just make it 1e-4 for those tests. 🤷
For now, the key is to get some reasonable proposal that passes CI. In the PR, we'll discuss with a libs person how to best proceed.

LorrensP-2158466 · 2025-03-05T12:56:56Z

Works For me :), Thanks for helping!

This was referenced Feb 25, 2025

Miri subtree update rust-lang/rust#137594

Merged

Use non-deterministic precision for float operations that do not have guaranteed precision #3555

Open

tgross35 mentioned this issue Feb 26, 2025

f.sin().asin() sometimes returns NaN #4207

Closed

RalfJung added C-bug Category: This is a bug. A-shims Area: This affects the external function shims labels Feb 27, 2025

rustbot assigned LorrensP-2158466 Feb 28, 2025

LorrensP-2158466 mentioned this issue Mar 5, 2025

Enable Non-determinism of float operations in Miri and change std tests rust-lang/rust#138062

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Miri float ops are less precise than what the standard library expects #4208

Miri float ops are less precise than what the standard library expects #4208

RalfJung commented Feb 25, 2025 •

edited by rustbot

Loading

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

tgross35 commented Feb 26, 2025 •

edited

Loading

RalfJung commented Feb 26, 2025

RalfJung commented Feb 26, 2025

tgross35 commented Feb 26, 2025

RalfJung commented Feb 26, 2025

RalfJung commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

RalfJung commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

tgross35 commented Feb 27, 2025 •

edited

Loading

RalfJung commented Feb 27, 2025

Jules-Bertholet commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

RalfJung commented Feb 28, 2025 •

edited

Loading

LorrensP-2158466 commented Feb 28, 2025

LorrensP-2158466 commented Mar 1, 2025

RalfJung commented Mar 3, 2025 •

edited

Loading

LorrensP-2158466 commented Mar 5, 2025

beetrees commented Mar 5, 2025

RalfJung commented Mar 5, 2025 via email

LorrensP-2158466 commented Mar 5, 2025

LorrensP-2158466 commented Mar 5, 2025

RalfJung commented Mar 5, 2025

LorrensP-2158466 commented Mar 5, 2025

Miri float ops are less precise than what the standard library expects #4208

Miri float ops are less precise than what the standard library expects #4208

Comments

RalfJung commented Feb 25, 2025 • edited by rustbot Loading

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

RalfJung commented Feb 25, 2025

Jules-Bertholet commented Feb 25, 2025

tgross35 commented Feb 26, 2025 • edited Loading

RalfJung commented Feb 26, 2025

RalfJung commented Feb 26, 2025

tgross35 commented Feb 26, 2025

RalfJung commented Feb 26, 2025

RalfJung commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

RalfJung commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

tgross35 commented Feb 27, 2025 • edited Loading

RalfJung commented Feb 27, 2025

Jules-Bertholet commented Feb 27, 2025

LorrensP-2158466 commented Feb 27, 2025

RalfJung commented Feb 28, 2025 • edited Loading

LorrensP-2158466 commented Feb 28, 2025

LorrensP-2158466 commented Mar 1, 2025

RalfJung commented Mar 3, 2025 • edited Loading

LorrensP-2158466 commented Mar 5, 2025

beetrees commented Mar 5, 2025

RalfJung commented Mar 5, 2025 via email

LorrensP-2158466 commented Mar 5, 2025

LorrensP-2158466 commented Mar 5, 2025

RalfJung commented Mar 5, 2025

LorrensP-2158466 commented Mar 5, 2025

RalfJung commented Feb 25, 2025 •

edited by rustbot

Loading

tgross35 commented Feb 26, 2025 •

edited

Loading

tgross35 commented Feb 27, 2025 •

edited

Loading

RalfJung commented Feb 28, 2025 •

edited

Loading

RalfJung commented Mar 3, 2025 •

edited

Loading