Float printing and/or parsing is inaccurate #24557

Closed
@hanna-kruppe

This example program produces a frighteningly large number of values that, without even leaving the safe haven of [0.0; 1.0), change when printed out and read back in:

#![feature(rand)]

use std::rand::{Rng, thread_rng};

const N: u32 = 30;

fn main() {
    let mut r = thread_rng();
    let mut errors = 0;
    for _ in 0..N {
        let x0 = r.next_f64();
        // Print with 17 digits after the decimal point, then parse back.
        let x1: f64 = format!("{:.17}", x0).parse().unwrap();
        if x0 != x1 {
            //println!("{:.17} {:.17} {:e}", x0, x1, x0 - x1);
            errors += 1;
        }
    }
    println!("Found {} non-round trip-safe numbers among {} random numbers", errors, N);
}

Not all numbers fail to round trip, but in all my trials a majority did. The error is on the order of 1e-16 (much more, around 1e-6, for the default {} formatting) for all outputs I looked at, but that is more than IEEE-754 permits (it requires correct round trips when 17 significant decimal digits are printed) and certainly more than users should be forced to endure. Perfect round tripping is possible, useful, and important; it's just a pain to implement.
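As a cross-check of the 17-digit claim, here is a minimal sketch of the same experiment that does not depend on `std::rand`. Everything named here (`XorShift64`, `round_trip_17`, `count_round_trip_failures`) is a hypothetical helper introduced for illustration, not part of the original program. One difference is deliberate: `{:.17}` fixes the number of digits after the decimal point, so values below 0.1 get fewer than 17 significant digits, whereas `{:.16e}` always prints exactly 17 significant digits. With correctly rounded printing and parsing, this version should report no failures at all.

```rust
/// Tiny xorshift64* generator so the sketch needs no external crate.
/// (Hypothetical helper; the original program used `std::rand`.)
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545F4914F6CDD1D)
    }

    /// Uniform f64 in [0.0, 1.0), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Format `x` with 17 significant digits and parse it back.
fn round_trip_17(x: f64) -> f64 {
    // `{:.16e}` prints one digit before the point and 16 after it.
    format!("{:.16e}", x).parse().unwrap()
}

/// Count how many of `n` pseudo-random values change across the round trip.
fn count_round_trip_failures(seed: u64, n: u32) -> u32 {
    let mut rng = XorShift64(seed);
    let mut failures = 0;
    for _ in 0..n {
        let x0 = rng.next_f64();
        // Compare bit patterns so that any change at all is detected.
        if round_trip_17(x0).to_bits() != x0.to_bits() {
            failures += 1;
        }
    }
    failures
}
```

Calling `count_round_trip_failures` with any nonzero seed and a sample size of a few thousand gives a quick regression check. A nonzero result points to an error in either the printing or the parsing side.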

I have recently worked my way through float formatting, and at the very least it looks pretty naive (as in, code and comments read as if the authors didn't care at all about rounding error, or were entirely unaware that it's a thing). I also skimmed the FromStr implementation and it looks a bit smarter, but since it references no paper and doesn't use bignums, I have doubts about its accuracy as well.

Many numbers reach a fixed point after one round trip, but there are also numbers which take hundreds, thousands, or even more round trips before they settle down, if they settle down at all (some cycle). The longest chain I found was almost nine million round trips, which is almost absurd. For obvious reasons, an error that changes (presumably, grows) over time is even worse than a one-time inaccurate rounding. Here's the program I used to find those.
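The program referenced above is not reproduced here. As a rough illustration of the idea only, the following sketch repeatedly prints and re-parses a value until it stops changing, revisits an earlier value, or hits a step limit. The names `Settled` and `settle` are hypothetical, and the `HashSet` of visited bit patterns is an assumption made for simplicity. For chains of millions of steps, a constant-memory cycle detector would be a better fit.

```rust
use std::collections::HashSet;

/// Outcome of repeatedly printing and re-parsing a value.
#[derive(Debug, PartialEq)]
enum Settled {
    /// Reached a value that round-trips to itself after this many changes.
    FixedPoint(u64),
    /// Revisited an earlier value after this many steps without settling.
    Cycle(u64),
    /// Gave up after the step limit.
    Unknown,
}

/// Follow the chain x -> parse(format(x)) for at most `max_steps` steps.
fn settle(mut x: f64, max_steps: u64) -> Settled {
    let mut seen = HashSet::new();
    seen.insert(x.to_bits());
    for step in 0..max_steps {
        // Same format string as the example program in this issue.
        let next: f64 = format!("{:.17}", x).parse().unwrap();
        if next.to_bits() == x.to_bits() {
            return Settled::FixedPoint(step);
        }
        // `insert` returns false if the bit pattern was already visited.
        if !seen.insert(next.to_bits()) {
            return Settled::Cycle(step + 1);
        }
        x = next;
    }
    Settled::Unknown
}
```

With correct printing and parsing, a value such as 0.5 should settle immediately as `Settled::FixedPoint(0)`. The long chains and cycles described above would show up as large `FixedPoint` counts or as `Cycle` results.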

For the formatting side of things, I also filed #24556.

I know less about the parsing side of things. The topic seems to get less attention in general, but I found a paper by William Clinger which seems promising.

cc @rprichard
