f32 formatter produces strange results #63171
Comments
The bytes put into the binary are correct, so it is indeed an issue in formatting routines. |
This is because the floating point formatter prefers rounder numbers as long as the result round-trips back to the same float. For large floating point numbers, a more accurate string representation can be obtained by adding a precision (even a precision of 0 forces the value to be precise up to the decimal point).
fn main() {
println!("{}", 1e100);
println!("{:.0}", 1e100);
}
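To make the difference concrete for the value from the issue report, here is a minimal sketch using 2^31 as an `f32`. The helper names `shortest` and `exact_integer` are made up for illustration. The only things assumed about the default `{}` output are that it round-trips and that it is what this thread reports (`2147483600`); the `{:.0}` output is the exact integer value.

```rust
/// Default `Display`: the shortest digits that still round-trip to the same f32.
fn shortest(x: f32) -> String {
    format!("{}", x)
}

/// Explicit precision 0: every digit before the decimal point is exact.
fn exact_integer(x: f32) -> String {
    format!("{:.0}", x)
}

fn main() {
    let x: f32 = 2147483648.0; // 2^31, exactly representable as an f32

    let s = shortest(x);
    let e = exact_integer(x);
    println!("{}", s); // reported in this thread as 2147483600
    println!("{}", e); // 2147483648

    // Both strings parse back to the same f32, which is why the formatter
    // considers the shorter one acceptable.
    assert_eq!(s.parse::<f32>().unwrap(), x);
    assert_eq!(e.parse::<f32>().unwrap(), x);
}
```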
|
I think that's very confusing behavior, fwiw. |
It's basically applying the same logic to digits before the decimal as it does to those after. The issue is that Rust actually shows all the digits before the decimal. I am wishing more and more that this could just get formatted to |
nitpick: a more precise string representation, not a more accurate one. |
We discussed this in the recent Libs meeting and would be open to try fixing this up. It is confusing behavior to treat digits before the decimal the same as those after. |
Respectfully, I do not agree that whether the digits are before or after the decimal is meaningful: I think the only thing that is meaningful is accurately displaying the number of significant digits. Once the decimal formatter reaches the number of significant digits the source float can be said to meaningfully represent, all others should be zero, as to print anything else is to give the false impression that displaying such values represents a different value from

Thus, I sympathize with ExpHP. The way to represent this, and incidentally also a valid formatting convention for C and/or Rust, is the XeY style of scientific notation. This is the way it is handled in many everyday calculators, whether they use IEEE754 floats or not.

And IEEE754-2019 actually specifies a floating point implementation may have a limitation on the number of digits that can be correctly rounded when handling such binary-decimal conversions, and that limit (called "H") should be fairly high by default. Specifically, for what we call H should be unbounded, according to IEEE754, but it also specifies that what H functionally represents ("significant digits plus a bit of padding") is a parameter to the formatting operation which can be specified in a language-dependent way.

So I believe a Rust programmer should always be able to somehow request that H be considered ∞ and that, by default, when outputting floating point we should use the minimum limit of H, because past that, as scottmcm noted, we only gain precision, not accuracy. |
This is incorrect. Firstly, inaccuracies in floating point values are caused by conversions and/or computations on those values. A value on its own is completely precise. The value
It is standard when you are doing computation on uncertain values. If I add two values of 3 SF each, of course it makes no sense to show the result to more than 3 SF, because in this case I am not trying to represent a single number, I'm representing a (simplified) distribution. The
I think this is fine once numbers get larger than is reasonable to display the usual way, but when they are displayed in decimal, they should not be rounded in this way. |
Your example is a 10 digit decimal, and thus needs to be printed fully according to the standard, as I said. I am very interested if you have a 13 digit example. |
The point of what I said is that by the time the value may be printed imprecisely then the distance between the values, the points that correspond to the integer solution to the sign, exponent, and mantissa equation, has become considerably greater than the value that may be omitted. Permitting such for values within the range of an equivalent integer seems unwise. Happily, it is not allowed in the IEEE754 standard, which absolutely bars printing an f32 equal to or less than 12 decimal digits with anything but a completely precisely calculated printing. But beyond that, printing the last decimal digit thus is actually functionally moot at those scales. Again, it only matters at the minimum of 13 decimal digits for an

The assertion that floating point numbers are precise until computed with is awkward, because all floating point numbers are subject to the rounding function, including when parsed from decimals in source. That IS a computation that is subject to rounding rules. Thus all the decimal sequences that would be rounded to a given float value, and the one that it happens to "exactly" correspond with, are not actually discernible, which is the point.

If we wished we could offer such facilities in Rust to discern such slightly rounded values apart from precise ones and indeed that is extensively recommended by IEEE, but that is injecting an additional data point. I presume if someone directly specifies a floating point value from its bit format using |
Sure, any power of 2 within the exponent range is exactly representable in an f32. eg.
The problem is that setting it to zero is not the same as omitting the value. Omitting the value would indicate that the remaining digits are unspecified. Setting them to zero just produces the wrong number.
I disagree: printing the wrong decimal digit is just wrong, and we don't have the context to argue that it's a moot point, since that depends on the specific program.
Converting from decimal is a calculation that can introduce error, but not all numbers are parsed from decimals, and those that are may be completely within the range where all whole numbers can be represented.
I'm not quite sure what you're saying here: do you mean add a function to determine in advance if a value would be rounded to a power of ten when formatted? I don't think that really addresses the issue of the value being formatted actually being wrong. Just to give some concrete examples:
|
Is there any progress on this? I just encountered this, which seems to be a
fn main() {
let f = 1655640002809605600.0f64;
println!("{}", f);
println!("{:.0}", f);
println!("{:.20}", f);
}
produces
This is very confusing behavior imo because I used |
FWIW I also encountered this while debugging the use of a 3rd party game library that deals with
let x: f32 = 0.25 - 5000000.0;
println!("{:.2}", x); // -5000000.00
Casting or binding to a new
println!("{:.2}", x as f64); // -5000000.00 |
@gliderkite Yours is different from this issue. f32 cannot distinguishably store
fn main() {
let x: f32 = 0.25 - 5000000.0;
let y: f32 = -5000000.0;
assert_eq!(x, y); // will pass
}
(this applies to all languages that use the same IEEE 754 32-bit float)
Your idea of using f64 is good, but since this is not a formatting problem, you shouldn't apply it in the formatting code. Use f64 everywhere from the beginning instead:
fn main() {
let x: f64 = 0.25 - 5000000.0;
let y: f64 = -5000000.0;
assert_eq!(x, y); // this will panic
} |
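As a rough illustration of why the f32 subtraction above loses the 0.25, the sketch below lists the representable neighbours of 5000000.0. The helpers `neighbor_above` and `neighbor_below` are hypothetical names, and stepping the raw bit pattern this way is only valid for positive, finite, non-zero inputs.

```rust
/// Next representable f32 above `x` (assumes `x` is positive and finite).
fn neighbor_above(x: f32) -> f32 {
    f32::from_bits(x.to_bits() + 1)
}

/// Next representable f32 below `x` (assumes `x` is positive, finite, non-zero).
fn neighbor_below(x: f32) -> f32 {
    f32::from_bits(x.to_bits() - 1)
}

fn main() {
    let x: f32 = 5_000_000.0;

    // Between 2^22 and 2^23 consecutive f32 values are 0.5 apart,
    // so 4999999.75 has no f32 representation of its own.
    println!("{:.2}", neighbor_below(x)); // 4999999.50
    println!("{:.2}", x);                 // 5000000.00
    println!("{:.2}", neighbor_above(x)); // 5000000.50

    // The subtraction result is rounded back to a representable value.
    assert_eq!(x - 0.25, x);
}
```

With f64 the spacing near 5000000 is far smaller than 0.25, which is why the f64 version of the comparison panics.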
I ratify Diggsey’s comments. As specified by IEEE 754, each floating-point datum that is not a NaN or infinity represents one number exactly. Thus, printing more non-zero digits gains accuracy, not just precision: The exact number represented by a floating-point datum is uniquely the one represented in decimal by showing all significant digits (all digits from the first non-zero digit to the last non-zero digit). IEEE 754 says that floating-point arithmetic approximates real arithmetic (IEEE 754-2019 3.2, first sentence). When any operation is performed, except where stated otherwise in the standard, it is as if the result were computed exactly and then rounded to a value representable in the destination format (according to rounding rules described in the standard). So floating-point arithmetic approximates real-number arithmetic, but each result obtained is a specific number. This is crucial to the floating-point model:
Converting numbers to decimal (in a string of human-readable characters) is an operation that should be computed as described above, and the ideal result is to produce the exact result representable in the destination format. Allowing an implementation limit on how many significant digits can be computed correctly was largely a nod to feasibility, not an inherently desirable feature. In light of this model, we can consider several forms of conversion:

For a conversion of binary floating-point to an output format of “decimal numeral,” the input number is always exactly representable in the output format, so that should be the result.

For a conversion of binary floating-point to an output format of “decimal numeral of n digits,” the output should be the n-digit decimal obtained by rounding the input number using the selected rounding rule. (This includes output formats of “decimal numeral with up to n digits” where the final number of digits is determined by removing trailing insignificant zeros, as these result in the same output numbers as “decimal numeral of n digits,” just with a different representation.)

Another conversion operation is to convert a binary floating-point number to a decimal numeral with just enough digits to uniquely identify it among the numbers representable in the input format. This is not well described as a conversion to a specific output format due to the complexity of the output set.

When we choose to display numbers in a form convenient for humans to interpret, the format choices are best left to the application. The core floating-point work ought to be done as accurately as the format(s) permit, and choices about abbreviating numbers (and introducing more rounding error) should be left to the application program to decide for its purposes. Many of the frustrations people have with floating-point arithmetic can be attributed at least in part to displays of decimal values that differ from the actually represented value.
Displaying “0.1” instead of “0.100000001490116119384765625” misleads the reader about what is happening in the program.

Note that inaccuracies in floating-point arithmetic cannot be a reason for underlying software (such as the programming language or a general string conversion/display library) to limit how many digits are displayed. For example, sometimes people reason that a floating-point format is only “accurate” to 15 digits, so only 15 digits should be displayed. This is not a correct criterion because the software converting to decimal has no information about how accurate its input operand is or is not. It may be a number that is exact, because it was received exactly and no operations were performed on it or because it is the result of specialized calculations. Alternatively, the number might be the result of a long sequence of operations involving many roundings. The error induced by rounding in each operation can compound or cancel, and the final result of a sequence of floating-point operations can be exactly correct or can be wrong by an unbounded amount, in which case even the first digit might be incorrect.

If a conversion routine were to limit the digits it produced to only those it knew to be correct, it could never produce any digits, since it knows nothing about how many digits are correct. So the fact that floating-point arithmetic introduces inaccuracies provides no basis for determining how many digits ought to be displayed. This burden lies with the application programmer; numerical analysis of the potential error requires knowledge of the algorithms and data. To this end, the underlying software can provide features for rounding numbers to a requested number of digits, but it should not decide to limit the digits itself. |
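The “0.1” example above can be reproduced in Rust by asking for an explicit number of fractional digits. This is only a small sketch; `shortest` and `with_digits` are illustrative helper names, and 27 is chosen because the f32 nearest to 0.1 has exactly 27 digits after the decimal point.

```rust
/// Default `Display`: shortest decimal that round-trips to the same f32.
fn shortest(x: f32) -> String {
    format!("{}", x)
}

/// Fixed number of digits after the decimal point, correctly rounded.
fn with_digits(x: f32, digits: usize) -> String {
    format!("{:.*}", digits, x)
}

fn main() {
    let x: f32 = 0.1;

    println!("{}", shortest(x));        // 0.1
    println!("{}", with_digits(x, 27)); // 0.100000001490116119384765625

    // Both strings identify the same f32, but only the long one
    // is the number the f32 actually stores.
    assert_eq!(shortest(x).parse::<f32>().unwrap(), x);
    assert_eq!(with_digits(x, 27).parse::<f32>().unwrap(), x);
}
```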
Test-case:
I expected both to produce the same number, since that number is a perfectly representable f32 value.
This bit me because I was investigating an issue where values close to integer limits were treated as negative.
Seeing 2147483600 (in range for an i32) rather than the actual value 2147483648 (out of range for an i32) made it all really confusing.
Anyhow, I'll fix Firefox to do the proper bounds check, but it would've been nice to avoid the extra confusion time :)
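For readers hitting the same confusion, here is one possible shape for such a bounds check. It is a sketch under assumptions, not the actual Firefox fix: the function name `f32_to_i32_checked` is hypothetical. The key detail is that `i32::MAX as f32` rounds up to 2^31, so the upper bound has to be an exclusive comparison against 2^31.

```rust
/// Converts an f32 to i32 only if the (truncated) value fits.
/// Hypothetical helper for illustration.
fn f32_to_i32_checked(x: f32) -> Option<i32> {
    // 2^31 is exactly representable as f32; i32::MAX (2^31 - 1) is not,
    // and `i32::MAX as f32` rounds up to this value.
    const UPPER_EXCLUSIVE: f32 = 2147483648.0;
    const LOWER_INCLUSIVE: f32 = -2147483648.0;

    // NaN fails both comparisons and falls through to None.
    if x >= LOWER_INCLUSIVE && x < UPPER_EXCLUSIVE {
        Some(x as i32) // truncates toward zero
    } else {
        None
    }
}

fn main() {
    // The value from this issue is out of range, whatever `{}` prints for it.
    assert_eq!(f32_to_i32_checked(2147483648.0), None);

    // The largest f32 below 2^31 is 2^31 - 128 and converts fine.
    assert_eq!(f32_to_i32_checked(2147483520.0), Some(2147483520));

    // A naive `x <= i32::MAX as f32` check would wrongly accept 2^31.
    assert_eq!(i32::MAX as f32, 2147483648.0);
}
```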