-0.0 should format with a minus sign by default #1074

Closed
rprichard opened this issue Apr 20, 2015 · 28 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@rprichard

I think negative zero should be formatted with the minus sign preserved by default.

Issue 20596

Issue #20596 restored the sign for Debug but not for Display, because "for users, negative zero makes no sense." I feel this is a poor rationale:

  • We don't know that the user is non-technical. For exponent-form, the user might very well be technical.
  • FP is inherently non-user-friendly. Printing large and small numbers accurately and concisely requires exponent form (issue #24556) which may be unfamiliar to some users. We must also print [+-]Infinity and NaN.
  • The formatter traits are used for more than UI output. to_string is Rust's canonical string representation. It is used to serialize a float for JSON. {:10} is used to serialize floats for CSV output.

The issue indicated that the CSS serializer could instead use Debug. There are problems with using Debug for FP serialization:

  • Debug is inflexible. It should be fixed per issue #24556 to use minimal sufficient precision, but searching for this representation is non-trivial. For efficiency, familiarity, or interoperability, programs might want the exponent form {:1.8e} for serialization (for f32). This format mirrors the %1.8e format in C and other languages, but whereas C round-trips every value, Rust round-trips every value but minus zero.
  • Debug is for debugging. Serialization formats need to be stable and documented. We would need to document Debug's output and guarantee its backward compatibility.
  • Display and to_string are the default conversion APIs, so if the default suppresses -0, libraries will take this as a cue that the Rust ecosystem ought to suppress -0. In other words, it is unclear that the existing JSON and CSV serializers will change to Debug. Perhaps this is the intent; it's unclear. Outside Rust, the JSON serializers I've tested for Python, Ruby, Java (gson), and Go all output minus zero.
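The round-trip claim in the first bullet can be checked mechanically. What follows is a minimal sketch, not code from the thread: `roundtrips_f32` is a hypothetical helper that formats an `f32` with `{:1.8e}`, parses the text back, and compares bit patterns so that `0.0` and `-0.0` count as different values. On toolchains that print the sign of negative zero (1.53 and later, per the end of this thread) every sample should survive; on the 2015 toolchain under discussion, `-0.0` is the one value expected to fail.

```rust
/// Hypothetical helper: format with 9 significant digits in exponent
/// form, parse the text back, and compare raw bits (so -0.0 != 0.0).
fn roundtrips_f32(x: f32) -> bool {
    let text = format!("{:1.8e}", x);
    match text.parse::<f32>() {
        Ok(back) => back.to_bits() == x.to_bits(),
        Err(_) => false,
    }
}

fn main() {
    let samples = [0.0_f32, -0.0, 1.0, -1.5, 0.1, f32::MAX, f32::MIN_POSITIVE];
    for &x in &samples {
        let text = format!("{:1.8e}", x);
        println!("{:>16} round-trips: {}", text, roundtrips_f32(x));
    }
}
```

Comparing with `to_bits` rather than `==` matters here, because `0.0 == -0.0` is true and would hide exactly the loss being discussed.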

It is surprising that {:+} prints +0, while {:+?} prints -0 for the same number. They appear to contradict.
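For reference, the four sign-related format specs can be compared side by side. This is a small sketch; `sign_forms` is a hypothetical helper. The results in the comments are what toolchains from 1.53 onward produce, where Display and Debug agree on the sign; the 2015 behavior described above differs for `{}` and `{:+}`.

```rust
/// Hypothetical helper: the same float under `{}`, `{:?}`, `{:+}`, `{:+?}`.
fn sign_forms(x: f64) -> [String; 4] {
    [
        format!("{}", x),
        format!("{:?}", x),
        format!("{:+}", x),
        format!("{:+?}", x),
    ]
}

fn main() {
    println!("{:?}", sign_forms(-0.0)); // ["-0", "-0.0", "-0", "-0.0"]
    println!("{:?}", sign_forms(0.0)); // ["0", "0.0", "+0", "+0.0"]
}
```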

Other languages

Hiding minus zero makes Rust inconsistent with many other languages. Languages that show it include: C, C++, Go, Haskell, Java, Lua, OCaml, Perl, Python, Racket, and Ruby. As a systems language, it makes sense to do the same thing as C/C++, but even common Unix scripting languages show the value.

Go and Perl: the default print hides minus zero, whereas Perl's printf and Go's fmt package show it. Go's print outputs floats using the simple format [+-]N.NNNNNNe[+-]NNN; it's not trying to be user-friendly or sufficiently precise.

JavaScript hides minus zero, but it has no separate integer type. If a language must choose between preserving the illusion of an integer type or representing IEEE-754 precisely, it's reasonable to choose the former. Rust, on the other hand, has FP types. Moreover, even JavaScript's choice isn't obvious—Lua lacks integer types but still prints -0.

C# hides the minus sign.

Neither JavaScript nor C# has a built-in printf-style interface, nor do they display -0 with a + sign.

cc @rkruppe @lifthrasiir @aturon @Diggsey @SimonSapin

@nagisa
Member

nagisa commented Apr 20, 2015

I feel pretty strongly about this and believe we should not display sign for minus zero for Display by default.

It is surprising that {:+} prints +0, while {:+?} prints -0 for the same number. They appear to contradict.

I filed rust-lang/rust#24623

@Diggsey
Contributor

Diggsey commented Apr 20, 2015

I'm not too hung up on which is the default behaviour, but I'd at least want to have a way to print a float such that it looks like a true real number, i.e. never display a sign for +/-0.

Generally, if I see +0 or -0 in a UI, I'm going to think that it's actually just a very small number which is zero when rounded but actually does have a sign, whereas with no sign it's obvious that it's actually zero.

@rprichard
Author

I'm fine with adding some way to round -0 up to +0 during formatting. format_args! accepts a - flag that is currently unused; perhaps it's an option. It would be nice to see how this problem is solved in the languages that print -0.0.

It's not quite reasonable to expect Display to show a non-technical-friendly output. It can (or should) print values like 1e100, 0.30000000000000004, NaN, +Infinity. It doesn't print the first two currently, but issue #24556 suggests we do. Assuming we do, I suspect we really need something like %f to restore decimal format. At one point, we had a Float formatter trait. I think it was removed during the stabilization effort and/or the Show -> Display/Debug split.

We also implement Display for types like Ipv6Addr, even though non-technical end users likely cannot read an IPv6 address.

Python, Ruby, and Racket each have a to-display-string versus to-debug-string facility (__str__ vs __repr__, to_s vs inspect, display vs print, respectively). They use -0.0 in all six cases.

Issue #24624 illustrates the problem with using Debug as a serialization format -- the Debug format does not seem stable.

@nagisa
Member

nagisa commented Apr 20, 2015

I think somebody misrepresented what Display does somewhere. It is supposed to be implemented for types which have an obvious, widely accepted string representation. This includes numbers, IP address and port pairs (because there exists a widely accepted convention on how to represent them as a string) and obviously strings themselves.

It has nothing to do with end users whatsoever. Or at least didn’t when it was still called String.

@rprichard
Author

I suppose the term is "user-facing" rather than "end-user". It's from RFC 565. It also includes this bit:

Require use of an explicit adapter (like the display method in Path) when it potentially looses significant information.

@bombless

I like this idea but 0.0f64 == -0.0f64 looks a bit weird.

@lilyball
Contributor

FWIW, Swift has a distinction between Printable and DebugPrintable (which seems roughly equivalent to our Display vs Debug), and for floating-point values it displays -0.0 for Printable (it renders identically with toString() vs toDebugString() in fact). Curiously, NSNumber renders it as -0 instead (no decimal).

My initial reaction to this was a knee-jerk "no it shouldn't!" but I think it actually should. @nagisa is correct, Display is not defined as "the correct string representation to present to end-users". It's merely defined as the semantic string that represents a value (and in fact the std::fmt docs say "fmt::Display implementations assert that the type can be faithfully represented as a UTF-8 string at all times."). In light of that, I think -0.0 is absolutely correct here. If you want to present numbers to end-users you need to use a facility designed for that. Although I would support the addition of a :f modifier to fmt that tries to render floating-point values in a "simpler" form (i.e. drop the sign on -0.0, and drop any trailing 0s, which means for a .0 drop the decimal entirely, and when a precision is given, trim off trailing zeros, which would let me use "{:.6f}" to format a float for human consumption without having "too much" precision).
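No `:f` modifier exists in std::fmt, but the "simpler" rendering described above can be approximated with a wrapper type. The following is only a sketch under that assumption: `Trimmed` is a hypothetical adapter (value plus maximum number of fractional digits) that formats in fixed-point, trims trailing zeros, and drops the sign from anything that rounds to zero.

```rust
use std::fmt;

/// Hypothetical adapter (not part of std): fixed-point with at most
/// `self.1` fractional digits, trailing zeros trimmed, zero unsigned.
struct Trimmed(f64, usize);

impl fmt::Display for Trimmed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // inf and NaN have no digits to trim; defer to the default output.
        if !self.0.is_finite() {
            return write!(f, "{}", self.0);
        }
        let fixed = format!("{:.*}", self.1, self.0);
        // Only trim when there is a fractional part, so "100" stays "100".
        let trimmed = if fixed.contains('.') {
            fixed.trim_end_matches('0').trim_end_matches('.')
        } else {
            fixed.as_str()
        };
        // Anything that rounds to zero is shown without a sign.
        if trimmed.strip_prefix('-').unwrap_or(trimmed) == "0" {
            f.write_str("0")
        } else {
            f.write_str(trimmed)
        }
    }
}

fn main() {
    println!("{}", Trimmed(-0.0, 6)); // 0
    println!("{}", Trimmed(2.0, 6)); // 2
    println!("{}", Trimmed(0.1 + 0.2, 6)); // 0.3
}
```

Keeping this in an explicit adapter leaves the plain `{}` output free to be the faithful representation, which is the split argued for above.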

An alternative would be to use the # flag for floats to mean "precise representation", i.e. include the - on -0.0, and double-check how we handle edge cases like subnormals (how do we render those today?). But I don't think this is the ideal approach.

@aturon
Member

aturon commented Apr 27, 2015

I don't have a strong opinion here, but I largely agree with @kballard's analysis: you should probably be using an adapter or modifier or some other facility to tailor the output of things like floating point numbers if you're going to display them directly to a user.

cc @alexcrichton @wycats

@alexcrichton
Member

I also have pretty few strong opinions in this area.

@wycats
Contributor

wycats commented Apr 28, 2015

My feeling is that -0.0 is not a concept that "humans" (or even many programmers) understand, so if you're formatting it for human consumption, leave off the -. In other words, I would expect it to be there in Debug, but not Display.

@hanna-kruppe

-0.0 may be a confusing concept, but it is a reality of the hardware and software we work with. While it's mostly interchangeable with +0.0 (arithmetic, comparisons) it does make a difference often enough:

  • x / -0.0 is negative infinity, which naturally has far-reaching consequences.
  • signum(), is_sign_positive(), etc. consider it negative, which can lead to different results in code that works with the signs of numbers in complicated ways.

Therefore, I think it's helpful to point out negative zeros as clearly as possible. I'd rather have a small number of people slightly confused or amused by the "meaningless" sign than a small number of people waste hours chasing a stupid bug caused by a negative zero. It doesn't even matter if they know that there is such a thing as negative zero, even an IEEE-754 expert can be misled if their debugging output indicates positive zero.
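The observable differences listed above fit in a few assertions. This is a minimal sketch using only std methods on `f64`; the variable names are arbitrary.

```rust
fn main() {
    let pz = 0.0_f64;
    let nz = -0.0_f64;

    // Comparison treats the two zeros as equal...
    assert!(pz == nz);
    // ...but they are distinct bit patterns.
    assert!(pz.to_bits() != nz.to_bits());

    // Division exposes the sign as the sign of infinity.
    assert_eq!(1.0 / pz, f64::INFINITY);
    assert_eq!(1.0 / nz, f64::NEG_INFINITY);

    // Sign queries also see it.
    assert_eq!(nz.signum(), -1.0);
    assert!(nz.is_sign_negative());
    assert!(!nz.is_sign_positive());

    println!("all checks passed");
}
```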

Aside: If {} prints 0.0 for negative zero, I think that means it treats it as positive, and so {:+} would print it as +0.0 for consistency. Thus even a sign-aware programmer can be misled by using {:+} and thinking that takes care of negative zero.

@erickt

erickt commented Apr 28, 2015

I share @rkruppe's view. This would be a hindrance to doing numerical / scientific computing in Rust. I would prefer there to be an option to suppress printing -0.0 if that's desired. Also inconvenient for serialization, as I expected that float.to_string() would do the right thing here for serde json.

@bombless

Maybe we can add {:f} formatter to control floating point number formatting.
See also rust-lang/rust#24556 (comment)

petrochenkov added the T-libs-api label Jan 29, 2018
@martin-t

martin-t commented Feb 21, 2019

What's the status of this issue? I ran into it when parsing a file containing negative zeros (as -0) and printing it back didn't produce identical output. It's very surprising behavior, especially given println!("{:+}", -0.0); prints +0 (rust-lang/rust#24623), which at first made me think Rust didn't support negative zeros at all.

I couldn't find a formatting parameter that would give me the desired output (include minus but with no decimal point) so I have to wrap my floats into a struct when printing and implement custom Display. The docs say - is unused - maybe it could be used for this purpose?
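A wrapper of the kind described might look like the following. This is a sketch, not the commenter's actual code: `KeepSign` is a hypothetical name, and it ignores width and precision flags for brevity. It writes the sign of negative zero itself, so it does not depend on what `{}` does for that value.

```rust
use std::fmt;

/// Hypothetical wrapper (not part of std): prints negative zero as "-0"
/// and defers to the default Display output for every other value.
struct KeepSign(f64);

impl fmt::Display for KeepSign {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.0 == 0.0 && self.0.is_sign_negative() {
            // Emit the sign explicitly instead of relying on the default.
            f.write_str("-0")
        } else {
            write!(f, "{}", self.0)
        }
    }
}

fn main() {
    println!("{} {} {}", KeepSign(-0.0), KeepSign(0.0), KeepSign(-1.5));
}
```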

Realistically, can the behavior of {} even be changed now without causing massive breakage in the ecosystem?

@fstirlitz

fstirlitz commented Feb 21, 2019

to_string is Rust's canonical string representation. It is used to serialize a float for JSON. {:10} is used to serialize floats for CSV output.

That feels quite wrong. The documentation of Display specifies that it's supposed to be used for 'user-facing output'; I interpret this to mean that the API contract specifies only that the string should be human-readable. According to this, it would be perfectly legitimate for Display to print negative numbers using U+2212 MINUS SIGN instead of the hyphen (U+002D), or to use Arabic digits (٠١٢٣٤٥٦٧٨٩) in an Arabic-script locale. Code which uses Display for serialisation into a machine-readable format is already buggy.

Case in point: I have myself reported a bug in an SVG-reading library failing in locales where the comma is the fraction-part separator; as it turns out, it was because for parsing path definitions it used sscanf, a locale-dependent function.

@SimonSapin
Contributor

This kind of bug is exactly the reason why we shouldn’t have implicitly-locale-dependent functions.

As to U+2212 MINUS SIGN, it’s not really helping anybody. Users might copy/paste that output into other software that knows to parse U+002D but not U+2212.

@fstirlitz

The thing is, one can argue Display::fmt is already an implicitly locale-dependent function, by virtue of being defined in terms of ‘user-facing output’, because the answer to the question of what user-facing output should look like is locale-dependent, and the documentation does not fix any particular locale by instead referring to ‘US-English-speaker-facing output’. If I know that my program is going to be used mostly by Arabic speakers, then I may legitimately choose an alternative implementation of Rust that formats user-facing integers using Arabic digits. The variety of human experience does not go away just because it is not explicitly acknowledged in your programming interfaces.

Users might copy/paste that output into other software that knows to parse U+002D but not U+2212.

Sure, but that is a ‘mere’ quality-of-implementation issue, not a correctness issue. Even when sticking with European digits, one could legitimately make Display use U+2212 for negative numbers, with the expectation that the data so formatted will either not be (often, or ever) copy/pasted in the first place, or that other software will eventually learn to parse U+2212 in user-facing data entry fields.

The real point though is that Display is too underspecified to be used for any serialisation. Assuming more of the API contract than is actually given is the root of all evil. (I very rarely use the word ‘evil’ with respect to technical issues – it has been way overused, with matters as silly as whitespace characters or choice of text editor, to the point of becoming a cliché – but I think this is one of the few appropriate places for it.)

@lilyball
Contributor

The module-level std::fmt documentation explicitly states

The format functions provided by Rust's standard library do not have any concept of locale, and will produce the same results on all systems regardless of user configuration.

It's certainly possible that an alternative version of Rust might choose to make changes to the Display implementations for stdlib types, but that would likely be considered an incompatible change; while the stdlib does not document how the floating point primitives have chosen to implement Display, the current behavior is to produce a machine-parseable representation, and various libraries may make assumptions about the particulars of the output (e.g. that it will always match ^-?\d+(\.\d+)?).
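To make that assumption concrete, here is a std-only sketch of the check a downstream library might implicitly rely on. `is_plain_decimal` is a hypothetical helper that hand-rolls the pattern above, restricted to ASCII digits, rather than pulling in a regex crate. Finite floats printed with `{}` should pass on current toolchains; `inf` and `NaN` do not, so the pattern only describes finite values.

```rust
/// Hypothetical helper: true if `s` has the shape -?DIGITS(.DIGITS)?
/// using ASCII digits only.
fn is_plain_decimal(s: &str) -> bool {
    fn all_digits(part: &str) -> bool {
        !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit())
    }
    let body = s.strip_prefix('-').unwrap_or(s);
    match body.split_once('.') {
        Some((int_part, frac_part)) => all_digits(int_part) && all_digits(frac_part),
        None => all_digits(body),
    }
}

fn main() {
    for &x in &[0.0_f64, -0.0, 1.5, -2.25, 1e21, f64::INFINITY] {
        let text = x.to_string();
        println!("{:>24} plain decimal: {}", text, is_plain_decimal(&text));
    }
}
```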

@workingjubilee
Member

IEEE754-2019 states that transformations from a floating point to an external decimal character sequence and back shall (are required to) preserve signs for infinities and zeros, and this was likely specified also in previous versions (2008) but I have not extensively diffed it. But as a result, I do not really see a reason to debate here.
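As a rough illustration of that requirement applied to Rust's default conversions, the sketch below sends a value through `to_string` and `parse` and compares bit patterns. `display_roundtrips` is a hypothetical helper; NaN is left out because text cannot carry its payload. The checks are expected to hold on toolchains where Display prints the sign of zero (1.53 and later, per the end of this thread), not on the versions current when this comment was written.

```rust
/// Hypothetical helper: does x survive Display followed by FromStr,
/// bit for bit (so the sign of zero counts)?
fn display_roundtrips(x: f64) -> bool {
    match x.to_string().parse::<f64>() {
        Ok(back) => back.to_bits() == x.to_bits(),
        Err(_) => false,
    }
}

fn main() {
    let samples = [0.0_f64, -0.0, 0.1, f64::INFINITY, f64::NEG_INFINITY];
    for &x in &samples {
        println!("{:>6} -> {}", x, display_roundtrips(x));
    }
}
```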

@SOF3

SOF3 commented Oct 28, 2020

@workingjubilee that's the case for Debug already. It's another issue whether it should be in Display.

Did anyone ever guarantee that FromStr and Display are supposed to be inverses of one another?

@Lokathor
Contributor

No, in fact while Debug is oriented at "a Rust programmer", Display is oriented at "someone who might not be a Rust programmer", so it's rather a lot more likely that FromStr can't parse exactly what Display shows.

  • Debug is often something that can be a Rust literal.
  • Display is often something that can go into a log file somewhere.

@workingjubilee
Member

And the log file is exactly the concern. IEEE754 makes it pretty clear that 0.0 and -0.0 are not to be treated interchangeably as data points and should round-trip with sign, and so a logged value by default should be capable of being turned into its source float pattern. It is also confounding that we say that the negative sign is printed by default when here it is not: https://doc.rust-lang.org/stable/std/fmt/index.html

Though it also mentions the Signed trait which... doesn't exist in Rust at the moment, I believe? So that's clearly in dire need of updating either way.

Display should certainly output formatted data in a way that is presentable to an ordinary human being instead of a cybernetically attached headcrab, but I do not think it should unnecessarily assume a presence or absence of technical (or mathematical!) capability in doing so. std in particular does not have enough information to assume users are not mathematicians (which are not programmers by default, nor are programmers necessarily mathematicians), who might be quite conversant with at least some of the technical nuances here. At worst, someone with a fairly basic knowledge of integer arithmetic i.e. below the natural numbers will see a -0 and know that it would also be normatively equal to 0, regardless of the sign on it. To assume less knowledge is to seriously open the question that maybe they don't know what a decimal number is, and then we have to seriously ask whether or not we should format floats at all.

Either way, std's Displays should likely err in the direction of being used in log and error formats, since that is often where they wind up: useful to both the Rust programmer, who is likely to see such in a paste from an error log, and adequately legible to the non-programmer.

@lilyball
Contributor

I just did a quick test of the basic printing operation on the literal -0.0 in various languages.

| Language | Expression | Result | Preserves sign |
|---|---|---|---|
| Swift | print(-0.0) | -0.0 | yes |
| Obj-C | NSLog(@"%@", @(-0.0)) | -0 | yes |
| C (libc) | printf("%f", -0.0) | -0.000000 | yes |
| C (libc) | printf("%g", -0.0) | -0 | yes |
| C++ (libc++) | std::cout << -0.0 | -0 | yes |
| Python 2.7 | print -0.0 | -0.0 | yes |
| Python 3 | print(-0.0) | -0.0 | yes |
| Ruby 2.6 | puts -0.0 | -0.0 | yes |
| Perl 5.28 | print -0.0 | 0 | no |
| Tcl 8.5 | format %f -0.0 | -0.000000 | yes |
| Groovy | println(-0.0) | 0.0 | no |
| Kotlin | println(-0.0) | -0.0 | yes |
| JavaScript (browser) | console.log(-0.0) | -0 | yes |
| JavaScript (nodejs) | console.log(-0.0) | -0 | yes |
| JavaScript | (-0.0).toString() | 0 | no |

(note that Perl will use libc if you write printf "%f", -0.0)

Most languages seem to be preserving the sign. I think it's pretty reasonable for Rust to follow the majority behavior here and preserve it too.

@SOF3

SOF3 commented Oct 29, 2020

I just did a quick test of the basic printing operation on the literal -0.0 in various languages.

| Language | Expression | Result | Preserves sign |
|---|---|---|---|
| Swift | print(-0.0) | -0.0 | yes |
| Obj-C | NSLog(@"%@", @(-0.0)) | -0 | yes |
| C (libc) | printf("%f", -0.0) | -0.000000 | yes |
| C (libc) | printf("%g", -0.0) | -0 | yes |
| C++ (libc++) | std::cout << -0.0 | -0 | yes |
| Python 2.7 | print -0.0 | -0.0 | yes |
| Python 3 | print(-0.0) | -0.0 | yes |
| Ruby 2.6 | puts -0.0 | -0.0 | yes |
| Perl 5.28 | print -0.0 | 0 | no |
| Tcl 8.5 | format %f -0.0 | -0.000000 | yes |
| Groovy | println(-0.0) | 0.0 | no |
| Kotlin | println(-0.0) | -0.0 | yes |
| JavaScript (browser) | console.log(-0.0) | -0 | yes |
| JavaScript (nodejs) | console.log(-0.0) | -0 | yes |
| JavaScript | (-0.0).toString() | 0 | no |

(note that Perl will use libc if you write printf "%f", -0.0)

Most languages seem to be preserving the sign. I think it's pretty reasonable for Rust to follow the majority behavior here and preserve it too.

Is there the distinction of something like Debug vs Display in those languages? The motivation for separating Debug from Display is that many languages abuse their default to-string implementation for debugging.

@Lokathor
Contributor

Every example there is basically Debug output, and if you wanted alternative formatting you'd do it yourself. Most languages don't have a Debug/Display distinction.

@workingjubilee
Member

workingjubilee commented Oct 29, 2020

Swift does. But I don't believe that reasoning based on the idea that there should be a strong divide between Debug and Display is very applicable here even if we were to presume such was a good place to start. That intended divide is why Rust does not provide derivable implementations for Display, forcing the programmer to think carefully about choosing a presentation. Thus people will write println!("{}", val); to output data, and either impl Display or fix it to "{:?}" when a type error occurs. Except in this case they won't because floats do in fact Display. If anything is being "abused", it's our Display formatting abused as Debug formatting.

Having done that, we're patting ourselves on the back a bit much if we are drawing a bright line between "write different format chars in Rust" and "write different format chars in C". And in C they are consistently conformant in this regard to IEEE754.

But as nagisa stated earlier, the question for std implementations of Display is whether it has "an obvious, widely accepted string representation" and we've established strongly that there is a widely accepted convention here: a technical standard and majority compliance with it. And I did go back further and found that IEEE854-1987 is explicit about "-0 should print as -0" and IEEE754-1985 is suggestive of it, so this is not a recent change in 2019 or even 2008.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 27, 2021
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN

This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.

This PR follows from discussion in rust-lang/rfcs#1074, and closes rust-lang#24623.

The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.

While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of rust-lang#24623).

There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Herschel added a commit to Herschel/ruffle that referenced this issue Apr 15, 2021
Rust nightly 4/13 allows f64::parse to handle "infinity", case
insensitive. This broke cases such as `Number("Infinity")`, which
should return `NaN` in AVM1.

Additionally, Rust will now print "-0" for negative zero, when
previously it would print "0".

 * Return NaN for inf cases ("inf", "-Infinity", "+INF", etc.)
 * Add a test for `Number("inf")` (this was also incorrect before
   the latest nightly)
 * Add a special case for zero in `f64_to_string` to ensure
   that -0.0 gets coerced to "0".

For more info, see:
rust-lang/rfcs#1074
Herschel added a commit to ruffle-rs/ruffle that referenced this issue Apr 15, 2021
@ceronman

ceronman commented Nov 7, 2021

As of rustc 1.56.1 (59eed8a2a 2021-11-01), this seems to be fixed already.

Both println!("{}", -0.0) and println!("{:?}", -0.0) preserve the sign.

Perhaps this could be closed now?

dtolnay closed this as completed Nov 7, 2021
@lilyball
Contributor

lilyball commented Nov 8, 2021

For those following along, this was fixed in rust-lang/rust#78618 in version 1.53.0. The release notes said

{f32, f64}::from_str now parse and print special values (NaN, -0) according to IEEE 754.

Referencing from_str specifically and not the types themselves is a bit confusing, but it's referring to this behavior.
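For anyone wanting to see the parsing side of that note, here is a small sketch. `parse_f64` is a hypothetical convenience wrapper around `str::parse`; the spellings are the ones mentioned in the commit messages referenced above, and the results assume a toolchain that includes the change (older ones reject some of these strings).

```rust
/// Hypothetical wrapper: parse a string as f64, None on failure.
fn parse_f64(s: &str) -> Option<f64> {
    s.parse::<f64>().ok()
}

fn main() {
    for s in &["-0", "inf", "-Infinity", "+INF", "NaN", "nan"] {
        println!("{:>10} -> {:?}", s, parse_f64(s));
    }

    // The sign of zero survives parsing.
    assert!(parse_f64("-0").map_or(false, |z| z == 0.0 && z.is_sign_negative()));
    // Infinity spellings are accepted case-insensitively, with a sign.
    assert_eq!(parse_f64("-Infinity"), Some(f64::NEG_INFINITY));
    assert_eq!(parse_f64("+INF"), Some(f64::INFINITY));
    // NaN never compares equal, so test with is_nan.
    assert!(parse_f64("nan").map_or(false, f64::is_nan));
}
```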

HenrySwanson added a commit to HenrySwanson/rust-lox that referenced this issue Dec 24, 2021
Also, apparently updating Rust fixes my last test! Can't commit that
change but w/e, I'm happy anyways :)

rust-lang/rfcs#1074