Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debug-print negative not-a-number as "-NaN" #54235

Closed
wants to merge 1 commit into from

Conversation

scottmcm
Copy link
Member

5.12.1 allows "with an optional preceding sign" for nans

cc @rkruppe, #53938 (comment)

Posted for "is this a bad idea?" and travis right now. I want to add some doc text and tests about it before it's ready to be merged.

5.12.1 allows "with an optional preceding sign" for nans
@rust-highfive
Copy link
Collaborator

r? @aidanhs

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 14, 2018
@hanna-kruppe
Copy link
Contributor

I'm not sure this is a good idea. The benefit is small because the sign of a NaN is even less important than the sign of zero. For example, 1/0.0 == INFINITY but 1/-0.0 == NEG_INFINITY but none of the arithmetic operations interpret the sign of a NaN (at most it's propagated along with the payload). What's worse, I don't believe LLVM guarantees it preserves the sign of NaNs (just as it doesn't do for signaling bit and payload), so a negative NaN in debug output doesn't have to bear any relationship to the sign that the code otherwise sees.

Additionally, there are potential costs: it could cause some confusion, and like any other formatting change it's risky because third party code may not be prepared to handle it – e.g., there might be code comparing the formatting result to the string "NaN".

@leonardo-m
Copy link

What was the original purpose of giving a sign to NaNs? What's the cost of not printing their sign?

@hanna-kruppe
Copy link
Contributor

What was the original purpose of giving a sign to NaNs?

I don't know the IEEE 754 working group's intent, but if I had to guess I'd say it is just a byproduct of the sign-magnitude encoding, it's not used for anything and in fact explicitly unspecified in most cases. As §6.3 states, the standard "does not interpret the sign of a NaN" except when it comes to "operations on bit strings".

What's the cost of not printing their sign?

Well, people won't see in their debugging output that a NaN has a negative sign bit. As the sign bit is almost entirely meaningless for NaNs (see earlier comment), I can't really imagine this ever making a difference, but you never know.

@scottmcm
Copy link
Member Author

Some do seem to use it; like (-NAN).is_sign_negative() is true. I also noticed that we parse "-NaN" as something with a sign bit, so producing it again in Debug seems logical to me.

(I definitely agree with not having signs on NaN in Display.)

@hanna-kruppe
Copy link
Contributor

Some do seem to use it; like (-NAN).is_sign_negative() is true.

Yes, sorry if I wasn't clear: I only meant it's irrelevant even more often than signed zeroes, not that it never matters. The standard also names negate, abs and copySign as operations that interpret the sign, though for abs that just refers to the result (the sign bit is cleared even for NaN inputs) and I don't think we have a general copySign function in Rust. And then there's to_bits, of course.

I also noticed that we parse "-NaN" as something with a sign bit

Huh, if you told me that yesterday I would have said it's a bug (and I wrote that code!). I am reasonably sure it's unintentional and we don't have tests for it, though of course we can't just change it now.

@nagisa
Copy link
Member

nagisa commented Sep 15, 2018

Some do seem to use it; like (-NAN).is_sign_negative() is true.

is_sign_negative corresponds to isSignMinus in the standard which is a bit-string operation (it gives answers about the binary representation of the floating point number, not the number itself).

@nagisa
Copy link
Member

nagisa commented Sep 15, 2018

I do not have an opinion in terms of what the Debug impl should do, though. On one hand more information is good, on the other hand if we are printing the sign, why not the payload as well?

@scottmcm
Copy link
Member Author

on the other hand if we are printing the sign, why not the payload as well?

Because these are different paragraphs in 5.12.1:

Conversion of a quiet NaN in a supported format to an external character sequence shall produce a language-defined one of “nan” or a sequence that is equivalent except for case ( e.g., “NaN”), with an optional preceding sign. (This standard does not interpret the sign of a NaN.)

Language standards should provide an optional conversion of NaNs in a supported format to external character sequences which appends to the basic NaN character sequences a suffix that can represent the NaN payload (see 6.2). The form and interpretation of the payload suffix is language-defined. The language standard shall require that any such optional output sequences be accepted as input in conversion of external character sequences to supported formats.

@XAMPPRocky
Copy link
Member

Triage; @aidanhs Hello, are you able to review this PR?

@TimNN
Copy link
Contributor

TimNN commented Oct 16, 2018

Ping from triage @rkruppe / @scottmcm / @rust-lang/libs: This PR seems to have somewhat stalled. Could you comment on how you think this can move forward?

(Also, would it make sense to have a crater run to see how many crates rely on the formatting of NaN in their tests, even though they probably shouldn't?)

@alexcrichton
Copy link
Member

This seems like a good implementation to me in that Debug should strive to provide as much contextual information to understand the output that it can. I'm not clear though on why we wouldn't want to merge this?

@hanna-kruppe
Copy link
Contributor

As discussed up-thread, although the sign of a NaN is not completely unobservable, it is almost so (only via is_sign_{positive,negative} and to_bits, and the latter also exposes the payload which this PR doesn't include in Debug). I've seen bugs caused by the distinction between negative and positive zero, but I haven't seen, and can't imagine, bugs caused solely by the sign of a NaN differing. To me this makes the expected benefit of including the sign in debugging output so minuscle that it's it not worth the risk of breaking user code (crater can give some confidence that doesn't happen, but it doesn't have perfect coverage).

@dtolnay
Copy link
Member

dtolnay commented Oct 16, 2018

I would lean toward not doing this. Debug is for debugging and it's not clear that "-NaN" would aid anyone in debugging anything.

@hanna-kruppe
Copy link
Contributor

As @dtolnay found out (#55131) we do not even produce consistent or predictable signs on NaNs, which is another reason to not show the sign IMO – the output may even be misleading, not matching the values that are actually used for the computations you're trying to debug!

@scottmcm
Copy link
Member Author

I opened this after working on the float total ordering PR, where it looked weird to have some NaNs on one side of the actual numbers, and some on the other side. Basically, the more functions we add that do care about signs on NaNs, the more reasonable this is.

But I'm not all that attached to it; if libs decides it's better closed that's fine by me.

@alexcrichton
Copy link
Member

I personally ran into this awhile ago when working with spec tests for wasm, where the spec verifies actual bit patterns of floats and requires you to produce negative NaN in some situations, and then verifies that.

I don't really understand the harm here in printing a - sign out in front so I'd be inclined to merge. I agree it's rare that this would need to be cared about but this is a debug representation, and we shouldn't be hiding information that "isn't used most of the time" but rather providing all the information necessary to debug.

@sfackler
Copy link
Member

Should we also be including the signal bit and payload?

@alexcrichton
Copy link
Member

While I don't think we'd go out of our way to implement that I wouldn't personally reject a PR doing so. There's a canonical encoding for NaN that almost everything uses which we could render as NaN, and then alternate encodings could use more verbose syntax indicating the various other bits

@hanna-kruppe
Copy link
Contributor

@sfackler If we were to decide we want to truly include all information that we possibly can, then yes, certainly the signaling bit and payload should be included as well -- they too are observable using to_bits, as that would be the only consistent choice. However, it would exabercate the concern that it breaks consumers of the Debug output, since we'd have to write something like -sNaN(0x123...) which will break even parsers that consume the sign before matching the string against "NaN" (and therefore would be fine with "-NaN").


@alexcrichton Thank you for pointing out that debugging output could also be used while tracking down a difference in float bit patterns rather than float numerical values or program behavior caused by numerical values. Keep in mind though that in those cases the sign bit is just a small part of the picture and the programmer also needs to see the signaling bit and payload, so I'm a bit surprised to see you apparently less enthusiastic about including that information than about including the sign.

Still, I think you're not doing my position justice when you summarize it as just "isn't used most of the time". It's not that the sign bit of a NaN can't be relevant for debugging, clearly it can be, it's just that this benefit is real but very small so it does not outweigh the (also small) risk of breaking user code. People implement their own string -> float conversions, they compare Debug output in integration and unit tests, they write values to logs that they may then scan with regexes, etc. which makes any sort of change to the Debug format carry a risk of breaking user's code. While I haven't looked and found an actual instance of it, it seems plausible enough for me to count as a small but non-zero mark against printing "-NaN".

If this change were proposed pre-1.0, a little breakage would not be a concern and I would welcome this PR, but at this point in time tweaking the Debug output format doesn't seem worth risking for the (real, but extremely rare) improvement in debugging experience to me. Ultimately this is a judgement call and I'm happy to let the libs team make it, I just want to make sure we all agree about what things are on each side of the scale. From your comments here so far I am not sure whether you've fully recognized and weighed everything I see on the scale. If you have and just see the scales tip the other way, great.

@alexcrichton
Copy link
Member

Sorry I don't mean to discount anything. If we want to close this I'm all for that, I was just trying to provide a counterpoint. I don't really feel that strongly about this, and seems like others want to close.

@rfcbot fcp close

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Oct 20, 2018
@alexcrichton
Copy link
Member

@rfcbot fcp close

@rfcbot
Copy link

rfcbot commented Oct 20, 2018

Team member @alexcrichton has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. labels Oct 20, 2018
@Kimundi
Copy link
Member

Kimundi commented Oct 25, 2018

Hang on, I might have missed something in the discussion above, but haven't we always said that the exact output of a Debug impl is something you should not rely on anyway? We shouldn't change it arbitrarily, sure, but treating it as stable also seems weird.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Oct 27, 2018
@rfcbot
Copy link

rfcbot commented Oct 27, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Oct 27, 2018
@dtolnay
Copy link
Member

dtolnay commented Oct 27, 2018

Correct, we are free to change Debug representations including this one if we wanted to.

@alexcrichton
Copy link
Member

Ok I'm gonna close this for now, but as mentioned above by @Kimundi and @dtolnay I think it's still in the cards if we want it later!

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Nov 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.