-
Notifications
You must be signed in to change notification settings - Fork 12.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LangRef] Specify NaN behavior more precisely #66579
Conversation
@llvm/pr-subscribers-llvm-ir ChangesThe current docs don't say anything about the possible set of values returned by a NaN-producing operation. However, there are some coding patterns where such guarantees are strongly desired, such as NaN boxing (as used e.g. in the Spidermonkey JS engine). And in fact the set of possible payloads that can be observed today when compiling code via LLVM is fairly limited (based on what optimizations and backends and hardware currently do) -- but without a documented guarantee, languages compiling to LLVM cannot really rely on this. There was a long discussion about the exact shape that the guarantees should take on Discourse. This PR transcribes what I perceived as the consensus, plus some further clarifications. The point I am least sure about is the status of the min/max operations on x86: the operations provided by hardware do not match the IEEE spec, so the backend has to emit a sequence of instructions that implement the IEEE operations; the bitwise operations used by the backend to achieve this could run afoul of the NaN guarantees. Here @jyknight stated that what the backend does might be compatible with the NaN guarantee -- but they didn't say that it affirmatively is compatible, so it would be great if someone with knowledge of the x86 backend could confirm or deny this.Full diff: https://github.com/llvm/llvm-project/pull/66579.diff 1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index f542e70bcfee810..021e7ba2fb41722 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3394,17 +3394,41 @@ Floating-Point Environment
The default LLVM floating-point environment assumes that traps are disabled and
status flags are not observable. Therefore, floating-point math operations do
not have side effects and may be speculated freely. Results assume the
-round-to-nearest rounding mode.
+round-to-nearest rounding mode, and subnormals are assumed to be preserved.
+Running default LLVM code in an environment where these assumptions are not met
+can lead to undefined behavior.
+
+The representation bits of a floating-point value do not mutate arbitrarily; if
+there is no floating-point operation being performed, the NaN payload (if any)
+is preserved.
+
+When a floating-point math operation produces a NaN value, the result has a
+non-deterministic sign. The payload is non-deterministically chosen from the
+following set:
+
+- The payload that is all-zero except that the ``quiet`` bit is set.
+ ("Preferred NaN" case)
+- The payload of any input operand that is a NaN, bit-wise ORed with a payload that has
+ the ``quiet`` bit set. ("Quieting NaN propagation" case)
+- The payload of any input operand that is a NaN. ("Unchanged NaN propagation" case)
+- A target-specific set of further NaN payloads, that definitely all have their
+ ``quiet`` bit set. The set can depend on the payloads of the input NaNs.
+ This set is empty on x86 and ARM, but can be non-empty on other architectures.
+ (For instance, on wasm, if any input NaN is not the preferred NaN, then
+ this set contains all quiet NaNs; otherwise, it is empty.
+ On SPARC, this set consists of the all-one payload.)
+
+In particular, if all input NaNs are quiet, then the output NaN is definitely
+quiet. Signaling NaN outputs can only occur if they are provided as an input
+value. For example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN.
Floating-point math operations are allowed to treat all NaNs as if they were
-quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0. This also
-means that SNaN may be passed through a math operation without quieting. For
-example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN. However,
-SNaN values are never created by math operations. They may only occur when
-provided as a program input value.
+quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
+In particular, constrained intrinsics rule out the "Unchanged NaN propagation" case;
+they are guaranteed to return a QNaN.
.. _fastmath:
|
llvm/docs/LangRef.rst
Outdated
|
||
The representation bits of a floating-point value do not mutate arbitrarily; if | ||
there is no floating-point operation being performed, the NaN payload (if any) | ||
is preserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
floating point bits do mutate arbitrarily on 32-bit x86:
here, all I do is bitcast 0x7f800001
to float
, return it from a function, then bitcast back to an integer -- that changes it to 0x7fc00001
.
https://clang.godbolt.org/z/sjaq55q3G
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a bug in the x86-32 backend / calling convention.
Given that this bug is hard-coded in the calling convention, I wonder if it should be called out here. But the main point of this sentence is that LLVM itself will not do anything that would mutate FP values.
In fact I assume one can get LLVM to miscompile code by assuming that return values don't change their bits, similar to #44218.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we should at least mention the major ways in which LLVM backends violate the spec:
- x86-32 with SSE2 enabled may implicitly convert floating-point values returned from a function to x86_fp80 and back; such conversions may signal a floating-point exception, or modify the value.
- x86-32 without SSE2 may do such conversions just about anywhere.
- older MIPS versions use the opposite polarity for the sNaN vs qNaN bit, and LLVM does not correctly represent this (this is at least potentially fixable, unlike the above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a paragraph with backend problems and linked to relevant issues. Is there an issue for the "x86-32 return value problem"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware of one in LLVM, only in rust, so I filed a new one, #66803.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is worth calling out somewhere that qNaN/sNaN are defined using the convention that leading bit = 0 is sNaN, 1 is qNaN, and systems that do not adhere to this convention (e.g., old MIPS) are unsupported.
Also probably worth calling out that things like bitcast
do not count as floating-point operations.
llvm/docs/LangRef.rst
Outdated
@@ -3394,17 +3394,41 @@ Floating-Point Environment | |||
The default LLVM floating-point environment assumes that traps are disabled and | |||
status flags are not observable. Therefore, floating-point math operations do | |||
not have side effects and may be speculated freely. Results assume the | |||
round-to-nearest rounding mode. | |||
round-to-nearest rounding mode, and subnormals are assumed to be preserved. | |||
Running default LLVM code in an environment where these assumptions are not met |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"default LLVM code" is not a well-defined phrase :-)
You should explicitly refer to the role of the strictfp
attribute here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mentioning strictfp
now, I hope what I am saying makes sense. :)
llvm/docs/LangRef.rst
Outdated
there is no floating-point operation being performed, the NaN payload (if any) | ||
is preserved. | ||
|
||
When a floating-point math operation produces a NaN value, the result has a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prepend "Unless otherwise specified here", and add notes to fneg
, llvm.fabs
, and llvm.copysign
that they are guaranteed to not affect the NaN payload or qNaN/sNaN status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those 3 shouldn't be considered floating point math operations (as the term is used here and before). We definitely need to make it clear in their definitions -- it has implications beyond just the NaN behavior.
In particular, they all ought to be specified as: "This operation never raises a floating-point exception, and the result is an exact copy of the input, other than the sign bit. It is not considered a floating-point math operation, but rather, bit-manipulation which operates on floating-point values."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The paragraph above says that FP exceptions cannot be observed anyway, so it seems confusing/misleading to now talk about some operations not raising FP exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am explicitly calling out bitcasts and fnet, fabs, copysign as not being affected by the NaN non-determinism now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some text added to the specification sections for the fneg/fabs/copysign operations. Noting there that exceptions cannot be signaled (in addition to them only touching the sign bit) is meaningful -- it means that you may use them in a "strictfp" function. Unlike other operations, they do not need a "constrained" variant, because their behavior is fully defined regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still add "unless otherwise specified", because you have cases like llvm.canonicalize
which is an FP operation that can arbitrarily mutate NaN payloads but requires the output to be qNaN, even if the input is sNaN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. That can then also cover fmin/fmax, should we need to add an exception for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some text added to the specification sections for the fneg/fabs/copysign operations.
Done.
…ed; add list of known-broken cases
d4a0c76
to
68a32c0
Compare
.. _floatnan: | ||
|
||
Behavior of Floating-Point NaN values | ||
------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't really about the "FP environment" any more so I moved this all into a new section on NaN values.
Regarding fmin/fmax on x86, @Muon pointed out that LLVM currently simply calls the underlying libm functions to implement these operations. The C standard has nothing to say about the case where both inputs are a NaN. My local manpage says
This is not very specific and could permit arbitrary NaN payloads. I don't know if that actually happens in practice though. So I'm still unsure about whether we need to say that fmin/fmax on x86 can produce arbitrary NaN payloads. LLVM assumes things about I feel like for now it is better to not make a guarantee that we aren't sure can be held up. Though if LLVM just uses libm, couldn't this cause other NaNs even on non-x86? |
68a32c0
to
7029109
Compare
llvm/docs/LangRef.rst
Outdated
there is no floating-point operation being performed, the NaN payload (if any) | ||
is preserved. | ||
|
||
When a floating-point math operation produces a NaN value, the result has a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still add "unless otherwise specified", because you have cases like llvm.canonicalize
which is an FP operation that can arbitrarily mutate NaN payloads but requires the output to be qNaN, even if the input is sNaN.
FWIW, I'm trying to implement this new semantics in Alive2. I'll report back the changes in failed/passed tests later. |
…what this achieves
I've a question regarding fpext & fptrunc: what's the payload we want there? We can't just copy the inputs. |
Ok, I did the easy thing, which was to ignore the rules about copying the NaN payload from the input (in Alive2). They are: ; Transforms/InstCombine/maximum.ll
define float @reduce_precision(float %x, float %y) {
%x.ext = fpext float %x to double
%y.ext = fpext float %y to double
%maximum = fmaximum double %x.ext, %y.ext
%trunc = fptrunc double %maximum to float
ret float %trunc
}
=>
define float @reduce_precision(float %x, float %y) {
%maximum1 = fmaximum float %x, %y
ret float %maximum1
}
Transformation doesn't verify!
ERROR: Value mismatch
Example:
float %x = #x00000000 (+0.0)
float %y = #x7f808000 (SNaN)
Source:
double %x.ext = #x0000000000000000 (+0.0)
double %y.ext = #x7ff8000000000000 (QNaN)
double %maximum = #x7ff8000000000000 (QNaN)
float %trunc = #x7fc00000 (QNaN)
Target:
float %maximum1 = #x7f808000 (SNaN)
Source value: #x7fc00000 (QNaN)
Target value: #x7f808000 (SNaN) ; Transforms/InstCombine/select-binop-foldable-floating-point.ll
define float @select_fsub(i1 %cond, float %A, float %B) {
%C = fsub float %A, %B
%D = select i1 %cond, float %C, float %A
ret float %D
}
=>
define float @select_fsub(i1 %cond, float %A, float %B) {
%C = select i1 %cond, float %B, float 0.000000
%D = fsub float %A, %C
ret float %D
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined
Example:
i1 %cond = #x0 (0)
float %A = #x7f804000 (SNaN)
float %B = poison
Source:
float %C = poison
float %D = #x7f804000 (SNaN)
Target:
float %C = #x00000000 (+0.0)
float %D = #x7fd00000 (QNaN)
Source value: #x7f804000 (SNaN)
Target value: #x7fd00000 (QNaN) ; Transforms/InstCombine/fpcast.ll
define half @test4-fast(float %a) {
%b = fsub fast float -0.000000, %a
%c = fptrunc float %b to half
ret half %c
}
=>
define half @test4-fast(float %a) {
%1 = fptrunc float %a to half
%c = fneg fast half %1
ret half %c
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source
Example:
float %a = #xd8804000 (-1128098930098176)
Source:
float %b = #x7fc00000 (QNaN)
half %c = #x7e00 (QNaN)
Target:
half %1 = #xfc00 (-oo)
half %c = poison
Source value: #x7e00 (QNaN)
Target value: poison ; Transforms/InstCombine/fpextend.ll
define float @test4(float %x) {
%t1 = fpext float %x to double
%t2 = fsub double -0.000000, %t1
%t34 = fptrunc double %t2 to float
ret float %t34
}
=>
define float @test4(float %x) {
%t34 = fneg float %x
ret float %t34
}
Transformation doesn't verify! (unsound)
ERROR: Value mismatch
Example:
float %x = #xff829ffe (SNaN)
Source:
double %t1 = #x7ff8000000000000 (QNaN)
double %t2 = #x7ff8000000000000 (QNaN)
float %t34 = #x7fc00000 (QNaN)
Target:
float %t34 = #x7f829ffe (SNaN)
Source value: #x7fc00000 (QNaN)
Target value: #x7f829ffe (SNaN) |
We need a fix for these instructions, so that SNaN can passthrough. |
The natural way to implement this is to designate certain bits to chop, but I don't know what the full suite of implementations here is. In x86, the afflicted bits are the LSB, so |
I've implemented what you suggested and the 4 cases above now verify. |
I'm pretty sure the significand lives in the least significant bits on every single architecture. For format conversions I'm pretty sure everyone just truncates/zero extends. EDIT: I mean only for NaN handling. |
IEEE754 specifies fp->fp format conversions should use the programmer-chosen rounding mode, for LLVM for f32 <-> f64 that's always round to nearest, ties to even |
Ah, I meant for NaN handling. |
So for fpext/fptruncate, what does this mean? Should we add something like this?
(I wasn't sure how to best say "zero-extend on the right", since normal zero-extension happens on the left.)
Or is that not correct and the payload may be reset to the preferred payload as well? |
IIRC RISC-V converts NaN payloads to the canonical qNaN when converting between fp formats. |
So, like this?
Though maybe it's easier to just say
Or even
|
The section "What happens to the payload when the precision is converted?" in This 2018 paper claimed that all known implementations preserve the upper NaN bits when truncating, and append zeros when extending. Of course, that's only the case for the ISAs which propagate NaN payloads at all (so, not RISC-V and not WebAssembly). On RISCV, the sign is always positive and the NaN canonical. On Wasm, the sign is non-deterministic, and the payload follows Wasm's usual "nondeterministic result if given non-canonical input NaNs, canonical NaN with nondeterministic sign otherwise" rule. Given that these are the exact same issues that arise for all other FP instructions, I think these instructions should use our normal rules, just clarified to say how the bits are copied if they happen to be copied. Thus, I suggest the spec should say something like this for
And for
|
I have adjusted this a bit to clarify what is mean by "if a NaN payload is copied from the input", and pushed that to the PR. |
@nikic @jcranmer-intel @efriedma-quic what are the next steps towards landing this PR? |
It's approved; just click the button "squash and merge" (or ask us to do it for you) 🚀 |
A bunch of changes happened since it got approved though. I'm not familiar with how LLVM handles that situation. Also I don't have that button. :) |
If you think the changes are significant enough to require another review, please point out the specific parts. Otherwise, we can consider the patch ready.
From the developer policy: "If you do not have commit access, please let people know during the review and someone should commit it on your behalf." |
Before we merge, though: can you rewrite the commit description (by editing the title&first comment) to accurately describe the current situation? Thx! |
I find it hard to judge whether the reviewers would consider the changes significant enough. I am surprised that as an outsider to the project, I am expected to make such a call. I was about to produce a summary of the changes since the approval, but thankfully @jyknight was faster. :)
I've updated the PR description. The PR title still seems fine? Should I squash the commits? |
Thanks. I tweaked the title slightly.
No, you shouldn't -- squash happens during the merge (via the button labeled "Squash and Merge"). I will press said button now. |
Great, thanks a lot. :) |
Follow-up to #66579: while implementing those semantics in Miri I realized there's a special case to be considered in truncating float casts.
The current docs don't say anything about the possible set of values returned by a NaN-producing operation. However, there are some coding patterns where such guarantees are strongly desired, such as NaN boxing (as used e.g. in the Spidermonkey JS engine). And in fact the set of possible payloads that can be observed today when compiling code via LLVM is fairly limited (based on what optimizations and backends and hardware currently do) -- but without a documented guarantee, languages compiling to LLVM cannot really rely on this.
There was a long discussion about the exact shape that the guarantees should take on Discourse. This PR transcribes what I perceived as the consensus, plus some further clarifications: "unless specified otherwise", returned NaNs have a non-deterministic sign, and the payload is generally either the preferred (all-zero) payload or propagates the payload of one of the inputs.