Allow floating-point operations to provide extra precision than specified, as an optimization #2686
Changes from all commits
c911f31
7ddfbcd
f9fffdf
047dea6
ce7d876
349c711
d78aa75
57e4331
b699bc5
84734aa
9db70f2
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,192 @@ | ||
- Feature Name: `allow_extra_fp_precision` | ||
- Start Date: 2019-04-08 | ||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Update the Rust specification to allow floating-point operations to provide | ||
*more* precision than specified, but not less precision; this allows many safe | ||
optimizations. Specify robust mechanisms to disable this behavior. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Some platforms provide instructions to run a series of floating-point | ||
operations quickly, such as fused multiply-add instructions; using these | ||
instructions can provide performance wins up to 2x or more. These instructions | ||
may provide *more* precision than required by IEEE floating-point operations, | ||
such as by doing multiple operations before rounding or losing precision. | ||
Similarly, high-performance floating-point code could perform multiple | ||
operations with higher-precision floating-point registers before converting | ||
back to a lower-precision format. | ||
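
To make the fused multiply-add case concrete, here is a minimal sketch comparing two separately rounded operations with the existing `f64::mul_add` method, which rounds only once. The function names and the chosen inputs are illustrative assumptions, not part of the RFC text.

```rust
/// Multiply, round to f64, then add and round again (two roundings on
/// targets that evaluate each f64 operation in f64).
fn separate_mul_add(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// Fused multiply-add: `a * b + c` computed with a single rounding.
fn fused_mul_add(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Hypothetical inputs chosen so that the two forms give different results.
fn demo_inputs() -> (f64, f64, f64) {
    let eps = 1.0 / ((1u64 << 27) as f64); // 2^-27, exactly representable
    let a = 1.0 + eps; // 1 + 2^-27
    // a * a is exactly 1 + 2^-26 + 2^-54; rounding the product to f64
    // drops the 2^-54 term, which is a quarter of an ulp.
    let c = -(1.0 + 2.0 * eps); // -(1 + 2^-26)
    (a, a, c)
}
```

With these inputs the separately rounded form loses the low-order term of the product entirely, while the fused form keeps it.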
|
||
In general, providing more precision than required will only bring a | ||
calculation closer to the mathematically precise answer, never further away. | ||
|
||
This RFC proposes allowing floating-point types to perform intermediate | ||
calculations using more precision than the type itself, as long as they provide | ||
*at least* as much precision as the IEEE 754 standard requires. | ||
|
||
See the [prior art section](#prior-art) for precedent in several other | ||
languages and compilers. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
Floating-point operations in Rust have a guaranteed minimum accuracy, which | ||
specifies how far the result may differ from an infinitely accurate, | ||
mathematically exact answer. The implementation of Rust for any target platform | ||
must provide at least that much accuracy. In some cases, Rust can perform | ||
operations with higher accuracy than required, and doing so provides greater | ||
performance (such as by removing intermediate rounding steps). | ||
|
||
A note for users of other languages: this is *not* the equivalent of the "fast | ||
math" option provided by some compilers. Unlike such options, this behavior | ||
will never make any floating-point operation *less* accurate, but it can make | ||
floating-point operations *more* accurate, making the result closer to the | ||
mathematically exact answer. | ||
Comment on lines +45 to +49:
Given #2686 (comment), I think this statement should be removed as it is incorrect -- or else there should be an argument for how we plan to guarantee that we never make things less accurate. |
||
|
||
joshtriplett marked this conversation as resolved.
|
||
Due to differences in hardware, in platform libm implementations, and various | ||
other factors, Rust cannot fully guarantee identical results on all target | ||
platforms. (Doing so on *all* platforms would incur a massive performance | ||
loss.) However, with some additional care, applications desiring cross-platform | ||
identical results can potentially achieve that on multiple target platforms. In | ||
particular, applications prioritizing identical, portable results across two or | ||
more target platforms can disable extra floating-point precision entirely. | ||
As I mentioned in a previous comment, mere reproducibility is not always the reason to disable this behaviour. Some algorithms can actually take advantage of the |
||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
Currently, Rust's [specification for floating-point | ||
types](https://doc.rust-lang.org/reference/types/numeric.html#floating-point-types) | ||
states only that: | ||
> The IEEE 754-2008 "binary32" and "binary64" floating-point types are f32 and f64, respectively. | ||
Shall this be understood as "the layout of The IEEE-754:2008 standard is very clear that optimizations like replacing |
||
|
||
This RFC proposes updating that definition as follows: | ||
|
||
The `f32` and `f64` types represent the IEEE 754-2008 "binary32" and "binary64" | ||
floating-point types. Operations on those types must provide at least as much | ||
precision as the IEEE standard requires; such operations may provide *more* | ||
precision than the standard requires, such as by doing a series of operations | ||
with higher precision before storing a value of the desired precision. | ||
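
As a rough illustration of "a series of operations with higher precision before storing a value of the desired precision", the sketch below adds three `f32` values in two ways: once with rounding to `f32` after every addition, and once in `f64` with a single final rounding. The function names are hypothetical, and the `f64` intermediate merely stands in for whatever wider format a target might use.

```rust
/// Adds `y` and then `z` to `x`, rounding to f32 after every operation.
fn sum_rounded_each_step(x: f32, y: f32, z: f32) -> f32 {
    (x + y) + z
}

/// Same sum, carried out in f64 and rounded to f32 once at the end.
/// This models an implementation that keeps intermediates in wider registers.
fn sum_rounded_once(x: f32, y: f32, z: f32) -> f32 {
    ((f64::from(x) + f64::from(y)) + f64::from(z)) as f32
}
```

For example, with `x = 16_777_216.0` (2^24) and `y = z = 1.0`, each individual `f32` addition rounds back to 2^24, whereas the wide computation reaches 16 777 218, which is representable in `f32`.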
|
||
rustc should provide a codegen (`-C`) option to disable this behavior, such as | ||
`-C extra-fp-precision=off`; compiling with this option will disable extra | ||
precision in all crates compiled into an application. (Cargo should provide a | ||
means of specifying this option.) Rust should also provide attributes to | ||
disable this behavior from within code, such as `#[extra_fp_precision(off)]`; | ||
this attribute will disable extra precision within the module or function it is | ||
applied to. On platforms that do not currently implement disabling extra | ||
precision, the codegen option and attribute should produce an error (not a | ||
warning), to avoid surprises. | ||
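
A sketch of code that might want the opt-out follows. The attribute spelling is the one proposed in this RFC; it appears only as a comment because this sketch assumes current compilers do not accept it. The function names are hypothetical, and the second function shows just one possible contraction a compiler could choose.

```rust
// Proposed syntax from the RFC text, shown as a comment only:
// #[extra_fp_precision(off)]
fn difference_of_products(a: f64, b: f64, c: f64, d: f64) -> f64 {
    // With every operation rounded separately, identical products cancel exactly.
    a * b - c * d
}

/// One way a contracting compiler could evaluate the same expression:
/// `c * d` is rounded, while `a * b` feeds into the fused operation unrounded.
fn difference_of_products_contracted(a: f64, b: f64, c: f64, d: f64) -> f64 {
    a.mul_add(b, -(c * d))
}
```

Calling both with `(0.1, 0.1, 0.1, 0.1)` gives exactly `0.0` for the first form and a small nonzero value (the rounding error of `0.1 * 0.1`) for the second. Under the proposal, the same effect would be available crate-wide through the proposed `-C extra-fp-precision=off` option.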
|
||
In addition, because this change makes extra floating-point precision visible | ||
on more platforms, the Rust release notes, documentation, and similar channels | ||
should explicitly discuss the issue of extra floating-point precision and how | ||
to disable it. Furthermore, this change should not become part of a stable Rust | ||
release until at least eight stable releases *after* it first becomes | ||
implemented in the nightly compiler. | ||
I'm not sure I understand the point of this last sentence. And particularly, why is the reference point the first availability in nightly? I think it would be more useful to guarantee that the optimisations will not be enabled by default on stable until the opt-out has been available as a no-op for a few stable releases. |
||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
If Rust already provided bit-for-bit identical floating-point computations | ||
across platforms, then this change could potentially allow floating-point | ||
computations to differ (in the amount of additional accuracy beyond the | ||
standard's requirements) by platform, enabled target features (e.g. instruction | ||
sets), or optimization level. | ||
|
||
However, standards-compliant implementations of operations on floating-point | ||
values can and do *already* vary slightly by platform, sufficiently so to | ||
produce different binary results; in particular, floating-point operations in | ||
Rust can already produce more precise results depending on target platform, | ||
optimization level, the target's libm library, and the version of the target | ||
libm. As with that existing behavior, this proposal can never make results | ||
*less* accurate; it can only make results *more* accurate. Nonetheless, this | ||
change potentially introduces such variations on target platforms that did not | ||
previously have them. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
For the attribute and codegen option, we could allow code to opt in via | ||
attribute even if disabled via codegen, and then provide a `force-off` codegen | ||
option to override that. This would have two serious downsides, however: it | ||
would propagate the perception of extra floating-point precision as an unsafe | ||
optimization that requires opting into, and it would make life more difficult | ||
for people who wish to opt out of this behavior and attempt to achieve | ||
identical results on multiple target platforms. This RFC recommends the simpler | ||
approach of not providing an enablement option via attribute, such that the | ||
codegen option always force-disables extra precision everywhere. | ||
|
||
We could provide an option to enable extra accuracy for the default | ||
floating-point types, but disable it by default. This would leave the majority | ||
of floating-point code unable to use these optimizations, however; defaults | ||
matter, and the majority of code seems likely to use the defaults. In addition, | ||
permitting extra floating-point precision by default would match the existing | ||
behavior of Rust, and would allow the Rust compiler to assume that code | ||
explicitly disabling extra precision has a specific requirement to do so and | ||
depends on that behavior. Nonetheless, this alternative would still provide the | ||
option to produce more optimized code, making it preferable over doing nothing. | ||
This alternative would necessitate respecifying the codegen option and | ||
attribute to support enabling it, as well as having a force-off codegen option | ||
to override enablement via the attribute. | ||
|
||
We could provide a separate set of types and allow extra accuracy in their | ||
operations; however, this would create API incompatibilities between | ||
floating-point functions, and the longer, less-well-known types seem unlikely | ||
to see widespread use. Furthermore, allowing or disallowing extra accuracy | ||
seems more closely a property of the calculation than a property of the type. | ||
|
||
We could provide additional methods for floating-point operations that allow | ||
passing additional flags, including floating-point contraction. The compiler | ||
could then fuse and otherwise optimize such operations. However, this would | ||
make optimized floating-point code *substantially* less ergonomic, due to the | ||
inability to use operators. To enable operators, we could additionally | ||
implement wrapper types, as above, with the same upsides and downsides. | ||
|
||
We could do nothing, and require code to use `a.mul_add(b, c)` for | ||
optimization; however, this would not allow for similar future optimizations, | ||
and would not allow code to easily enable this optimization without substantial | ||
code changes. | ||
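
To give a sense of the code changes the "do nothing" alternative implies, here is a minimal sketch of Horner polynomial evaluation written once with operators and once rewritten by hand around `mul_add`. The function names and the ascending coefficient order are assumptions made for this example.

```rust
/// Evaluates c[0] + c[1]*x + c[2]*x^2 + ... with ordinary operators.
fn horner(coeffs: &[f64], x: f64) -> f64 {
    coeffs.iter().rev().fold(0.0_f64, |acc, &c| acc * x + c)
}

/// The same evaluation, manually rewritten to request fused multiply-add.
fn horner_fused(coeffs: &[f64], x: f64) -> f64 {
    coeffs.iter().rev().fold(0.0_f64, |acc, &c| acc.mul_add(x, c))
}
```

The rewrite is small here, but it has to be repeated at every multiply-add site, and it cannot be applied to code in dependencies.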
We could provide a clippy lint that recognizes
On this particular point a clippy lint is helpful but not necessarily enough. Once the optimizer chews through layers of code it can end up at an |
||
|
||
We could narrow the scope of optimization opportunities to *only* include | ||
floating-point contraction but not any other precision-increasing operations. | ||
See the [future possibilities](#future-possibilities) section for further | ||
discussion on this point. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
This has precedent in several other languages and compilers: | ||
|
||
- [C11](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) allows | ||
extra floating-point precision with the `STDC FP_CONTRACT` pragma enabled, | ||
and the default state of that pragma is implementation-defined. GCC, ICC, | ||
MSVC, and some other C compilers enable this behavior by default; Clang | ||
disables it by default, though some downstream users of Clang re-enable it | ||
system-wide. | ||
|
||
- [The C++ standard](http://eel.is/c++draft/expr.pre#6) states that "The | ||
values of the floating operands and the results of floating | ||
expressions may be represented in greater precision and range than | ||
that required by the type; the types are not changed thereby." | ||
|
||
- The [Fortran standard](https://www.fortran.com/F77_std/rjcnf0001-sh-6.html#sh-6.6.4) | ||
states that "the processor may evaluate any mathematically equivalent | ||
expression", where "Two arithmetic expressions are mathematically | ||
equivalent if, for all possible values of their primaries, their | ||
mathematical values are equal. However, mathematically equivalent | ||
arithmetic expressions may produce different computational results." | ||
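
The Fortran wording covers more than contraction. As a small sketch of how "mathematically equivalent" expressions can produce different computational results, the two hypothetical functions below differ only in how the additions are associated.

```rust
/// Evaluates the sum strictly left to right.
fn left_to_right(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// A mathematically equivalent reassociation of the same sum.
fn reassociated(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

With `a = 1.0`, `b = 1e100`, and `c = -1e100`, the first form absorbs `a` into `b` and returns `0.0`, while the second cancels `b` against `c` first and returns `1.0`.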
I'm not familiar with Fortran (or at least this aspect of it), but this quote seems to license far more than contraction, e.g. all sorts of
@rkruppe That's correct, Fortran also allows things like reassociation and commutation, as long as you never ignore parentheses. |
||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
The initial implementation of this RFC can simply enable floating-point | ||
contraction within LLVM (and equivalent options in future codegen backends). | ||
However, this RFC also allows other precision-increasing optimizations; in | ||
particular, this RFC would allow the implementation of f32 or future f16 | ||
formats using higher-precision registers, without having to apply rounding | ||
after each operation. |
That sounds like an extremely strong statement to me that needs further justification. I see no reason to assume such monotonicity here. Different rounding errors happening during a computation might as well just happen to cancel each other such that removing some errors actually increases the error of the final result.
Extreme example: `1 / 10` has a rounding error, but `1.0/10.0 - 1.0/10.0` actually gives the right result. Providing more precision only on one side of the subtraction increases the error of the entire calculation.

See #2686 (comment)
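
The cancellation argument in the comment above can be sketched in today's Rust by using `f64` as a stand-in for "extra precision" on one operand. This is only a model of the concern under that assumption, not a claim about what any compiler would emit; the function names are hypothetical.

```rust
/// Both operands are rounded to f32, so their rounding errors are identical
/// and cancel in the subtraction.
fn both_sides_rounded() -> f32 {
    let lhs: f32 = 1.0 / 10.0;
    let rhs: f32 = 1.0 / 10.0;
    lhs - rhs
}

/// The left operand keeps f64 precision while the right operand is rounded
/// to f32, so the rounding error no longer cancels.
fn one_side_extra_precision() -> f64 {
    let lhs: f64 = 1.0 / 10.0;
    let rhs: f32 = 1.0 / 10.0;
    lhs - f64::from(rhs)
}
```

The first function returns exactly zero, which is the mathematically correct answer; the second returns a small nonzero value on the order of 1e-9.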