# Allow floating-point operations to provide extra precision than specified, as an optimization (#2686)

Status: Closed. Proposed file: `text/0000-floating-point-additional-precision.md`.

- Feature Name: `allow_extra_fp_precision`
- Start Date: 2019-04-08
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Update the Rust specification to allow floating-point operations to provide
*more* precision than specified, but not less precision; this allows many safe
optimizations. Specify robust mechanisms to disable this behavior.

# Motivation
[motivation]: #motivation

Some platforms provide instructions that perform a series of floating-point
operations quickly, such as fused multiply-add instructions; using these
instructions can yield performance wins of 2x or more. Such instructions
may provide *more* precision than IEEE floating-point operations require,
such as by performing multiple operations before rounding or losing precision.
Similarly, high-performance floating-point code could perform multiple
operations in higher-precision floating-point registers before converting
back to a lower-precision format.
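As a small illustration of the precision difference, the stable `f64::mul_add` method performs a fused multiply-add with a single rounding step. The example below uses a value chosen so that the fused and unfused results differ by exactly one low-order bit; the specific constants are illustrative, not from the RFC:

```rust
fn main() {
    // a = 1 + 2^-30, so a*a = 1 + 2^-29 + 2^-60 exactly.
    let a: f64 = 1.0 + f64::powi(2.0, -30);

    // Separate multiply and add: a*a rounds to 1 + 2^-29 (the 2^-60 term
    // is below half an ulp), so it is discarded before the subtraction.
    let unfused = a * a - 1.0;

    // Fused multiply-add rounds only once, so the 2^-60 term survives.
    let fused = a.mul_add(a, -1.0);

    assert!(fused != unfused);
    assert_eq!(fused - unfused, f64::powi(2.0, -60));
}
```

The fused form is both faster on hardware with FMA support and closer to the mathematically exact answer, which is the kind of extra precision this RFC would permit the compiler to introduce automatically.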

In general, providing more precision than required tends to bring a
calculation closer to the mathematically exact answer, not further away.
> **Review comment (@RalfJung, Member, Oct 17, 2019):** That sounds like an
> extremely strong statement to me that needs further justification. I see no
> reason to assume such monotonicity here. Different rounding errors happening
> during a computation might as well just happen to cancel each other such that
> removing some errors actually increases the error of the final result.
>
> Extreme example: `1 / 10` has a rounding error, but `1.0/10.0 - 1.0/10.0`
> actually gives the right result. Providing more precision only on one side
> of the subtraction increases the error of the entire calculation.

This RFC proposes allowing floating-point types to perform intermediate
calculations using more precision than the type itself, as long as they provide
*at least* as much precision as the IEEE 754 standard requires.

See the [prior art section](#prior-art) for precedent in several other
languages and compilers.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Floating-point operations in Rust have a guaranteed minimum accuracy, which
specifies how far the result may differ from an infinitely accurate,
mathematically exact answer. The implementation of Rust for any target platform
must provide at least that much accuracy. In some cases, Rust can perform
operations with higher accuracy than required, and doing so provides greater
performance (such as by removing intermediate rounding steps).

A note for users of other languages: this is *not* the equivalent of the "fast
math" option provided by some compilers. Unlike such options, this behavior
will never make any floating-point operation *less* accurate, but it can make
floating-point operations *more* accurate, making the result closer to the
mathematically exact answer.
> **Review comment (Member), on the preceding paragraph:** Given #2686
> (comment), I think this statement should be removed as it is incorrect --
> or else there should be an argument for how we plan to guarantee that we
> never make things less accurate. *(Marked as resolved by @joshtriplett.)*

Due to differences in hardware, in platform libm implementations, and various
other factors, Rust cannot fully guarantee identical results on all target
platforms. (Doing so on *all* platforms would incur a massive performance
loss.) However, with some additional care, applications desiring cross-platform
identical results can potentially achieve that on multiple target platforms. In
particular, applications prioritizing identical, portable results across two or
more target platforms can disable extra floating-point precision entirely.

> **Review comment:** As I mentioned in a previous comment, mere
> reproducibility is not always the reason to disable this behaviour. Some
> algorithms can actually take advantage of the weird special properties of
> floating-point arithmetic. Such algorithms should remain implementable as
> Rust libraries, and those should not break just because someone decided they
> wanted their unrelated floating-point code to be as fast as possible.


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

Currently, Rust's [specification for floating-point
types](https://doc.rust-lang.org/reference/types/numeric.html#floating-point-types)
states only that:
> The IEEE 754-2008 "binary32" and "binary64" floating-point types are f32 and f64, respectively.
> **Review comment (Contributor):** Shall this be understood as "the layout of
> `f{32, 64}` is that of `binary{32, 64}`", or as "the layout and arithmetic
> of `f{32, 64}` is that of `binary{32, 64}`"?
>
> The IEEE 754-2008 standard is very clear that optimizations like replacing
> `a * b + c` with `fusedMultiplyAdd(a, b, c)` should be opt-in, not opt-out
> (e.g. see section 10.4), so depending on how one interprets the above, the
> proposed change could be a backwards incompatible change.

This RFC proposes updating that definition as follows:

The `f32` and `f64` types represent the IEEE 754-2008 "binary32" and "binary64"
floating-point types. Operations on those types must provide at least as much
precision as the IEEE standard requires; such operations may provide *more*
precision than the standard requires, such as by doing a series of operations
with higher precision before storing a value of the desired precision.

rustc should provide a codegen (`-C`) option to disable this behavior, such as
`-C extra-fp-precision=off`; compiling with this option will disable extra
precision in all crates compiled into an application. (Cargo should provide a
means of specifying this option.) Rust should also provide attributes to
disable this behavior from within code, such as `#[extra_fp_precision(off)]`;
this attribute will disable extra precision within the module or function it is
applied to. On platforms that do not currently implement disabling extra
precision, the codegen option and attribute should produce an error (not a
warning), to avoid surprises.
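A sketch of how the proposed opt-out might look in code. Note that both the attribute and the `-C extra-fp-precision=off` flag are proposals from this RFC, not implemented rustc features, so the attribute is shown commented out to keep the sketch compilable today:

```rust
// Hypothetical: with the proposed attribute active, the compiler would be
// required to round after every floating-point operation in this function,
// keeping results bit-identical across platforms and optimization levels.
//
// #[extra_fp_precision(off)]
fn dot(xs: &[f64], ys: &[f64]) -> f64 {
    xs.iter().zip(ys).map(|(x, y)| x * y).sum()
}

fn main() {
    let d = dot(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0]);
    assert_eq!(d, 32.0); // 4 + 10 + 18, exact in f64
}
```

Without the attribute, this RFC would allow the compiler to fuse each multiply with the running sum (e.g. into FMA instructions), producing a result at least as accurate as the per-operation-rounded one.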

In addition, because this change makes extra floating-point precision visible
on more platforms, the Rust release notes, documentation, and similar channels
should explicitly discuss the issue of extra floating-point precision and how
to disable it. Furthermore, this change should not become part of a stable Rust
release until at least eight stable releases *after* it first becomes
implemented in the nightly compiler.

> **Review comment:** I'm not sure I understand the point of this last
> sentence. In particular, why is the reference point the first availability
> in nightly? I think it would be more useful to guarantee that the
> optimisations will not be enabled by default on stable until the opt-out has
> been available as a no-op for a few stable releases.


# Drawbacks
[drawbacks]: #drawbacks

If Rust already provided bit-for-bit identical floating-point computations
across platforms, then this change could potentially allow floating-point
computations to differ (in the amount of additional accuracy beyond the
standard's requirements) by platform, enabled target features (e.g.
instruction sets), or optimization level.

However, standards-compliant implementations of operations on floating-point
values can and do *already* vary slightly by platform, sufficiently so to
produce different binary results; in particular, floating-point operations in
Rust can already produce more precise results depending on target platform,
optimization level, the target's libm library, and the version of the target
libm. As with that existing behavior, this proposal can never make results
*less* accurate; it can only make results *more* accurate. Nonetheless, this
change potentially introduces such variations on target platforms that did not
previously have them.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

For the attribute and codegen option, we could allow code to opt in via
attribute even if disabled via codegen, and then provide a `force-off` codegen
option to override that. This would have two serious downsides, however: it
would propagate the perception of extra floating-point precision as an unsafe
optimization that requires opting into, and it would make life more difficult
for people who wish to opt out of this behavior and attempt to achieve
identical results on multiple target platforms. This RFC recommends the simpler
approach of not providing an enablement option via attribute, such that the
codegen option always force-disables extra precision everywhere.

We could provide an option to enable extra accuracy for the default
floating-point types, but disable it by default. This would leave the majority
of floating-point code unable to use these optimizations, however; defaults
matter, and the majority of code seems likely to use the defaults. In addition,
permitting extra floating-point precision by default would match the existing
behavior of Rust, and would allow the Rust compiler to assume that code
explicitly disabling extra precision has a specific requirement to do so and
depends on that behavior. Nonetheless, this alternative would still provide the
option to produce more optimized code, making it preferable over doing nothing.
This alternative would necessitate respecifying the codegen option and
attribute to support enabling it, as well as having a force-off codegen option
to override enablement via the attribute.

We could provide a separate set of types and allow extra accuracy in their
operations; however, this would create API incompatibilities between
floating-point functions, and the longer, less-well-known types seem unlikely
to see widespread use. Furthermore, allowing or disallowing extra accuracy
seems more closely a property of the calculation than a property of the type.

We could provide additional methods for floating-point operations that allow
passing additional flags, including floating-point contraction. The compiler
could then fuse and otherwise optimize such operations. However, this would
make optimized floating-point code *substantially* less ergonomic, due to the
inability to use operators. To enable operators, we could additionally
implement wrapper types, as above, with the same upsides and downsides.

We could do nothing, and require code to use `a.mul_add(b, c)` for
optimization; however, this would not allow for similar future optimizations,
and would not allow code to easily enable this optimization without substantial
code changes.
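To make the ergonomic cost concrete, here is Horner evaluation of a cubic polynomial written both ways (the function names are illustrative, not from the RFC). The operator form reads naturally; the explicitly fused form turns every operator pair into a method call:

```rust
// Horner evaluation of c[0] + c[1]*x + c[2]*x^2 + c[3]*x^3.

// With operators: readable, but today Rust never contracts these
// multiply-adds, and under this RFC contraction could be disabled.
fn horner_ops(x: f64, c: &[f64; 4]) -> f64 {
    ((c[3] * x + c[2]) * x + c[1]) * x + c[0]
}

// With explicit fusion: each step has a single rounding, but the
// expression structure is buried in method chaining.
fn horner_fused(x: f64, c: &[f64; 4]) -> f64 {
    c[3].mul_add(x, c[2]).mul_add(x, c[1]).mul_add(x, c[0])
}

fn main() {
    let c = [1.0, 1.0, 1.0, 1.0];
    // 1 + 2 + 4 + 8 = 15, exact in f64 either way.
    assert_eq!(horner_ops(2.0, &c), 15.0);
    assert_eq!(horner_fused(2.0, &c), 15.0);
}
```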
> **Review comment (Contributor):** We could provide a clippy lint that
> recognizes `a * b + c` (and many others), and tell people that if they
> don't care about precision, they can write `a.mul_add(b, c)` instead. We
> could have a group of clippy lints about these kinds of things that people
> can enable in bulk.
>
> **Review comment (Contributor):** On this particular point a clippy lint is
> helpful but not necessarily enough. Once the optimizer chews through layers
> of code, it can end up at an `a * b + c` expression without it being
> anything that is obvious to clippy.


We could narrow the scope of optimization opportunities to *only* include
floating-point contraction but not any other precision-increasing operations.
See the [future possibilities](#future-possibilities) section for further
discussion on this point.

# Prior art
[prior-art]: #prior-art

This has precedent in several other languages and compilers:

- [C11](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) allows
extra floating-point precision with the `STDC FP_CONTRACT` pragma enabled,
and the default state of that pragma is implementation-defined. GCC, ICC,
MSVC, and some other C compilers enable this behavior by default; Clang
disables it by default, though some downstream users of Clang re-enable it
system-wide.

- [The C++ standard](http://eel.is/c++draft/expr.pre#6) states that "The
values of the floating operands and the results of floating
expressions may be represented in greater precision and range than
that required by the type; the types are not changed thereby."

- The [Fortran standard](https://www.fortran.com/F77_std/rjcnf0001-sh-6.html#sh-6.6.4)
states that "the processor may evaluate any mathematically equivalent
expression", where "Two arithmetic expressions are mathematically
equivalent if, for all possible values of their primaries, their
mathematical values are equal. However, mathematically equivalent
arithmetic expressions may produce different computational results."

> **Review comment:** I'm not familiar with Fortran (or at least this aspect
> of it), but this quote seems to license far more than contraction, e.g. all
> sorts of `-ffast-math`-style transformations that ignore the existence of
> NaNs. Is that right?
>
> **Review comment (Member, RFC author):** @rkruppe That's correct, Fortran
> also allows things like reassociation and commutation, as long as you never
> ignore parentheses.


# Future possibilities
[future-possibilities]: #future-possibilities

The initial implementation of this RFC can simply enable floating-point
contraction within LLVM (and equivalent options in future codegen backends).
However, this RFC also allows other precision-increasing optimizations; in
particular, it would allow implementing `f32` (or a future `f16` format)
using higher-precision registers, without having to apply rounding
after each operation.
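What "higher-precision registers without intermediate rounding" means can be sketched in stable Rust today by simulating it with explicit widening: summing `f32` values while rounding after every addition, versus accumulating in `f64` and rounding once at the end. The constants here are chosen purely to make the two strategies visibly diverge:

```rust
// Round after every operation: the semantics Rust guarantees today.
fn sum_round_each(xs: &[f32]) -> f32 {
    xs.iter().copied().fold(0.0_f32, |acc, x| acc + x)
}

// Accumulate in a wider format, round once at the end: the kind of
// extra-precision evaluation this RFC would permit for f32.
fn sum_extended(xs: &[f32]) -> f32 {
    xs.iter().map(|&x| x as f64).sum::<f64>() as f32
}

fn main() {
    // 1.0 followed by 100 values of 1e-8: each tiny addend is below half
    // an ulp of 1.0 in f32, so per-operation rounding discards every one.
    let mut xs = vec![1.0_f32];
    xs.extend(std::iter::repeat(1.0e-8_f32).take(100));

    assert_eq!(sum_round_each(&xs), 1.0); // every 1e-8 rounds away
    assert!(sum_extended(&xs) > 1.0);     // wider accumulator keeps them
}
```

The extended-precision result is strictly closer to the exact sum, illustrating why the RFC treats widening as an accuracy-improving (rather than accuracy-losing) transformation.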