
Allow floating-point operations to provide extra precision than specified, as an optimization #2686

Conversation

joshtriplett
Member

@joshtriplett joshtriplett commented Apr 17, 2019

Rendered

This enables optimizations such as fused multiply-add operations by default, while providing robust mechanisms to disable extra precision for applications that wish to do so.

EDIT: Please note that this RFC has been substantially overhauled to better accommodate applications that wish to disable extra precision. In particular, there's a top-level codegen option (-C extra-fp-precision=off) to disable this program-wide.

@joshtriplett
Member Author

cc @fenrus75

@joshtriplett joshtriplett added the T-lang Relevant to the language team, which will review and decide on the RFC. label Apr 17, 2019
@hanna-kruppe hanna-kruppe left a comment

I really want Rust to have a good story for licensing floating-point optimizations, including but not limited to contraction. However, simply turning on contraction by default is not a good step in that direction. Contrary to what the RFC claims, contraction is not "safe" (in the sense that it can break otherwise-working programs; obviously there's no memory safety at stake), and we have not previously reserved the right to do this or given any other indication to users that it might happen.

Let's design a way to opt into and out of this behavior at the crate/module/function level first, and once that's done we can look at how to make more code use it automatically. A fine-grained opt-in and opt-out is very useful even if we end up changing the default, e.g., to ensure code that breaks under contraction can be compiled as part of a crate graph that generally has contraction enabled. There's plenty of design work to keep us busy even without touching defaults:

  • compiler options or attributes or ...?
  • how does it propagate from callers into callees, if at all? (generally hard problem, but IMO a good story for this is just as valuable as providing the basic feature in the first place)
  • what transformations are licensed exactly? (e.g., do we want roughly what the C standard allows, or do we want more like GCC does?)

back to a lower-precision format.

In general, providing more precision than required should not cause a
mathematical algorithm to fail or to lose numeric accuracy.


This is incorrect. One simple counter-example is x * x - y * y, which is non-negative for all x and y whose squares are finite floats, but if the expression is contracted to x.mul_add(x, - y * y) then it can have negative results. This can of course snowball into even worse issues downstream, e.g., if this is fed into sqrt() to get the 2D euclidean norm, contraction can cause you to end up with NaNs on perfectly innocuous vectors.

Member Author

Any programs that have a problem with that will need to pass non-default compiler options on many common C, C++, and Fortran compilers.

That said, I'll adjust the language.

Contributor

@gnzlbg gnzlbg Apr 18, 2019

Any programs that have a problem with that will need to pass non-default compiler options on many common C, C++, and Fortran compilers.

Some C, C++, and Fortran compilers do this (gcc, msvc), some don't (clang). If this were a universally good idea, all of them would do it, but that is not the case. That is, those languages are prior art, but I'm really missing from the prior art section why this would actually be a good idea - are programmers using those languages happy with that "feature"?

A sign change trickling down your application depending on the optimization level (or even debug-information level) can be extremely hard to debug in practice. So IMO this issue raised by @rkruppe deserves more analysis than a language adjustment.

Member Author

why this would actually be a good idea
are programmers using those languages happy with that "feature"

The beginning of the RFC already makes the rationale quite clear: this allows for optimizations on the scale of 2x performance improvements, while never reducing the accuracy of a calculation compared to the mathematically accurate result.

Member Author

@rkruppe Looking again at your example, I think there's something missing from it? You said:

One simple counter-example is x * x - y * y, which is non-negative for all x and y whose squares are finite floats

Counter-example to that: x = 2.0, y = 4.0. Both x and y square to finite floats, and x*x - y*y should absolutely be negative. I don't think those properties alone are enough to reasonably expect that you can call sqrt on that and get a non-imaginary result.


Ugh, sorry, you're right. That's what I get for repeating the argument from memory and filling the gaps without thinking too long. In general of course x² may be smaller than y². The problematic case is only when x = y (+ aforementioned side conditions), in that case (x * x) - (y * y) is zero but with FMA it can be negative.

Another example, I am told, is complex multiplication when multiplying a number by its conjugate. I will not elaborate because apparently I cannot be trusted this late in the evening to work out the details correctly.
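A minimal demo of the x = y case, for the record. This is a sketch that assumes f64::mul_add produces the single-rounding fused result, which is exactly what contraction would emit:

fn main() {
    let x: f64 = 0.1;
    // Strict IEEE semantics: both products round identically and cancel exactly.
    let strict = x * x - x * x;
    // What contraction would compute: the left product is kept unrounded inside the
    // fused operation, so the result is minus the rounding error of x * x, which is
    // negative whenever x * x happened to round up (it does for x = 0.1).
    let contracted = x.mul_add(x, -(x * x));
    println!("strict:     {strict:e}, sqrt = {}", strict.sqrt());         // 0e0, 0
    println!("contracted: {contracted:e}, sqrt = {}", contracted.sqrt()); // small negative, NaN
}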

@fenrus75 fenrus75 Apr 21, 2019

This is incorrect. One simple counter-example is x * x - y * y, which is non-negative for all x and y whose squares are finite floats, but if the expression is contracted to x.mul_add(x, - y * y) then it can have negative results. This can of course snowball into even worse issues downstream, e.g., if this is fed into sqrt() to get the 2D euclidean norm, contraction can cause you to end up with NaNs on perfectly innocuous vectors.

I suspect this is not a valid statement.

The original, in pseudocode, is

round64( round64(x * x) - round64(y * y) )

and the contraction you describe gives

round64( x * x - round64(y * y) )

For this to go negative only in the contracted case, round64(x * x) would have to round up to >= round64(y * y) while x * x itself is < round64(y * y); so round64(x * x) == round64(y * y) by the "nearest" part of the rounding (it can't cross round64(y * y)).

Since we're rounding to nearest, that means x * x is less than half a unit of precision away from round64(y * y). This in turn means that x * x - round64(y * y), while negative in this case, is less than half a unit of precision away from 0, so the outer round64() will round it up to 0.


This is incorrect. One simple counter-example is x * x - y * y, which is non-negative for all x and y whose squares are finite floats, but if the expression is contracted to x.mul_add(x, - y * y) then it can have negative results. This can of course snowball into even worse issues downstream, e.g., if this is fed into sqrt() to get the 2D euclidean norm, contraction can cause you to end up with NaNs on perfectly innocuous vectors.

I suspect this is not a valid statement.

The original, in pseudocode, is

round64( round64(x * x) - round64(y * y) )

and the contraction you describe gives

round64( x * x - round64(y * y) )

If you use y=x, then if round64(x*x) rounds up, it's easy to see that round64(x*x - round64(x*x)) is negative. This does not round to zero, because units of precision are not absolute, but relative (think significant figures in scientific notation).

For reference (and more interesting floating point information!) see the "fmadd" section on https://randomascii.wordpress.com/2013/07/16/floating-point-determinism/

Member

So the conclusion, if I read this correctly, is that indeed increasing precision locally in some sub-computations can reduce precision of the overall computation, right? (Also see here.)

across platforms, this change could potentially allow floating-point
computations to differ by platform (though never below the standards-required
accuracy). However, standards-compliant implementations of math functions on
floating-point values may already vary slightly by platform, sufficiently so to


I'm the last person to argue we have any sort of bit-for-bit reproducibility of floating point calculations across platforms or even optimization levels (I know in regrettable detail many of the reasons why not), but it seems like a notable further step to make even the basic arithmetic operations dependent on the optimization level, even for normal inputs, even on the (numerous) targets where they are currently not.

- [C11](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) allows
this with the `STDC FP_CONTRACT` pragma enabled, and the default state
of that pragma is implementation-defined. GCC enables this pragma by
default, [as does the Microsoft C


Note that GCC defaults to -ffp-contract=fast, which goes beyond what's described in the C standard, and according to documentation the only other option it implements is off.

Member Author

Based on some careful research, as far as I can tell GCC's -ffp-contract=fast just changes the default value of STDC FP_CONTRACT, nothing else. It does not enable any of the potentially accuracy-reducing "fast-math" optimizations.

(-ffp-contract=off means "ignore the pragma", and -ffp-contract=on means "don't ignore the pragma" but doesn't change the default.)


My understanding is: the C standard only allows FMA synthesis within a source-level expression. This is extremely inconvenient to respect at the IR level (you'd have to track which source level expression each operation comes from), so -ffp-contract=fast simply disregards source level information and just contracts IR operations if they're of the suitable form.

Clang implements this option too, but it defaults to standard compliance by performing contraction in the frontend where source level boundaries are still available.
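To illustrate the source-expression rule being described, here is a hedged Rust-flavored sketch (Rust has no FP_CONTRACT pragma; the comments just restate which shapes the C rule would or would not allow to be fused):

fn shapes(a: f64, b: f64, c: f64) -> (f64, f64) {
    // Two separate source expressions: the C rule would NOT allow fusing this
    // multiply with the add below, but -ffp-contract=fast would.
    let t = a * b;
    let split = t + c;

    // One source expression: the C rule allows contracting this to fma(a, b, c).
    let together = a * b + c;

    (split, together)
}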

expression", where "Two arithmetic expressions are mathematically
equivalent if, for all possible values of their primaries, their
mathematical values are equal. However, mathematically equivalent
arithmetic expressions may produce different computational results."


I'm not familiar with Fortran (or at least this aspect of it), but this quote seems to license far more than contraction, e.g. all sorts of -ffast-math style transformation that ignore the existence of NaNs. Is that right?

Member Author

@rkruppe That's correct, Fortran also allows things like reassociation and commutation, as long as you never ignore parentheses.

@Centril Centril added A-arithmetic Arithmetic related proposals & ideas A-attributes Proposals relating to attributes A-flags Proposals relating to rustc flags or flags for other tools. A-primitive Primitive types related proposals & ideas labels Apr 17, 2019
@joshtriplett
Member Author

@rkruppe wrote:

I really want Rust to have a good story for licensing floating point optimizations, including but not limited to contraction. However, simply turning on contraction by default is not a good step in that direction.

It'd be a step towards parity with other languages, rather than intentionally being slower. I think we need to seriously evaluate whether we're buying anything by intentionally being slower. (And by "slower" here, I don't mean a few percent, I mean 2x slower.)

Contrary to what the RFC claims, contraction is not "safe" (meaning that it breaks otherwise-working programs; obviously there's no memory safety at stake),

Any such programs would be broken in C, C++, Fortran, and likely other languages by default; they'd have to explicitly disable the default behavior. Such programs are also going directly against best practices in numerical methods; if anything, we should ideally be linting against code like (x*x - y*y).sqrt().

and we have not previously reserved the right to do this or given any other indication to users that it might happen.

I've also found no explicit indications that we can't do this. And I've seen no indications that people expect Rust's default behavior to be different than the default behavior of other languages in this regard. What concrete problem are we trying to solve that outweighs a 2x performance win?

A fine-grained opt-in and -out is very useful even if we end up changing the default

Agreed. The RFC already proposes an attribute; I could expand that to provide an attribute with two possible values.

There's plenty of design work to keep us busy even without touching defaults:

If we have any hope of changing the defaults, the time to do that would be before those defaults are relied on.

compiler options or attributes or ...?

I think it makes sense to have a global compiler codegen option, and I also think it makes sense to have an attribute (with a yes/no) that can be applied to any amount of code.

how does it propagate from callers into callees, if at all? (generally hard problem, but IMO a good story for this is just as valuable as providing the basic feature in the first place)

The attribute shouldn't. It should only affect code generation under the scope of the attribute.

what transformations are licensed exactly? (e.g., do we want roughly what the C standard allows, or do we want more like GCC does?)

My ideal goal would be "anything that strictly increases accuracy, making the result closer to the mathematically accurate answer". That would also include, for instance, doing f32 math in f64 registers and not forcing the result to f32 after each operation, if that'd be faster.

@ExpHP

ExpHP commented Apr 18, 2019

Such programs are also going directly against best practices in numerical methods; if anything, we should ideally be linting against code like (x*x - y*y).sqrt().

In favor of what?


Currently, Rust's [specification for floating-point
types](https://doc.rust-lang.org/reference/types/numeric.html#floating-point-types)
states that:
> The IEEE 754-2008 "binary32" and "binary64" floating-point types are f32 and f64, respectively.
Contributor

Shall this be understood as "the layout of f{32, 64} is that of binary{32, 64}" or as "the layout and arithmetic of f{32, 64} is that of binary{32, 64}" ?

The IEEE-754:2008 standard is very clear that optimizations like replacing a * b + c with fusedMultiplyAdd(a, b, c) should be opt-in, and not opt-out (e.g. see section 10.4), so depending on how one interprets the above, the proposed change could be a backwards incompatible change.

computations to differ by platform (though never below the standards-required
accuracy). However, standards-compliant implementations of math functions on
floating-point values may already vary slightly by platform, sufficiently so to
produce different binary results. This proposal can never make results *less*
Contributor

@gnzlbg gnzlbg Apr 18, 2019

If the intention of the user was for their Rust program to actually have the semantics of the code they wrote, e.g., first do a * b, and then add the result to c, performing intermediate rounding according to the precision of the type, then this proposal not only makes the result less accurate, it makes it impossible to even express that operation in the Rust language.

If the user wants higher precision they can write fma(a, b, c) today, and if the user does not care, they can write a.mul_add(b, c). This proposal, as presented, does not provide a first_mul_a_b_then_add_c(a, b, c) intrinsic that preserves the current semantics, so the current semantics become impossible to write.

Member Author

@joshtriplett joshtriplett Apr 18, 2019

performing intermediate rounding according to the precision of the type

What we're discussing in this RFC is, precisely, 1) whether that's actually the definition of the Rust language, and 2) whether it should be. Meanwhile, I'm not seeing any indication that that's actually the behavior Rust developers expect to get, or that they expect to pay 2x performance by default to get it.

but it makes it impossible to actually even express that operation in the Rust language

I'm already editing the RFC to require (rather than suggest) an attribute for this.


We could provide a separate set of types and allow extra accuracy in their
operations; however, this would create ABI differences between floating-point
functions, and the longer, less-well-known types seem unlikely to see
Contributor

Not necessarily, these wrappers could be repr(transparent).

Member Author

@joshtriplett joshtriplett Apr 18, 2019

I mean this in the sense that changing from one to the other would be an incompatible API change in a crate. I'll clarify that.

Contributor

@gnzlbg gnzlbg Apr 18, 2019

If the algorithm does not care about contraction, it might also not care about NaNs, or associativity, or denormals, or ... so if it wants to accept a NonNaN<Associative<NoDenormals<fXY>>> type as well as the primitive f{32, 64} types, then it has to be generic, and if it's generic, it would also accept a type wrapper lifting the assumption that contraction is not ok without breaking the API.

In other words, once one starts walking down the road of lifting assumptions about floating-point arithmetic, contraction is just one of the many many different assumptions that one might want to lift. Making it special does not solve the issue of these APIs having to be generic about these.

@hanna-kruppe hanna-kruppe Apr 20, 2019

I do not think we have anywhere near a smooth enough UX for working with wrappers around primitive arithmetic types for me to seriously consider them as a solution for licensing fast-math transformations. There are serious papercuts even when trying to be generic over the existing primitive types (e.g., you can't use literals without wrapping them in ugly T::from calls), and we have even less machinery to address the mixing of different types that such wrappers would entail.

I also think it's quite questionable whether these should be properties of the type. It kind of fits "no infinities/nans/etc." but other things are fundamentally about particular operations and therefore may be OK in one code region but not in another code region even if the same data is being operated on.

We could provide a separate set of types and allow extra accuracy in their
operations; however, this would create ABI differences between floating-point
functions, and the longer, less-well-known types seem unlikely to see
widespread use.
Contributor

Prior art shows that people who need/want this are going to use them, e.g., "less-well-known" flags like -ffast-math are in widespread use, even though they are not enabled by default. So it is unclear to me how much weight this argument should actually have.


Separate types are harder to drop into a code base than a compiler flag or attribute, though, because using the type in one place generally leads to type errors (and need for conversions to solve them) at the interface with other code.

We could do nothing, and require code to use `a.mul_add(b, c)` for
optimization; however, this would not allow for similar future optimizations,
and would not allow code to easily enable this optimization without substantial
code changes.
Contributor

We could provide a clippy lint that recognizes a * b + c (and many others), and tell people that if they don't care about precision, they can write a.mul_add(b, c) instead. We could have a group of clippy lints about these kind of things that people can enable in bulk.
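For illustration, the kind of rewrite such a lint would suggest (a sketch; the axpy kernel here is just an arbitrary example, not something from the RFC):

fn axpy(a: f64, x: &[f64], y: &mut [f64]) {
    for (yi, &xi) in y.iter_mut().zip(x) {
        // Was: *yi = a * xi + *yi;   (two roundings, fusion not guaranteed)
        // Suggested rewrite: one rounding, fused where the hardware supports it.
        *yi = a.mul_add(xi, *yi);
    }
}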

Contributor

On this particular point a clippy lint is helpful but not necessarily enough. Once the optimizer chews through layers of code it can end up at an a * b + c expression without it being anything that is obvious to clippy.

@gnzlbg
Contributor

gnzlbg commented Apr 18, 2019

Let's design a way to opt into and out of this behavior at crate/module/function first, and once that's done we can look at how to make more code use it automatically.

@rkruppe I would prefer even finer grained control than that, e.g., individual type wrappers that add a single assumption about floating-point math that the compiler is allowed to make and that can be combined, e.g.,

  • Trapless<T>: whether floating-point arithmetic can be assumed not to trap (e.g. on signaling NaNs)
  • Round{Nearest,0,+∞,-∞}<T> : whether the rounding mode can be assumed
  • Associative<T>: whether floating-point arithmetic can be assumed to be associative
  • Finite<T>: whether floating-point arithmetic can be assumed to produce numbers
  • Normal<T>: whether floating-point arithmetic can be assumed to produce normal numbers (as opposed to denormals/subnormals)
  • Contractable<T>: whether intermediate operations can be contracted using higher precision
  • ...

That way I can write:

pub type Real = Trapless<Finite<Normal<Associative<Contractable</* ... */ f32>>>>>;

and use it throughout the parts of my code where its appropriate. When I need to interface with other crates (or they with me), I can still use f32/f64:

pub fn my_algo(x: f32) -> f32 {
    let r: Real = x.into();
    // ... do stuff with r ...
    r.into()
}

Sure, some people might go overboard with these, and create complicated trait hierarchies, make all their code generic, etc. but one doesn't really need to do that (if somebody wants to provide a good library to abstract over all of this, similar to how num::Float works today, well they are free to do that, and those who find it useful will use it).

Global flags for turning these on/off require you to inspect the module/function/crate/.cargo/config/... to know what the rules for floating-point arithmetic are, and then use that knowledge to reason about your program, and the chance that some code which wasn't intended to play by those rules gets those flags applied (e.g. because it was inlined or monomorphized into a module with those flags enabled) doesn't seem worth the risk (reading Fortran here gives me fond memories of writing implicit none at the top of every file).

The main argument of this RFC is that if we do something like this, then some code that spends 99% of its execution time doing a * b + c would be 2x slower. If that's the case, submitting a PR to change that code to a.mul_add(b, c) is a no-brainer (been there, done that: https://github.com/rust-lang-nursery/packed_simd/search?q=fma&type=Commits) - changing the behavior of all Rust code to fix such programs feels like overkill. If the issue is that code that could benefit from such a change is hard to find, that's what clippy is for.


@eaglgenes101

eaglgenes101 commented Apr 18, 2019

In C, even if you make sure your compiler outputs code that uses IEEE 754 floats on all platforms, trying to get the same floating-point results across different platforms, build configurations, and times is an exercise in plugging up a bazillion abstraction leaks. That's par for the course for C. Not for Rust.

I am well aware that floating point is a mere approximation of the real numbers, and that you're suggesting transformations that would increase this accuracy. That said, I still disapprove of the proposed new defaults. I'd much rather not have the compiler try by default to second-guess me on what really should be a perfectly well-defined and predictable operation. I'd much rather the compiler, by default, choose some specific observable output behaviour, and stick to it, just like it normally does. I'll flick the floating point flags myself if I want to sacrifice determinism for a better approximation of what I've given up on since I was a clueless novice looking around for the reason why 0.1 + 0.2 == 0.3 evaluated to false. And I'm pretty sure I'd much rather performance-optimize another clueless programmer's slow floating point code than debug another clueless programmer's heisenbug-laden floating point code.
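(For what it's worth, that classic surprise is easy to reproduce; it comes from decimal fractions not being exactly representable in binary, and is not something this RFC changes:)

fn main() {
    let sum = 0.1 + 0.2;        // f64 literals; neither 0.1 nor 0.2 is exactly representable
    println!("{}", sum == 0.3); // prints false; sum is 0.30000000000000004
}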

NaNs may also have unspecified bit patterns. However, IEEE 754 mandates behaviour for NaNs that make them opaque unless you specifically crack them open, and NaNs propagate through most floating-point operations, so if their payload can be disregarded, they are essentially fixed points of floating point operations. Small floating point evaluation differences tend to be magnified by systems with chaotic behaviour, which includes most nontrivial physical systems, and treating finite floats as opaque would completely defeat the purpose of doing the floating point computations in the first place.

@joshtriplett
Member Author

By way of providing concrete examples that Rust already provides extra accuracy today on some platforms:

$ cat test.rs ; echo === ; rustc +nightly --target=i586-unknown-linux-gnu test.rs -o test32 && rustc +nightly test.rs -o test64 && ./test32 && ./test64
fn foo(num: f32) -> f32 {
    ((num + 0.1) / 1.5e38) * 1.5e38
}

fn main() {
    println!("error: {:.50}", foo(1.23456789) - 1.23456789 - 0.1);
}
===
error: 0.00000002235174179077148437500000000000000000000000
error: 0.00000014156103134155273437500000000000000000000000

i586-unknown-linux-gnu has more accuracy than x86_64-unknown-linux-gnu, because it does intermediate calculations with more precision. And changing that would substantially reduce performance.

@joshtriplett
Member Author

joshtriplett commented Apr 18, 2019

@gnzlbg What code do you expect the compiler to generate when you use those generics? Because ultimately, if you want that, you're asking for pure software floating-point on many platforms.

@gnzlbg
Contributor

gnzlbg commented Apr 18, 2019

@joshtriplett

What code do you expect the compiler to generate when you use arbitrary combinations of those types?

If you check the LangRef for the LLVM IR of the floating-point instructions, e.g., fmul for a * b, you'll see that it now looks like:

<result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

where flags like nnan, ninf, etc. can be inserted in [fast-math flags].

So when one uses such a type, I expect that rustc will insert the fast-math flags for each operation as appropriate. That's finer-grained than just inserting them as function attributes for all functions in an LLVM module.

@joshtriplett
Member Author

joshtriplett commented Apr 18, 2019

@gnzlbg What machine code do you expect to generate when every operation can potentially have different flags? How much performance do you consider reasonable to sacrifice to get the behavior you're proposing? What specific code do you want to write that depends on having such fine-grained type-level control of this behavior?

Not all abstract machines and specifications translate to reasonable machine code on concrete machines. If you want bit-for-bit identical results for floating point across different platforms, target feature flags, and optimization levels, you're going to end up doing software floating point for many operations on many platforms, and I don't think that's going to meet people's expectations at all. If you can live with the current state that we've had for years, then this RFC is already consistent with that behavior.

I would like to request that discussion of adding much more fine-grained control of specific floating-point flags that weren't already raised in the RFC be part of some other RFC, rather than this one. I already have a mention of the idea of adding specific types, which covers the idea of (for instance) Contractable<T>. I don't think the full spectrum of flag-by-flag types mentioned in this comment is in scope for this RFC.

@joshtriplett
Member Author

joshtriplett commented Apr 18, 2019

Expanding on my earlier comment, Rust also already allows floating-point accuracy to depend on optimization level, in addition to targets:

$ cat test.rs ; echo === ; rustc +nightly --target=i586-unknown-linux-gnu test.rs -o test32 && rustc +nightly --target=i586-unknown-linux-gnu -O test.rs -o test32-opt && rustc +nightly test.rs -o test64 && ./test32 && ./test32-opt && ./test64
fn foo(num: f32) -> f32 {
    ((num + 0.1) / 1.5e38) * 1.5e38
}

fn main() {
    let prog = std::env::args().next().unwrap();
    println!("{:12} error: {:.50}", prog, foo(1.23456789) - 1.23456789 - 0.1);
}
===
./test32     error: 0.00000002235174179077148437500000000000000000000000
./test32-opt error: 0.00000014156103134155273437500000000000000000000000
./test64     error: 0.00000014156103134155273437500000000000000000000000

So, in practice, Rust already has this behavior, and this RFC does not represent a breaking change.

(Worth noting that it's easy enough to reproduce this with f64 as well, just by changing the types and constants.)

@scottmcm
Member

you're going to end up doing software floating point for many operations

For things like cos, yes, but not for ordinary addition.

From http://www.box2d.org/forum/viewtopic.php?f=3&t=1800#p16480:

I work at Gas Powered Games and i can tell you first hand that floating point math is deterministic. You just need the same instruction set and compiler and of course the user's processor adhears to the IEEE754 standard, which includes all of our PC and 360 customers. The engine that runs DemiGod, Supreme Commander 1 and 2 rely upon the IEEE754 standard. Not to mention probably all other RTS peer to peer games in the market. As soon as you have a peer to peer network game where each client broadcasts what command they are doing on what 'tick' number and rely on the client computer to figure out the simulation/physical details your going to rely on the determinism of the floating point processor.

So it's not trivial, but apparently it works across processor vendors and such.

@Lokathor
Contributor

Lokathor commented Apr 25, 2019

well, can we perhaps just be real about what we're telling the compiler to allow?

#![fp(allow_fma)]

Or are there things besides just allowing for FMA usage that we're talking about here? (EDIT: in this first wave of optimizations at least)

@fenrus75

There most certainly are other things; an example would be a system where converting from f64 to f32 is expensive (rounding, range checks, etc.). If a calculation is a mix of f32 and f64, this would allow the whole calculation to be done in f64, with the rounding down to f32 happening only at the final store to memory.

(f32->f64 tends to be cheap since it's mostly just padding 0 bits)
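A sketch of the two evaluation strategies being contrasted, written out with explicit casts (the function names are illustrative only):

// Strict semantics: the f32 product is rounded to f32 before being widened and added.
fn strict(a: f32, b: f32, c: f64) -> f32 {
    ((a * b) as f64 + c) as f32
}

// What the extra-precision license would allow: carry the whole mixed calculation
// in f64 and round down to f32 only once, at the end.
fn extra_precision(a: f32, b: f32, c: f64) -> f32 {
    (a as f64 * b as f64 + c) as f32
}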

@programmerjake
Member

From what I understand, LLVM (and probably Rust) by default assumes that traps don't occur and the rounding mode is set to round-to-nearest


@gThorondorsen

On Wed, Apr 24, 2019 at 05:31:34PM -0700, Kornel wrote:
I don't mind that behavior. In fact, I'd like even more reckless-approx-math options. Is there a path to opting in to more fast fp math? Maybe #[extra_fp_precision(on)] could be #[fp(extra_precision)] and eventually become #[fp(extra_precision, associative, reciprocal_approx, no_signed_zero, no_traps)], etc.

I don't want to add those other flags in this RFC (I really want to avoid the implication of association with -ffast-math), but I have no objection to changing this to fp(extra_precision(off)) (or perhaps fp(no_extra_precision)), to allow for future fp(...) flags. That seems entirely reasonable.

In my opinion, this attribute should really have an option to disable all the optimisations that may change the exact results of the computations, present and future. So that people who care can write e.g. #[fp(strict)] and know their library will not break when new optimisations are introduced. And also allow fp(strict, contraction) or fp(strict, extra_precision) to enable optimisations selectively.

Also, I find fp to be too short and generic of a name. I would expect this attribute to also be able to control the rounding mode and trapping behaviour. It may or may not be a good idea to group all these features into a single attribute. I propose to use fp_optimize instead, and leave the other functionality to other names (probably fp-prefixed as well).

should explicitly discuss the issue of extra floating-point precision and how
to disable it. Furthermore, this change should not become part of a stable Rust
release until at least eight stable releases *after* it first becomes
implemented in the nightly compiler.


I'm not sure I understand the point of this last sentence. And particularly, why is the reference point the first availability in nightly? I think it would be more useful to guarantee that the optimisations will not be enabled by default on stable until the opt-out has been available as a no-op for a few stable releases.

loss.) However, with some additional care, applications desiring cross-platform
identical results can potentially achieve that on multiple target platforms. In
particular, applications prioritizing identical, portable results across two or
more target platforms can disable extra floating-point precision entirely.


As I mentioned in a previous comment, mere reproducibility is not always the reason to disable this behaviour. Some algorithms can actually take advantage of the weird special properties of floating-point arithmetic. Such algorithms should remain implementable as Rust libraries, and those should not break just because someone decided they wanted their unrelated floating-point code to be as fast as possible.

@RalfJung
Member

I have some concern with this approach, that I'd at least like to see listed in the "drawbacks".

  • It is kind of unclear how to make "you can use more precision" formal. After all there are observably just 64bits in an f64, so how can it store more precision than that? This comes up not just for a formalization but also when implementing this spec change in Miri. I do not see a reasonable way for Miri to actually give a higher-precision result for compound Rust expressions; after all these are still sequences of MIR instructions that are individually executed and that only have 64bit of state that can be carried from one step to the next.

    We could try to handle floating points similar to pointers, but that would mean that the compiler would have to be very careful about preserving that "provenance" on floating points.

  • And secondly, what this change does is add a whole lot of non-determinism. Basically, any floating-point operation can now non-deterministically be more precise. Given that unsafe code has to be safe under any legal execution of the program, this means that unsafe code working with floating points has to be extremely careful to be sure that it actually works with any allowed precision. Miri will not be able to help here as it seems unfeasible to actually try every permitted precision for every operation (even assuming we solved the problems raised in the first point). I don't know of any sanitizer or logic that is able to properly handle this kind of non-determinism, so I'd expect that for the foreseeable future, any tool we have to increase our confidence in unsafe code is just going to assume that "higher precision" is never used.

I understand the practical concerns leading here, but from a formal perspective and wanting to make the Rust spec precise, this RFC is a step backwards.

That is not at all a Rust-specific problem; -ffast-math has all the same issues. Some of my colleagues are working on a formal treatment of -ffast-math, but they are using basically symbolic semantics for floating-point expressions; it is entirely unclear how to combine this with things like mutable memory, or how to build a program logic for it, or really how to do anything except compile it.^^ But at least -ffast-math is off by default...

back to a lower-precision format.

In general, providing more precision than required will only bring a
calculation closer to the mathematically precise answer, never further away.
Member

@RalfJung RalfJung Oct 17, 2019

That sounds like an extremely strong statement to me that needs further justification. I see no reason to assume such monotonicity here. Different rounding errors happening during a computation might as well just happen to cancel each other such that removing some errors actually increases the error of the final result.

Extreme example: 1 / 10 has a rounding error, but 1.0/10.0 - 1.0/10.0 actually gives the right result. Providing more precision only on one side of the subtraction increases the error of the entire calculation.
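A concrete sketch of that effect, using f32 for the calculation and f64 standing in for the "extra precision" (the exact digits shown are illustrative):

fn main() {
    // Both sides computed in f32: the two rounding errors are identical and cancel,
    // so the mathematically exact answer (0) comes out exactly.
    let both_f32 = 1.0f32 / 10.0 - 1.0f32 / 10.0;
    // Extra precision on only one side: the errors no longer cancel, and the final
    // result moves away from the true value.
    let one_side_wider = (1.0f64 / 10.0 - (1.0f32 / 10.0) as f64) as f32;
    println!("{both_f32:e}");       // 0e0
    println!("{one_side_wider:e}"); // about -1.5e-9
}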


@nestordemeure

I work on measuring the numerical error introduced by floating-point arithmetic, and I believe this RFC could be an all-around improvement.

It is a speed improvement.

It is an accuracy improvement.
While there are operations that might become less accurate, the vast majority of computations will benefit (which is why I believe it should be the default).

It could be a determinism improvement.
As it has been said previously, things are already non-deterministic on some platforms (one thing that has not been said is that you will get different results from debug to release if you introduce vectorization).
But adding a flag to enforce strict floating-point manipulation, to deactivate the proposed optimization, could also, in time, force all platforms to conform to the standard if the flag is set (improving on the current situation). In short, such a flag could remove a form of undefined behavior for users who care about binary reproducibility.

Finally, as others have suggested (and out of the scope of this RFC), I would love to have the ability to locally enforce strict floating-point manipulation. While I believe that binary reproducibility of floating-point result is often misguided, some operations do require absolute control from the user.

@RalfJung
Member

It could be a determinism improvement.

That's a stretch. We could certainly provide binary guarantees for all platforms without introducing non-determinism for all platforms.

As it has been said previously, things are already non-deterministic on some platforms (one thing that has not been said is that you will get different results from debug to release if you introduce vectorization).

Some platforms being ill-behaved does not seem like a good argument for introducing ill-behavedness on sane platforms.^^ (Some good arguments have been made in this thread, but this isn't one.)

In short such a flag could remove a form of undefined behavior for user who care about binary reproducibility.

There's no UB here, right? Just non-determinism.

@nestordemeure

For me it is UB in the sense that the code's behavior is not specified and can, thus, vary on different platforms in a way that is not predictable by the user.

My argument is not that the current situation is bad and thus it does not matter if we worsen it but that the current situation is unregulated and that this could bring in a flag to improve on the current situation when it matters (by specifying the expected behavior) and let things be when it doesn't (where I believe contraction is a better default).

A note for users of other languages: this is *not* the equivalent of the "fast
math" option provided by some compilers. Unlike such options, this behavior
will never make any floating-point operation *less* accurate, but it can make
floating-point operations *more* accurate, making the result closer to the
mathematically exact answer.
Member

Given #2686 (comment), I think this statement should be removed as it is incorrect -- or else there should be an argument for how we plan to guarantee that we never make things less accurate.

@RalfJung
Member

RalfJung commented Oct 25, 2019

For me it is UB in the sense that the code's behavior is not specified and can, thus, vary on different platforms in a way that is not predictable by the user.

UB is a technical term with a specific meaning, and this is not it. I get what you mean but please let's use terminology correctly, lest it become useless. :)

My argument is not that the current situation is bad and thus it does not matter if we worsen it but that the current situation is unregulated and that this could bring in a flag to improve on the current situation when it matters (by specifying the expected behavior) and let things be when it doesn't (where I believe contraction is a better default).

So I think the argument is that this reduces underspecification for platforms which currently do not faithfully implement IEEE semantics? I agree it does. It also makes those platforms not special exceptions any more. However, it does so by pulling all platforms (in their default config) down to the level of (what I consider to be) "misbehaving" platforms. The proposal is to use the lowest common denominator as the new default. (Please correct me if I misread.) Somehow I cannot see that as progress.

Ultimately this is a question of defaults: I would prefer the default to be IEEE with no exception, and then a way to opt-in to deviations from this strict baseline. These deviations would be in the style of -ffast-math: make stuff go faster, and maybe even become more precise, at the expense of predictability.

There seems to be a spectrum of "IEEE conformance", with full conformance on one end, "whatever C does per default" somewhere in the middle (where it will e.g. use x87 instructions), and full fast-math on the other end. If I read this proposal correctly, it proposes to make the Rust default the same as / close to the C default. But I do not see any good reason for picking this particular spot on the spectrum other than "C did it" (the claim that this never reduces accuracy has been refuted, from what I can tell). So if we ignore C, IMO the most obvious choices for the default are "fully conformant" or "fully fast-math", and the RFC does not do a good enough job arguing for why we should pick another default on some "random" spot in the middle.

@gnzlbg
Contributor

gnzlbg commented Oct 25, 2019

It could be a determinism improvement.
As it has been said previously, things are already non-deterministic on some platforms (one thing that has not been said is that you will get different results from debug to release if you introduce vectorization).

Right now, for a particular Rust toolchain and for many particular target platforms, we are very close to having bit-for-bit deterministic results on a wide range of different hardware on that platform.

That is, if a user of your program hits a bug on some weird target on release mode, you can just pick the same toolchain and options, cross-compile to the target, and debug your program under QEMU and be able to reproduce, debug, and fix the issue.

While there are operations that might become less accurate, the vast majority of computations will benefit (which is why I believe it should be the default).

With this RFC, the results depend not only on the optimization level, but also on which optimizations actually get performed. Compiling in debug mode, changing the debug-info level, or debugging using print statements are all things that affect which optimizations get applied and end up altering the floating-point results.

The 32-bit x86 without SSE target is the only target mentioned for which debugging is already hard due to these issues. The only debugging tool one ends up having is "look at the assembly" and hoping that you can figure the bug out from there. That's a bad user experience even for rustc maintainers.

Having reported bugs for these targets and having seen people invest a lot of time into figuring them out I don't see how making all targets equally hard to debug by default is a good value proposition. It'd be much simpler to instead use soft-floats on weird targets by default while adding an option that allows users to opt-in to the x87 FPU (with a big "warning" that documents known issues). I have yet to run into an actual user that wants to do high-performance work on a x86 32-bit CPU without SSE in 2019, but if those users end up appearing, we could always invest time and effort into improving that opt-in option when that happens. That sounds much better to me than lowering the debuggability of all other targets to "32-bit x86 without SSE" standards.

@programmerjake
Member

I think it may be more useful to have IEEE 754 compliant semantics (no FP traps, round-to-nearest-even, FP exception flags ignored as an output -- basically what LLVM assumes by default on most platforms) be the default, and optimizations that change the results (such as fast-math and some forms of vectorization) be opt-in (at least at the function, crate, and binary levels). This will improve reproducibility and debuggability such that results can be relied on cross-platform (excluding differences in NaN encodings) with a minor performance loss on unusual platforms (x86 without SSE). IEEE 754 compliance would not apply to SIMD types by default due to ARM (unfortunately) not supporting denormal numbers by default.

This is similar to how Rust has reproducible results for integer overflow/wrapping cross-platform even though C allows some forms of integer overflow to be undefined behavior.

@eaglgenes101

For me it is UB in the sense that the code's behavior is not specified and can, thus, vary on different platforms in a way that is not predictable by the user.

We call that unspecified behavior around here. Values which do not have a data dependency on the results of these computations are unaffected by the choice of semantics for floating point.

@Ixrec
Contributor

Ixrec commented Oct 25, 2019

For completeness: there's an ongoing discussion over exactly what terminology we should use for this sort of thing in Rust (rust-lang/unsafe-code-guidelines#201), though it'll probably be something similar to "unspecified" or "implementation-defined".

Back on-topic: it seems clear that we should be looking into fine-grained opt-in mechanisms for fast-math-y things before we seriously consider any changes to the global default behavior. In particular, #2686 (comment) is exactly what I think we should do.

@hanna-kruppe

@RalfJung and others who flirt with -ffast-math:

There seems to be a spectrum of "IEEE conformance", with full conformance on one end, "whatever C does per default" somewhere in the middle (where it will e.g. use x87 instructions), and full fast-math on the other end. If I read this proposal correctly, it proposes to make the Rust default the same as / close to the C default. But I do not see any good reason for picking this particular spot on the spectrum other than "C did it" (the claim that this never reduces accuracy has been refuted, from what I can tell). So if we ignore C, IMO the most obvious choices for the default are "fully conformant" or "fully fast-math", and the RFC does not do a good enough job arguing for why we should pick another default on some "random" spot in the middle.

Arguments for the default position on the spectrum are indeed needed, so let me try to supply some. I am still not in favor of this RFC, but I think it is much better than the equivalent of -ffast-math.

First off, -ffast-math allows the optimizer to assume certain values (NaNs, infinities, negative zeros, subnormals) can't exist even though they can actually occur at runtime. This is a (practically unavoidable) gateway to unsoundness, e.g. in LLVM an instruction with nnan flag that sees a NaN produces poison. So we certainly can't have that as our default behavior.

So if we exclude that, we still have the following on top of what's allowed in the RFC (probably a non-exhaustive list, but it should include everything clang -ffast-math does at least):

  1. rewriting ... / x to ... * (1 / x)
  2. approximating built-in functions like sin (less precision for higher performance)
  3. reassociation

IMO there is no strong reason to include or exclude (1) so whatever.

On the other hand, (2) is a very broad license to the compiler (there's no rules about how imprecise it can get) and one that is hard to make good use of in practice (because the compiler generally can't know what level of precision is acceptable for all of its users). Moreover, unless you're targeting a rather specialized chip that has hardware instructions for approximating transcendental functions, you can probably achieve the same effect by just using a different libm, which Rust does not yet support super well but could learn to do without touching the semantics of built-in types and operations.

As for (3), while any change to rounding can break the correctness of some numerical algorithms and snowball into an overall loss of accuracy, increasing precision of intermediate results is rather mild in this respect compared to freely performing reassociation, which can more easily and more drastically affect the results. It is also very important for enabling automatic vectorization of reductions, so it's still commonly enabled, but its benefits are much smaller for code that is not vectorizable.
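For readers unfamiliar with why (3) is the more drastic license, a minimal example of reassociation changing a result (ordinary f64 arithmetic, nothing RFC-specific):

fn main() {
    let (a, b, c) = (1.0f64, 1e16, -1e16);
    let left_to_right = (a + b) + c; // 1.0 is absorbed into 1e16, so this is 0.0
    let reassociated = a + (b + c);  // this is 1.0, the exact answer
    println!("{left_to_right} vs {reassociated}");
}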

For these reasons, I am quite sure something roughly like the RFC's position on the spectrum is a reasonable tradeoff between performance improvements and program reliability. Definitely not the only reasonable option, but clearly superior to full-on -ffast-math as default.

@RalfJung
Member

RalfJung commented Nov 2, 2019

@rkruppe thanks for pointing out that full fast-math can cause UB; I agree that that is indeed a qualitative "step" somewhere on the line of floating point conformance.

@joshtriplett
Member Author

I'd like to formally withdraw this RFC. I still think this is a good idea, and I think having this substantial optimization happen by default is important. But there are many concerns that need to be dealt with, and we'd likely need some better ways to opt out of or into this, at both a library-crate level and a project level. I don't have the bandwidth to do that design work at this time, so I'm going to close this.

If someone would be interested in working on the general issue of floating-point precision, FMA, and similar, I would be thrilled to serve as the liaison for it.

@jedbrown

Is there any way at present to enable floating point contractions and/or associative math without dropping to intrinsics? Seeming inability to write things like a good dot product (e.g., https://godbolt.org/z/Y35sda) without intrinsics is a critical issue for adoption in numerical/scientific computing.

I think attributes of the #[fp(contract = "fast", associative = "on")] variety have the lowest cognitive load for people transitioning from C or Fortran. These can be opt-in at crate/module/function/block granularity. Encoding via types seems more intrusive to me -- by far the most common situation is that numerical libraries/apps want moderate permissiveness enabled everywhere except in some critical places. Note that icc enables -fp-model fast=1 by default, which is nearly analogous to gcc/clang -ffast-math.
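For concreteness, here is roughly the kind of kernel being discussed, written with what stable Rust offers today (a sketch): mul_add gives the fused operation, but the strictly sequential fold still rules out the reassociation an autovectorizer would need.

fn dot(a: &[f64], b: &[f64]) -> f64 {
    // Each step is fused (one rounding), but the accumulation order is fixed,
    // so the loop cannot be vectorized without a license to reassociate.
    a.iter().zip(b).fold(0.0, |acc, (&x, &y)| x.mul_add(y, acc))
}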

@bend-n

bend-n commented Nov 8, 2023

Is there any way at present to enable floating point contractions and/or associative math without dropping to intrinsics? Seeming inability to write things like a good dot product (e.g., godbolt.org/z/Y35sda) without intrinsics is a critical issue for adoption in numerical/scientific computing.

There isn't, but I've made a crate which allows you to use the faster floats without, y'know, great inconvenience.
