Proposal: Make floats non-NaN by default #11234
Can you address the following operations of IEEE 754 floats?
(Thanks @rtfeldman for pointing these out to me.) How do these operations fit into your vision? It looks like you are saying:
What might that look like in machine code for each of these operations? Can we estimate whether such safety checks in debug builds will be reasonable, as they are with overflow arithmetic, or whether they might be debilitatingly slow?
One possibility would be to always return an optional float for all floating-point operations, whether the operands themselves are optional or not. Then, a coercion from
x64 and arm64 will appropriately set a flag register if a floating-point compare operand is NaN. From there, you should be able to conditionally branch on that flag, for a total of two instructions for a NaN check. It looks like x64's
I like the idea of encoding NaNs in the type system! I have a reservation to do with UB: in something very dynamic like a physics engine for a game, it's very tough to be 100% certain that there's no way to push the system past its limits and get a NaN (likely by first getting an infinity). Due to how NaNs propagate, this kind of glitch tends to be contained to a single object in the game, but if getting a NaN unexpectedly is UB then it becomes a much more serious concern. I'd propose having debug builds check
Great question. My original thinking was that "responsible code" would check for these cases pre-emptively. However, I'm starting to think that panicking upon producing NaN is the wrong way to go. Here's why: it can be more performant to propagate invalid values through a computation and check the result once at the end, rather than to pre-emptively avoid generating them. For example:

```zig
fn sum(vec: []const f32) f32 {
    var accum = vec[0];
    for (vec[1..]) |x| {
        accum += x;
    }
    return accum;
}
```

According to the original proposal, this would panic if `vec` contained an Infinity followed by a -Infinity. Adding a check to every iteration would throw away that performance advantage. The right choice is to make `accum` an optional float.

Amendment

I'd like to amend the proposal so that all floating point operations return an optional float. This would achieve what the original proposal intended: it prevents invalid comparisons with NaN and unintentional storage of NaN, which are the primary footguns responsible for the unexpected behavior mentioned in the introduction. Finally, it's still worth adding safety panics upon generating NaN or Inf in blocks with `@setFloatMode(.Optimize)`.
@MasonRemaley I believe your concern is addressed by the latest amendment, since floating point optionals would have the same safety checks as other optionals in Zig (only in safety-checked builds).
Another thought occurred to me: these assertions map directly onto LLVM's fast-math flags. For example:

```zig
var x: f32 = 1.5;
var accum: f32 = 0;
accum = (accum + x).?; // implies `nnan`
accum = @assertFinite(accum + x); // implies `nnan` and `ninf`
```

could be lowered to the following LLVM IR:

```llvm
%1 = load float, float* %accum, align 4, !dbg !2882
%2 = load float, float* %x, align 4, !dbg !2883
%3 = fadd nnan float %1, %2, !dbg !2884 ; optimizations enabled by nnan
...
%6 = fadd nnan ninf float %4, %5, !dbg !2887 ; optimizations enabled by nnan and ninf
store float %6, float* %accum, align 4, !dbg !2887
```

With these assumptions made explicit, the optimizer can apply `nnan`/`ninf`-based transformations exactly where the programmer opted in.
I think encoding NaN-ability in the type system is an interesting idea, but making it the null value of an optional feels very wrong to me. For one, there are many NaN values, and IEEE 754 defines rules for how the bits which can vary propagate through operations. This can be used in a technique called "NaN packing" or "NaN boxing" to associate extra data with NaN values, which can be used to indicate where they were first discovered or other attributes. Additionally, many CPUs designate one of these bits to indicate "signaling NaN" values, which will generate a CPU exception when used. None of that fits into this model. So that would instead leave us with the
I think the original post gives good arguments why

Maybe obvious/unnecessary, but to distinguish floating-point number types, here's everything one _could_ expect from them:

I think it's reasonable to understand a floating-point number as a union of these states (let me know if there's more still). Constructing and destructuring could be provided by a library, though since floats are language-provided,

Edit: Adding this paragraph because I'm unsure whether my main point came across well enough: if we expose all of these combinations of options in the type system, then, except for ergonomics, the first half would already make the language feature-complete in this regard.

(my thoughts)
I appreciate the detailed thoughts! Your alternative sounds a lot like tracking general bounds for floats (similar to what was proposed for integers in #3806), and I absolutely agree that would solve this problem and potentially bring other benefits, if the ergonomic and other challenges involved can be resolved 🙂 FWIW, Inf-able and NaN-able are the subsets critical for safety/optimization. These are the only subsets that interact
So, type systems help prevent bugs. Looking at an example average function: the first implementation can overflow while the second can't. I think your proposition will force me to add explicit NaN checking to both implementations, but doesn't help me write the correct implementation.
@gwenzek I think "[helping someone] write the correct implementation" is a rather general statement of rather high expectations. Does an easy way to add an assertion meet your criterion?

The original post proposes arithmetic operators to be overloaded for `?f32`. Your particular example is about overflow though, which doesn't intersect with the original proposal at all.

```zig
const finite32 = std.math.FiniteFloatNumber(32); // this type never represents NaN nor infinities
var avg1 = @floatCast(finite32, (x + y) / 2); // asserts no overflow => panics if infinity is reached
var avg2 = @floatCast(finite32, 0.5 * x + 0.5 * y); // asserts no overflow (which passes if x and y are finite)
```

If we added an operator family, say `+<`, the same could be written as:

```zig
var avg1 = (x +< y) /< 2; // asserts no overflow => panics if infinity is reached
var avg2 = 0.5 *< x +< 0.5 *< y; // asserts no overflow (which passes if x and y are finite)
```

Edit: Also note that while
(Tried stitching together some thoughts into a coherent set of paragraphs; apologies if this comes across like rambling. Feel free to dismiss it if you believe it lacks substance.)

I think this proposal follows a trend I've seen in a couple of proposals that tend to run counter to the actual momentum of Zig's design, in particular #7512, which, while not rejected, is pretty barren in terms of discussion, and has a comment under it by Andrew stating that it is unlikely to be accepted. The actual direction Zig seems to be going in doesn't necessarily reject the introduction of new types for particular use cases, but it appears to have a preference for the addition of new operations on existing types to achieve particular semantics (wrapping and saturating arithmetic operators, for example).

This is all to say, I think that it would make more sense to add operators/builtins that allow one to produce the desired behavior, as is suggested by @rohlem, rather than overloading the usage of optionals or error unions.

I would loosely draw a comparison between this and the usages of RAII (Rust, C++, ...) vs `defer`.
Comparison with NaN is the bug I'm trying to avoid:

```zig
// insertion sort -- can you spot the bug?
fn sort_inplace(vec: []f32) void {
    for (vec) |key, i| {
        var j = i;
        while (j > 0 and vec[j - 1] > key) {
            vec[j] = vec[j - 1];
            j = j - 1;
        }
        vec[j] = key;
    }
}
```

When given `{ 1.5, 0.5, 3.5, nan, 0.25, 6.0, 1.5 }`, this returns `{ 0.5, 1.5, 3.5, nan, 0.25, 1.5, 6 }`.

Under this proposal, the function signature of `sort_inplace` would reflect whether it handles NaN. This also intends to improve support for generic implementations. A function that accepts
Perhaps in line with @InKryption's thinking, there's a pared-down version of this proposal where we make comparisons between floats return `error{NaNOperand}!bool`.

The downside is that: (1) if a function asserts a non-NaN comparison internally, it won't be clear from its signature, and (2) generic functions will have to special-case this weird comparison result type. That might be worth it to avoid complicating the types, though.
Whilst I like the idea of controlling the handling of NaNs better, I feel this proposal overloads the concept of the option type in a distinctly unintuitive way from a language design standpoint. There has been a fair amount of discussion on the representation of a NaN and some of the optimisations this proposal can facilitate, but fundamentally I feel that an option represents nullability whilst a NaN is something completely orthogonal to this semantically. Phrased another way, I expect
That's a very compelling argument.

FWIW, the R statistical software uses NaN for missing values as an important optimization, but it also distinguishes between NA (missing value) and NaN (invalid result) in the way you describe. NA gets a special NaN payload, with the caveat that R cannot guarantee a consistent answer to NA-vs-NaN queries after arithmetic.

I'd also be curious to know what you think of the reduced change mentioned above: to simply make comparison between floats yield `error{NaNOperand}!bool`.
Uhh, doesn't changing floats from pretending to be an optional to pretending to be an error union undermine your key advantage here of improved compatibility with generics?
Hashmap would work just fine, for example. Can you think of an example that's broken?
Oh... I think I see what you mean now.
Updated the proposal to use `error{NaN}!f32`. Also factored out two ideas that don't depend on the main proposal:
I'm using NaN boxing in a (dynamic) language runtime I'm building, but I wouldn't let the floating-point unit get ahold of any NaN values that I use to encode other things, so I don't think this proposal has any impact on NaN boxing... or at least I can't see how it would.
Introduction
The idea behind this proposal comes from observing that NaN has some surprising commonalities with null pointers. In combination with the arithmetic/comparison behavior of NaNs, these troubles lead to a number of footguns in real-life code. Examples include:

- sorting algorithms failing for NaN data
- every NaN colliding in a hash map
- NaNs propagating virally in streaming outputs
- invalid image filtering when operating on NaNs
- NaN values persisting after filtering
- parsers failing on NaN inputs
- formatting/display unintentionally exposing NaN to the user
Footguns abound when there is disagreement about whether NaN needs to be handled correctly.
Proposal
Option A: Replace `f32` with `error{NaN}!f32`

- Arithmetic operations (`+ - * / %`) yield `error{NaN}!f32`
- `error{NaN}!f32` can be unwrapped with `try`, `catch`, and `if` like any other error union
- Comparison of `f32` yields `bool`; comparison of `error{NaN}!f32` yields `error{NaNOperand}!bool`
- Other error unions, such as `error{Foo}!f32`, are not treated specially (no arithmetic, no special layout, etc.)
- "NaN-boxing" is to be supported via `getNaNPayload` and `setNaNPayload`

Option B: Make comparisons of floats return `error{NaNOperand}!bool`

This is a minimal change to the language that would force users to explicitly account for NaN in floating point comparisons, which is the central oversight in the above-mentioned bugs.

API Impacts
This means that "NaN-safe" functions can be given a type that reflects their special handling of NaN. Meanwhile, highly optimized routines that don't handle NaN correctly can be given a type that reflects their assumptions:
Example
Meanwhile, the code for a non-NaN-safe version of this function would look exactly like it does today.
Supplemental Ideas
These related ideas can be accepted/rejected separately from the main proposal:
1. Size optimization for `?f32`: Define `?f32` to be stored in a typical float, by assigning `null` a NaN payload with a special value. This is similar to R's "NA" value, except that `?f32` would not support arithmetic or comparison (except with `null`), meaning that NA/NaN propagation is not an issue. It behaves like any other optional.

2. `@assertFinite`/`@assertNonNaN` built-ins: The UB-introducing `@setFloatMode(.Optimize)` assumptions are that inputs/outputs are non-Inf and non-NaN. All other fast-math optimization flags make a different performance/accuracy trade-off, but do not directly introduce `poison`/`undefined` into the program. `@assertFinite` would allow the programmer to make these dangerous assumptions explicit in their code, where it's obvious exactly which operands it affects.

(1) can be particularly important for performance when operating on large, structured data, since it affects how many values can fit into a cache line. This is why it's common in statistical software, including R and Pandas.
Edit: Updated 4/11 to use `error{NaN}!f32` instead of `?f32` and to add the supplemental ideas.