
JIT: Quiet NaNs on x86 hosts #75338

Merged 1 commit into dotnet:main on Sep 12, 2022

Conversation

jakobbotsch
Member

Since the ABI specifies that float values are returned on the x87 stack, and since the fld instruction quiets NaNs, it is very easy for the bit representation of float literals to change during JIT compilation. For example, inlining of a function or link-time code generation changing the ABI of a function can cause such behavior. This changes the JIT to quiet all floating-point NaNs stored as JIT IR constants on this host platform.

Also adds printing of the bit representation when NaNs are displayed as part of the IR.

Fix #75221
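As an illustration only (not the actual code added by this change; the helper name is hypothetical), quieting amounts to forcing the top mantissa bit on whenever a value is a NaN:

#include <cstdint>
#include <cstring>
#include <cmath>

// Hypothetical sketch: if the value is any kind of NaN, set the most
// significant mantissa bit (the "quiet" bit) so that a later x87 fld/fstp
// round trip can no longer change the bit pattern.
static double QuietNaN(double value)
{
    if (std::isnan(value))
    {
        uint64_t bits;
        std::memcpy(&bits, &value, sizeof(bits));
        bits |= 0x0008000000000000ULL; // quiet bit of the double mantissa
        std::memcpy(&value, &bits, sizeof(value));
    }
    return value;
}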

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 9, 2022
@ghost ghost assigned jakobbotsch Sep 9, 2022
@ghost

ghost commented Sep 9, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@jakobbotsch
Member Author

jakobbotsch commented Sep 9, 2022

@jkotas @tannergooding does this seem reasonable? It potentially introduces differences between cross-compilation and host-compilation scenarios (although those are likely there today in this scenario too). If you think the fix is reasonable we can also do this generally on all platforms instead.

// returns the value is inlined or not by the C++ compiler. To get around the
// nondeterminism we quiet the NaNs ahead of time as a best-effort fix.
//
double FloatingPointUtils::normalize(double value)
Member

This seems reasonable to me and should be within the allowed handling by IEEE 754.


The potential downside is that some advanced user might be trying to utilize such constants with an advanced SIMD algorithm and the relevant floating-point bitwise manipulation instructions (andps, orps, xorps, etc.).

I expect the most likely bitwise values used will be:

  • AllBitsSet - A qNaN
  • NegativeZero - Only sign bit set
  • PositiveInfinity - All exponent bits set
  • All mantissa bits set - A subnormal number

None of these should be prone to complications from this normalization. All mantissa bits set has a separate potential issue where it can be subject to an FPU mode that flushes subnormal values to zero. However, we don't support that mode, and none of our current target platforms do this except for Arm32, which doesn't have a relevant control switch for SIMD (scalar logic is not subject to the same issue).
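For reference, a sketch of the single-precision bit patterns listed above (these values follow from the IEEE 754 layout; they are not code from this PR):

#include <cstdint>

// IEEE 754 single-precision bit patterns for the constants mentioned above.
const uint32_t AllBitsSet       = 0xFFFFFFFFu; // sign, exponent, and mantissa all set: a quiet NaN
const uint32_t NegativeZero     = 0x80000000u; // only the sign bit set
const uint32_t PositiveInfinity = 0x7F800000u; // all exponent bits set, mantissa zero
const uint32_t AllMantissaBits  = 0x007FFFFFu; // exponent zero: the largest subnormal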


Unfortunately we cannot do something clever like utilizing the relevant MMX instructions to bypass FLD. This is because, while mm0 maps to FP(0), the MMX registers only expose the lower 64 bits and set the upper 16 bits to all ones, thereby changing the value that an x87 FPU instruction would see.

We could, however, change the managed calling convention to just not use the x87 FPU entirely. Then this value normalization would only be a consideration when interoperating with native code. That may also be faster and simplify some of the overall handling, given that we don't use the x87 FPU for anything other than returning floating-point values on x86.

@jakobbotsch
Member Author
Sep 9, 2022

We could, however, change the managed calling convention to just not use the x87 FPU entirely. Then this value normalization would only be a consideration when interoperating with native code. That may also be faster and simplify some of the overall handling, given that we don't use the x87 FPU for anything other than returning floating-point values on x86.

The original problem here is in the JIT's own code as compiled by MSVC. We get non-determinism because the code produced by MSVC will sometimes quiet the NaNs and sometimes won't, depending on inlining/link-time code generation.

Changing the managed calling convention here is a bit more work than I'd like to take on at this point, but I agree that it is likely users can hit the same behavior with the JIT that we are hitting with MSVC, depending on inlining.
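To illustrate the underlying behavior, here is a hedged sketch (assuming x86 MSVC, where double return values travel through ST(0); the function name is made up):

#include <cstdint>
#include <cstdio>
#include <cstring>

// Returns a signaling-NaN constant. On x86 the value is returned on the x87
// stack; the fld of a double-precision sNaN raises a masked invalid-operation
// exception and produces the quieted NaN, so the caller may observe a
// different bit pattern than the one stored here. Whether this happens can
// depend on whether the call is inlined.
__declspec(noinline) double MakeSignalingNaN()
{
    uint64_t snanBits = 0x7FF0000000000001ULL; // exponent all ones, quiet bit clear
    double d;
    std::memcpy(&d, &snanBits, sizeof(d));
    return d;
}

int main()
{
    double d = MakeSignalingNaN();
    uint64_t observed;
    std::memcpy(&observed, &d, sizeof(observed));
    // On an x86 host this may print 0x7FF8000000000001 (quiet bit set) instead
    // of the original 0x7FF0000000000001.
    std::printf("0x%016llX\n", (unsigned long long)observed);
    return 0;
}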

@jkotas
Member

jkotas commented Sep 9, 2022

We seem to be wrapping more and more C/C++ float operations to make them predictable for JIT purposes. Should we introduce a new type in the JIT codebase for this purpose?

  • This type would use int32/int64 as underlying type
  • All floating point operations would be implemented on this type in a predictable way
  • Change all places where we pass around floating point constants in the JIT to use this type
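A rough sketch of what such a wrapper could look like (illustrative only; the type name and shape below are not from the JIT codebase):

#include <cstdint>
#include <cstring>

// Illustrative sketch: a double wrapper whose storage is a raw 64-bit integer,
// so that copying, comparing, and hashing constants never goes through host
// floating-point registers.
struct JitDouble
{
    uint64_t bits;

    static JitDouble FromDouble(double d)
    {
        JitDouble result;
        std::memcpy(&result.bits, &d, sizeof(result.bits));
        return result; // struct return avoids the x87 double-return path
    }

    double ToDouble() const
    {
        double d;
        std::memcpy(&d, &bits, sizeof(d));
        return d;
    }

    bool operator==(const JitDouble& other) const
    {
        return bits == other.bits; // bitwise equality, not FP comparison semantics
    }

    // Arithmetic would be implemented here in a platform-independent way
    // (for example via a software floating-point library) so results do not
    // depend on the host compiler's code generation choices.
};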

@jakobbotsch
Member Author

We seem to be wrapping more and more C/C++ float operations to make them predictable for JIT purposes. Should we introduce a new type in the JIT codebase for this purpose?

  • This type would use int32/int64 as underlying type
  • All floating point operations would be implemented on this type in a predictable way

I've talked about this a bit with @AndyAyersMS who did this on a previous compiler. Not sure if we have a ready-to-go implementation of such a type somewhere that we could just use.

  • Change all places where we pass around floating point constants in the JIT to use this type

I don't think it would be enough to only treat floating point constants this way, at least if we want to guarantee determinism between debug/checked/release and cross-compiling vs host-compiling. The other recent example I can remember in this area was #58063 (comment), which was just internal JIT calculations using floats. Both of these examples are x86 only, so it seems like this would primarily be an investment in x86 (but maybe there are other cases I have missed?).

@AndyAyersMS
Member

I've talked about this a bit with @AndyAyersMS who did this on a previous compiler. Not sure if we have a ready-to-go implementation of such a type somewhere that we could just use.

I don't know of any.

this would primarily be an investment in x86

I wonder if we could mitigate most of these issues by building the x86 JIT with __vectorcall? As far as I know the MSVC compiler (when SSE2 is allowed) tries to avoid x87 except at ABI boundaries or when it runs into inline asm or similar.

@AndyAyersMS
Member

I think we should take this fix, but we should keep thinking about whether something more comprehensive is warranted.

@jakobbotsch
Member Author

I wonder if we could mitigate most of these issues by building the x86 JIT with __vectorcall? As far as I know the MSVC compiler (when SSE2 is allowed) tries to avoid x87 except at ABI boundaries or when it runs into inline asm or similar.

Right, I mentioned this as a solution in #75221 too. I tried it but hit many linker errors, so I sort of gave up on it as I wasn't even sure if changing the default was feasible for us. It does fix the problem if you mark the problematic functions as __vectorcall.
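For reference, a minimal sketch of the kind of per-function annotation being described (MSVC-specific; the function name is made up):

// With __vectorcall on x86 MSVC, float/double parameters and return values
// travel in XMM registers rather than on the x87 stack, so no fld/fstp is
// involved and NaN bit patterns are preserved across the call.
double __vectorcall PassThrough(double value)
{
    return value; // returned in XMM0 under __vectorcall
}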

@jakobbotsch
Member Author

I'm going to merge this as-is at this point. If we identify more broadly impacting issues due to the use of floats, we should definitely consider investing in some of the alternatives to solve this for good.

@jakobbotsch jakobbotsch merged commit 99d21b9 into dotnet:main Sep 12, 2022
@jakobbotsch jakobbotsch deleted the fix-75221 branch September 12, 2022 09:40
@ghost ghost locked as resolved and limited conversation to collaborators Oct 12, 2022
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Development

Successfully merging this pull request may close these issues.

JIT: check/release diff in arm SPMI libraries_tests.pmi