-
Notifications
You must be signed in to change notification settings - Fork 822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
In LLVM backend, track which floats are guaranteed to be arithmetic, which makes the canonicalization a no-op. #934
Conversation
pub struct ExtraInfo { | ||
state: u8, | ||
} | ||
impl ExtraInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to distinguish between f32
and f64
NaNs in ExtraInfo
, since the opcodes and values are already typed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the SIMD opcodes which aren't typed. They're all just v128, so we might go from a F64x2Add to a F32x4Mul with no cast in between.
v128_to_f32x4
checks for a pending f64 canonicalization and applies it, while not applying a pending f32 canonicalization. v128_to_f64x2
does the opposite.
446b48c
to
6b2f17b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Looks like there's a few small mistakes but otherwise good!
CHANGELOG.md
Outdated
@@ -24,6 +24,7 @@ Special thanks to [@newpavlov](https://github.com/newpavlov) and [@Maxgy](https: | |||
- [#939](https://github.com/wasmerio/wasmer/pull/939) Fix bug causing attempts to append to files with WASI to delete the contents of the file | |||
- [#940](https://github.com/wasmerio/wasmer/pull/940) Update supported Rust version to 1.38+ | |||
- [#923](https://github.com/wasmerio/wasmer/pull/923) Fix memory leak in the C API caused by an incorrect cast in `wasmer_trampoline_buffer_destroy` | |||
- [#934](https://github.com/wasmerio/wasmer/pull/934) Simplify float expressions in the LLVM backend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changelog entry should be moved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/llvm-backend/src/state.rs
Outdated
// machine, but which might not be in the LLVM value. The conversion to | ||
// arithmetic NaN is pending. It is required for correctness. | ||
PendingF32NaN, | ||
pub fn pending_f32_nan() -> ExtraInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions can be const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/llvm-backend/src/state.rs
Outdated
ExtraInfo { state: 8 } | ||
} | ||
|
||
pub fn has_pending_f32_nan(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions may be able to be const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are. Done.
lib/llvm-backend/src/state.rs
Outdated
ExtraInfo::None | ||
ExtraInfo { state: 0 } | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as derive(Default)
on ExtraInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it is! Done.
lib/llvm-backend/src/state.rs
Outdated
|
||
fn bitor(self, other: Self) -> Self { | ||
assert!(!(self.has_pending_f32_nan() && other.has_pending_f64_nan())); | ||
assert!(!(self.has_pending_f64_nan() && other.has_pending_f32_nan())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only want these in debug mode, use debug_assert!
, these asserts
will run on every use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
…ackend. Not wired up yet.
It seemed like a good idea at the time, but in practice we discard the extra info all or almost all of the time. This also introduces a new bug. In an operation like multiply, it's valid to multiply two values, one with a pending NaN and one without. As written, in the SIMD case (because of the two kinds of pending in play), we assert.
Unfortunately, this is quite buggy. For something as simple as F32Sub, to combine two ExtraInfos, we want to add a new pending_f32_nan(), unless both of the inputs are arithmetic_f32(). In this commit, we incorrectly calculate that we don't need a pending_f32_nan if either one of the inputs was arithmetic_f32().
We want to ignore the incoming pending NaN state (since the pending will propagate to the output if there was one on the input), and we want to add a new pending NaN state if we can (that is to say, if it isn't cancelled out by both inputs having arithmetic state). Do this by discarding the pending states on the inputs, intersecting them (to keep only the arithmetic state), then union in a pending nan state (which might do nothing, if it's arithmetic). If the above sounds confusing, keep in mind that when a value is arithmetic, the act of performing the "NaN canonicalization" is a no-op. Thus, being arithmetic cancels out pending NaN states.
Fix a bug in Operator::Select and add a comment to explain the intention. Use derived default for ExtraInfo. Make ExtraInfo associated functions const. Turn two asserts into debug_asserts.
6b2f17b
to
ff73c5d
Compare
bors r+ |
934: In LLVM backend, track which floats are guaranteed to be arithmetic, which makes the canonicalization a no-op. r=nlewycky a=nlewycky # Description This is a reimplementation of the patch in PR #651. Extend state.rs ExtraInfo to track more information about floats. In addition to tracking whether the value has a pending canonicalization of NaNs, also track whether the value is known to be arithmetic (which includes infinities, regular values, and non-signalling NaNs (aka. "arithmetic NaNs" in the webassembly spec)). When the value is arithmetic, the correct sequence of operations to canonicalize the value is a no-op. Therefore, we create a lattice where pending+arithmetic=arithmetic. Also, this extends the tracking to track all values, including non-SIMD integers. That's why there are more places where pending canonicalizations are applied. Looking at c-wasm-simd128-example, this provides no performance change to the non-SIMD case (takes 58s on my noisy dev machine). The SIMD case drops from 46s to 29s. # Review - [ ] Add a short description of the the change to the CHANGELOG.md file Co-authored-by: Nick Lewycky <nick@wasmer.io>
Build succeeded |
Description
This is a reimplementation of the patch in PR #651.
Extend state.rs ExtraInfo to track more information about floats. In addition to tracking whether the value has a pending canonicalization of NaNs, also track whether the value is known to be arithmetic (which includes infinities, regular values, and non-signalling NaNs (aka. "arithmetic NaNs" in the webassembly spec)). When the value is arithmetic, the correct sequence of operations to canonicalize the value is a no-op. Therefore, we create a lattice where pending+arithmetic=arithmetic.
Also, this extends the tracking to track all values, including non-SIMD integers. That's why there are more places where pending canonicalizations are applied.
Looking at c-wasm-simd128-example, this provides no performance change to the non-SIMD case (takes 58s on my noisy dev machine). The SIMD case drops from 46s to 29s.
Review