Cranelift: avoid load coalescing when an operand is reused #3953
Labels
cranelift:area:x64
Issues related to x64 codegen
cranelift
Issues related to the Cranelift code generator
Comments
abrown added a commit to abrown/wasmtime that referenced this issue (Mar 21, 2022)
Fuzz testing identified a lowering case for CLIF's `icmp` in which the double use of a loaded operand resulted in a register allocation error. This change manually adds `put_in_xmm` to avoid load-coalescing these values and includes a CLIF filetest to trigger this issue. Closes bytecodealliance#3951. I opened bytecodealliance#3953 to discuss a way in which this kind of mistake (i.e., forgetting to add `put_in_*` in certain situations) could be avoided.
abrown added a commit that referenced this issue (Mar 22, 2022)
Fuzz testing identified a lowering case for CLIF's `icmp` in which the double use of a loaded operand resulted in a register allocation error. This change manually adds `put_in_xmm` to avoid load-coalescing these values and includes a CLIF filetest to trigger this issue. Closes #3951. I opened #3953 to discuss a way in which this kind of mistake (i.e., forgetting to add `put_in_*` in certain situations) could be avoided.
alexcrichton added the cranelift and cranelift:area:x64 labels (Mar 23, 2022)
cfallin added a commit to cfallin/wasmtime that referenced this issue (Apr 21, 2022)
…c in lowering.

This PR addresses the longstanding issue with loads trying to merge into compares on x86-64, and more generally, with the lowering framework falsely recognizing "single uses" of one op by another (which would normally allow merging of side-effecting ops like loads) when there is *indirect* duplication.

To fix this, we replace the direct `value_uses` count with a transitive notion of uniqueness (not unlike Rust's `&`/`&mut` and how a `&mut` downgrades to `&` when accessed through another `&`!). A value is used multiple times transitively if it has multiple direct uses, or is used by another op that is used multiple times transitively.

The canonical example of badness is:

```
v1 := load
v2 := ifcmp v1, ...
v3 := selectif v2, ...
v4 := selectif v2, ...
```

both `v3` and `v4` effectively merge the `ifcmp` (`v2`), so even though the use of `v1` is "unique", it is codegenned twice. This is why we ~~can't have nice things~~ can't merge loads into compares (bytecodealliance#3953).

There is quite a subtle and interesting design space around this problem and how we might solve it. See the long doc-comment on `ValueUseState` in this PR for more justification for the particular design here. In particular, this design deliberately simplifies a bit relative to an "optimal" solution: some uses can *become* unique depending on merging, but we don't design our data structures for such updates because that would require significant extra costly tracking (some sort of transitive refcounting). For example, in the above, if `selectif` somehow did not merge `ifcmp`, then we would only codegen the `ifcmp` once into its result register (and use that register twice); then the load *is* uniquely used, and could be merged. But that requires transitioning from "multiple use" back to "unique use" with careful tracking as we do pattern-matching, which I've chosen to make out-of-scope here for now.

In practice, I don't think it will matter too much (and we can always improve later).

With this PR, we can now re-enable load-op merging for compares. A subsequent commit does this.
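The transitive rule in the commit message can be illustrated on a toy IR. The sketch below is only an illustration under stated assumptions: `UseState`, `Inst`, `compute_use_states`, and `canonical_example` are hypothetical names invented here, values are plain `u32` ids, and instructions are assumed to be listed in definition-before-use order. It is not the code from the PR. It marks a value `Multiple` when it has more than one direct use, and then pushes that state down through the operands of the defining instruction.

```rust
use std::collections::HashMap;

/// How often a value is used, counting uses reached through its users.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum UseState {
    Unused,
    Once,
    Multiple,
}

/// Toy instruction: defines `result` from `args`. Hypothetical, not Cranelift's IR.
struct Inst {
    result: u32,
    args: Vec<u32>,
}

/// Compute a transitive use state for every value.
/// Assumes `insts` is ordered so that definitions precede uses.
fn compute_use_states(insts: &[Inst]) -> HashMap<u32, UseState> {
    // Map each value to the index of the instruction that defines it.
    let def: HashMap<u32, usize> = insts
        .iter()
        .enumerate()
        .map(|(idx, inst)| (inst.result, idx))
        .collect();
    let mut states: HashMap<u32, UseState> = insts
        .iter()
        .map(|inst| (inst.result, UseState::Unused))
        .collect();

    for inst in insts {
        for &arg in &inst.args {
            // Bump the direct use state: Unused -> Once -> Multiple.
            let new_state = match states.get(&arg).copied().unwrap_or(UseState::Unused) {
                UseState::Unused => UseState::Once,
                _ => UseState::Multiple,
            };
            states.insert(arg, new_state);

            if new_state == UseState::Multiple {
                // Anything feeding a multiply-used value is itself
                // multiply used, so propagate through the operands.
                let mut stack = vec![arg];
                while let Some(value) = stack.pop() {
                    if let Some(&def_idx) = def.get(&value) {
                        for &operand in &insts[def_idx].args {
                            if states.get(&operand) != Some(&UseState::Multiple) {
                                states.insert(operand, UseState::Multiple);
                                stack.push(operand);
                            }
                        }
                    }
                }
            }
        }
    }
    states
}

/// The example from the commit message, with values numbered 1..=4:
/// v1 := load; v2 := ifcmp v1; v3 := selectif v2; v4 := selectif v2.
fn canonical_example() -> Vec<Inst> {
    vec![
        Inst { result: 1, args: vec![] },
        Inst { result: 2, args: vec![1] },
        Inst { result: 3, args: vec![2] },
        Inst { result: 4, args: vec![2] },
    ]
}
```

On `canonical_example()`, `v1` has a single direct use but ends up `Multiple` because its only user, `v2`, is used twice. That is the case a direct use count misses.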
In examining #3951 and #3934, the main problem in both cases seems to be that the load coalescing logic cannot detect when a loaded operand may be reused. Because of this, we must manually ensure that certain lowerings use `put_in_*` to avoid the problem. This is too manual and thus error-prone (e.g., we unwittingly merged a regalloc error in #3951). As @cfallin notes in #3934:

So load coalescing should be prevented in cases where a lowering would load the operand twice. If we could check how many times a value is used in `is_mergeable_load`, we could avoid the need to remember to use `put_in_*` in these cases.