Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cranelift: avoid load coalescing when an operand is reused #3953

Closed
abrown opened this issue Mar 21, 2022 · 0 comments
Closed

Cranelift: avoid load coalescing when an operand is reused #3953

abrown opened this issue Mar 21, 2022 · 0 comments
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Comments

@abrown
Copy link
Contributor

abrown commented Mar 21, 2022

In examining #3951 and #3934, the main problem in both cases seems to be that the load coalescing logic cannot detect when a loaded operand may be reused. Because of this, we must manually ensure that certain lowerings use put_in_* to avoid the problem. This is too manual and this error-prone (e.g., we unwittingly merged a regalloc error in #3951).

As @cfallin notes in #3934:

Merging a load into an op that executes more than once is invalid in general (the two loads may observe different values, which violates the original program semantics because there was only one load originally).

So load coalescing should be prevented in cases where the operand is attempted to be loaded twice. If we could check how many times a value is used in is_mergeable_load, we could avoid the need to remember to use put_in_* in these cases.

abrown added a commit to abrown/wasmtime that referenced this issue Mar 21, 2022
Fuzz testing identified a lowering case for CLIF's `icmp` in which the
double use of a loaded operand resulted in a register allocation error.
This change manually adds `put_in_xmm` to avoid load-coalescing these
values and includes a CLIF filetest to trigger this issue. Closes bytecodealliance#3951.

I opened bytecodealliance#3953 to discuss a way in which this kind of mistake (i.e.,
forgetting to add `put_in_*` in certain situations) could be avoided.
abrown added a commit to abrown/wasmtime that referenced this issue Mar 22, 2022
Fuzz testing identified a lowering case for CLIF's `icmp` in which the
double use of a loaded operand resulted in a register allocation error.
This change manually adds `put_in_xmm` to avoid load-coalescing these
values and includes a CLIF filetest to trigger this issue. Closes bytecodealliance#3951.

I opened bytecodealliance#3953 to discuss a way in which this kind of mistake (i.e.,
forgetting to add `put_in_*` in certain situations) could be avoided.
abrown added a commit that referenced this issue Mar 22, 2022
Fuzz testing identified a lowering case for CLIF's `icmp` in which the
double use of a loaded operand resulted in a register allocation error.
This change manually adds `put_in_xmm` to avoid load-coalescing these
values and includes a CLIF filetest to trigger this issue. Closes #3951.

I opened #3953 to discuss a way in which this kind of mistake (i.e.,
forgetting to add `put_in_*` in certain situations) could be avoided.
@alexcrichton alexcrichton added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Mar 23, 2022
cfallin added a commit to cfallin/wasmtime that referenced this issue Apr 21, 2022
…c in lowering.

This PR addresses the longstanding issue with loads trying to merge
into compares on x86-64, and more generally, with the lowering
framework falsely recognizing "single uses" of one op by
another (which would normally allow merging of side-effecting ops like
loads) when there is *indirect* duplication.

To fix this, we replace the direct `value_uses` count with a
transitive notion of uniqueness (not unlike Rust's `&`/`&mut` and how
a `&mut` downgrades to `&` when accessed through another `&`!). A
value is used multiple times transitively if it has multiple direct
uses, or is used by another op that is used multiple times
transitively.

The canonical example of badness is:

```
    v1 := load
    v2 := ifcmp v1, ...
    v3 := selectif v2, ...
    v4 := selectif v2, ...
```

both `v3` and `v4` effectively merge the `ifcmp` (`v2`), so even
though the use of `v1` is "unique", it is codegenned twice. This is
why we ~~can't have nice things~~ can't merge loads into
compares (bytecodealliance#3953).

There is quite a subtle and interesting design space around this
problem and how we might solve it. See the long doc-comment on
`ValueUseState` in this PR for more justification for the particular
design here. In particular, this design deliberately simplifies a bit
relative to an "optimal" solution: some uses can *become* unique
depending on merging, but we don't design our data structures for such
updates because that would require significant extra costly
tracking (some sort of transitive refcounting). For example, in the
above, if `selectif` somehow did not merge `ifcmp`, then we would only
codegen the `ifcmp` once into its result register (and use that
register twice); then the load *is* uniquely used, and could be
merged. But that requires transitioning from "multiple use" back to
"unique use" with careful tracking as we do pattern-matching, which
I've chosen to make out-of-scope here for now. In practice, I don't
think it will matter too much (and we can always improve later).

With this PR, we can now re-enable load-op merging for compares. A
subsequent commit does this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

No branches or pull requests

2 participants