Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize out nop-matches #66234

Open
oli-obk opened this issue Nov 9, 2019 · 6 comments
Open

Optimize out nop-matches #66234

oli-obk opened this issue Nov 9, 2019 · 6 comments
Labels
A-codegen Area: Code generation A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-mir-opt Area: MIR optimizations A-mir-opt-inlining Area: MIR inlining C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@oli-obk
Copy link
Contributor

oli-obk commented Nov 9, 2019

The ? operator is not zero cost right now, because the matches generated by it don't get optimized as well as they could.

Looking at https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=627df52e7c476667ecf9a9831eecf829

I see

_3 = ((_1 as Ok).0: u32);
_4 = _3;
((_0 as Ok).0: u32) = move _4;
discriminant(_0) = 0;

and

_5 = ((_1 as Err).0: i32);
_6 = _5;
((_0 as Err).0: i32) = move _6;
discriminant(_0) = 1;

Which we could reasonably write a peephole optimization for getting transformed to

_0 = _1

each

Then a second optimization could find switchInt terminators where all basic blocks are the same and eliminate the switchInt by replacing it to a goto to the first basic block being switched to.

This will even benefit matches where only one arm is a nop, because that arm will just become a memcopy

@Centril Centril added A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 9, 2019
@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 9, 2019

Ideally we'd not even limit ourselves to going from an enum to the same enum, but moving from one enum to another as long as the layouts match up. Though that may require the insertion of a transmute call. We can start with just figuring out that an enum variant gets all its fields read and then the variant is recreated to produce the exact same variant.

@Centril Centril added I-slow Issue: Problems and improvements with respect to performance of generated code. A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Nov 9, 2019
@Aaron1011
Copy link
Member

Ideally we'd not even limit ourselves to going from an enum to the same enum, but moving from one enum to another as long as the layouts match up.

This could apply to structs as well. For example, in:

struct Foo {
    a: usize,
    b: bool
}

fn bar(val: Foo) -> Foo {
    let new = Foo {
        a: val.a,
        b: val.b
    };
    new
}

bar could be optimized to a single assignment.

@Centril Centril self-assigned this Nov 10, 2019
@ghost
Copy link

ghost commented Nov 10, 2019

Does this issue have something to do with the performance problem described in #37939?

bors added a commit that referenced this issue Nov 11, 2019
[WIP] [mir-opt] asking `?`s in a more optimized fashion

This PR works towards #66234 by providing two optimization passes meant to run in sequence:

- `SimplifyArmIdentity` which transforms something like:
  ```rust
  _LOCAL_TMP = ((_LOCAL_1 as Variant ).FIELD: TY );
  ((_LOCAL_0 as Variant).FIELD: TY) = move _LOCAL_TMP;
  discriminant(_LOCAL_0) = VAR_IDX;
  ```

  into:

  ```rust
  _LOCAL_0 = move _LOCAL_1
  ```

- `SimplifyBranchSame` which transforms `SwitchInt`s to identical basic blocks into a `goto` to the first reachable target.

Together, these are meant to simplify the following into just `res`:
```rust
match res {
    Ok(x) => Ok(x),
    Err(x) => Err(x),
}
```

It should be noted however that the desugaring of `?` includes a function call and so the first pass in this PR relies on inlining to substitute that function call for identity on `x`. Inlining requires `mir-opt-level=2` so this might not have any effect in perf-bot but let's find out.

r? @oli-obk -- This is WIP, but I'd appreciate feedback. :)
@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 11, 2019

Does this issue have something to do with the performance problem described in #37939?

yes. That's what inspired this optimization idea

bors added a commit that referenced this issue Nov 22, 2019
[mir-opt] asking `?`s in a more optimized fashion

This PR works towards #66234 by providing two optimization passes meant to run in sequence:

- `SimplifyArmIdentity` which transforms something like:
  ```rust
  _LOCAL_TMP = ((_LOCAL_1 as Variant ).FIELD: TY );
  ((_LOCAL_0 as Variant).FIELD: TY) = move _LOCAL_TMP;
  discriminant(_LOCAL_0) = VAR_IDX;
  ```

  into:

  ```rust
  _LOCAL_0 = move _LOCAL_1
  ```

- `SimplifyBranchSame` which transforms `SwitchInt`s to identical basic blocks into a `goto` to the first reachable target.

Together, these are meant to simplify the following into just `res`:
```rust
match res {
    Ok(x) => Ok(x),
    Err(x) => Err(x),
}
```

It should be noted however that the desugaring of `?` includes a function call and so the first pass in this PR relies on inlining to substitute that function call for identity on `x`. Inlining requires `mir-opt-level=2` so this might not have any effect in perf-bot but let's find out.

r? @oli-obk -- This is WIP, but I'd appreciate feedback. :)
@Mark-Simulacrum
Copy link
Member

We've landed the first iteration here (#66282) and the followup for enabling this more widely is likely closing out #66855 so I'm closing this issue in favor of that.

@Centril
Copy link
Contributor

Centril commented Dec 6, 2019

There's still some work here to be done:

  • We check for an exact match in the enum types, but different types with the same layouts are also fine from an operational POV.

  • ? being optimized requires -Zmir-opt-level=2 so that the inlining pass triggers. We might want to consider #[rustc_mir_inline] or some such on targeted functions so that we do optimize ? without -Zmir-opt-level=2. Alternatively, we fix the inlining pass somehow.

@Centril Centril reopened this Dec 6, 2019
@Centril Centril removed their assignment Feb 13, 2020
@jonas-schievink jonas-schievink added the A-mir-opt Area: MIR optimizations label Mar 23, 2020
@ecstatic-morse ecstatic-morse added the A-mir-opt-inlining Area: MIR inlining label Apr 27, 2020
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-mir-opt Area: MIR optimizations A-mir-opt-inlining Area: MIR inlining C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants