Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecesary discriminant checks in PartialEq/PartialOrd style code on enums #119014

Closed
Kmeakin opened this issue Dec 16, 2023 · 7 comments · Fixed by #122387
Closed

Unnecesary discriminant checks in PartialEq/PartialOrd style code on enums #119014

Kmeakin opened this issue Dec 16, 2023 · 7 comments · Fixed by #122387
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@Kmeakin
Copy link
Contributor

Kmeakin commented Dec 16, 2023

I tried this code:
compiler explorer

pub enum Enum {
    A(u32),
    B(u32),
    C(u32),
}

#[no_mangle]
pub fn f1(lhs: &Enum, rhs: &Enum) -> bool {
    match (lhs, rhs) {
        (Enum::A(lhs), Enum::A(rhs)) => lhs == rhs,
        (Enum::B(lhs), Enum::B(rhs)) => lhs == rhs,
        (Enum::C(lhs), Enum::C(rhs)) => lhs == rhs,
        _ => false,
    }
}

#[no_mangle]
pub fn f2(lhs: &Enum, rhs: &Enum) -> bool {
    if std::mem::discriminant(lhs) != std::mem::discriminant(rhs) {
        return false;
    }

    match (lhs, rhs) {
        (Enum::A(lhs), Enum::A(rhs)) => lhs == rhs,
        (Enum::B(lhs), Enum::B(rhs)) => lhs == rhs,
        (Enum::C(lhs), Enum::C(rhs)) => lhs == rhs,
        _ => false,
    }
}

#[no_mangle]
pub fn f3(lhs: &Enum, rhs: &Enum) -> bool {
    if std::mem::discriminant(lhs) != std::mem::discriminant(rhs) {
        return false;
    }

    match (lhs, rhs) {
        (Enum::A(lhs), Enum::A(rhs)) => lhs == rhs,
        (Enum::B(lhs), Enum::B(rhs)) => lhs == rhs,
        (Enum::C(lhs), Enum::C(rhs)) => lhs == rhs,
        _ => unsafe { std::hint::unreachable_unchecked() },
    }
}

Code like this is often used for implementations of PartialEq or PartialOrd on enums, so it is important that it be optimized well.

I expected to see this happen:
The functions f1 and f2 should produce equivalent machine code to f3.

Instead, this happened:
f1 and f2 do not produce equivalent machine code.
f1 is missing the early return for when the discriminants do not match.
f2 unnecessarily examines the discriminant of lhs twice!
LLVM should be able to transform f1 into f2, and f2 into f3

Meta

rustc --version --verbose:

rustc 1.76.0-nightly (a96d57bdb 2023-12-15)
binary: rustc
commit-hash: a96d57bdb6d2bb6d233d7d5aaefc2995ab99be01
commit-date: 2023-12-15
host: x86_64-unknown-linux-gnu
release: 1.76.0-nightly
LLVM version: 17.0.6
Backtrace

<backtrace>

@Kmeakin Kmeakin added the C-bug Category: This is a bug. label Dec 16, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Dec 16, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Dec 16, 2023

@rustbot label I-slow

@rustbot rustbot added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Dec 16, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Dec 16, 2023

Interestingly, when Enum is a c-like enum, f1 produces the same code as f3, but f2 doesnt: https://godbolt.org/z/81bGdq6xs

@Kmeakin
Copy link
Contributor Author

Kmeakin commented Dec 16, 2023

Adding a zero-sized field (like ()) even prevents f1 being optimized into f3 (https://godbolt.org/z/3dqzTfofa), but an empty field list doesn't (https://godbolt.org/z/bTK9qa9z1)

@clubby789
Copy link
Contributor

clubby789 commented Dec 16, 2023

Looks similar to #113506, which was fixed upstream.
Ah, a little different
Throwing f1 into upstream opt improves codegen, but still not as good as the optimal case in f3

The PartialEq derive uses essentially the same code as f3 because of this issue, so it would be nice to have it work here

@clubby789 clubby789 added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 16, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Dec 17, 2023

Looks similar to #113506, which was fixed upstream

The improvement I'm looking for here is removing unnecessary repeated inspections of enum discriminants, not merging duplicate RHS per se. This improvement would still be useful even if each enum variant has different data in it:

compiler explorer

pub enum Enum {
    A(u32),
    B(u64),
    C(f32),
    D(f64),
}

@clubby789
Copy link
Contributor

clubby789 commented Jan 10, 2024

Here's some more similar strange behaviour:

#[no_mangle]
pub fn f1(lhs: &Enum, rhs: &Enum) -> bool {
    match (lhs, rhs) {
        (Enum::A(lhs), Enum::A(rhs)) => true,
        (Enum::B(lhs), Enum::B(rhs)) => true,
        (Enum::C(lhs), Enum::C(rhs)) => true,
        _ => false,
    }
}

demonstrates the repeated inspections, but if we remove the unused lhs/rhs bindings (or have bindings in only one branch), we get optimal codegen:
https://godbolt.org/z/fPMx5bYz6

Seems like MIR optimisations manage to improve it to a point where LLVM can optimise it better

@DianQK
Copy link
Member

DianQK commented Feb 7, 2024

Update: In fact, we just need to re-enable the early otherwise branch optimization.


Hmm, this requires multiple optimizations to work together.

As we merge the two i32, we'll end up with better instructions. See https://llvm.godbolt.org/z/EYG46EaxY.

mov     rax, qword ptr [rsi]
cmp     rax, qword ptr [rdi]
sete    al
ret

It may be that part of the optimization fits in MIR, I'll trade off.

@rustbot claim

DianQK added a commit to DianQK/rust that referenced this issue Feb 21, 2024
DianQK added a commit to DianQK/rust that referenced this issue Feb 22, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
… r=<try>

 Re-enable the early otherwise branch optimization

Fixes rust-lang#95162. Fixes rust-lang#119014. Fixes rust-lang#117970.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement.

It should not be UB that we pass in an invalid enum discriminant when calling a function, this is more like LLVM's poison value. UB only when used. Although miri would consider the following code to be UB. (It's fine for miri.)

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=18602870aaeb07cbdf7dfcd2c28961a2

I extended the scenario with scalars and the same target values.

r? compiler
DianQK added a commit to DianQK/rust that referenced this issue Mar 3, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 3, 2024
… r=<try>

 Re-enable the early otherwise branch optimization

Fixes rust-lang#95162. Fixes rust-lang#119014. Fixes rust-lang#117970.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement.

It should not be UB that we pass in an invalid enum discriminant when calling a function, this is more like LLVM's poison value. UB only when used. Although miri would consider the following code to be UB. (It's fine for miri.)

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=18602870aaeb07cbdf7dfcd2c28961a2

I extended the scenario with scalars and the same target values.

r? compiler
DianQK added a commit to DianQK/rust that referenced this issue Mar 12, 2024
DianQK added a commit to DianQK/rust that referenced this issue Mar 12, 2024
DianQK added a commit to DianQK/rust that referenced this issue Mar 13, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 27, 2024
…nch, r=<try>

Re-enable the early otherwise branch optimization

Closes rust-lang#95162. Fixes rust-lang#119014.

This is the first part of rust-lang#121397.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.

r? cjgillot
DianQK added a commit to DianQK/rust that referenced this issue Apr 7, 2024
DianQK added a commit to DianQK/rust that referenced this issue Apr 7, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 7, 2024
…nch, r=cjgillot

Re-enable the early otherwise branch optimization

Closes rust-lang#95162. Fixes rust-lang#119014.

This is the first part of rust-lang#121397.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.

r? cjgillot
DianQK added a commit to DianQK/rust that referenced this issue Apr 8, 2024
@bors bors closed this as completed in 59c808f Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
4 participants