Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pattern matching on enum creates unecessary jump table #115742

Closed
Kmeakin opened this issue Sep 10, 2023 · 8 comments
Closed

Pattern matching on enum creates unecessary jump table #115742

Kmeakin opened this issue Sep 10, 2023 · 8 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes

Comments

@Kmeakin
Copy link
Contributor

Kmeakin commented Sep 10, 2023

I tried this code:
Say we define an AST type for expressions, each carrying a span to give its position in the original source code:

pub type ByteSpan = (u32, u32);

pub enum Expr<'core> {
    Error(ByteSpan),
    Int(ByteSpan, u64),
    Array(ByteSpan, &'core [Self]),
    Call(ByteSpan, &'core Self, &'core [Self]),
}

pub fn span(expr: &Expr) -> ByteSpan {
    match expr {
        Expr::Error(span, ..)
        | Expr::Int(span, ..)
        | Expr::Array(span, ..)
        | Expr::Call(span, ..) => *span,
    }
}

I expected to see this happen:
The ByteSpan is placed at the same offset for every variant of the enum, so the final assembly for span should be:

example::span:
        mov     eax, dword ptr [rdi + 4]
        mov     edx, dword ptr [rdi + 8]
        ret

Instead, this happened:
A jump table is generated, even though every entry in the jump table is identical (godbolt):

example::span:
        mov     eax, dword ptr [rdi]
        lea     rcx, [rip + .LJTI0_0]
        movsxd  rax, dword ptr [rcx + 4*rax]
        add     rax, rcx
        jmp     rax
.LBB0_1:
        lea     rcx, [rdi + 4]
        mov     eax, dword ptr [rdi + 4]
        mov     edx, dword ptr [rcx + 4]
        ret
.LJTI0_0:
        .long   .LBB0_1-.LJTI0_0
        .long   .LBB0_1-.LJTI0_0
        .long   .LBB0_1-.LJTI0_0
        .long   .LBB0_1-.LJTI0_0

If the last 2 variants are removed, the optimal code is produced. If only the last variant is removed, the code is better, but still suboptimal:

example::span:
        mov     eax, dword ptr [rdi]
        test    rax, rax
        je      .LBB0_2
        cmp     eax, 1
.LBB0_2:
        lea     rcx, [rdi + 4]
        mov     eax, dword ptr [rdi + 4]
        mov     edx, dword ptr [rcx + 4]
        ret

Meta

rustc --version --verbose:

rustc 1.74.0-nightly (1e746d774 2023-09-07)
binary: rustc
commit-hash: 1e746d7741d44551e9378daf13b8797322aa0b74
commit-date: 2023-09-07
host: x86_64-unknown-linux-gnu
release: 1.74.0-nightly
LLVM version: 17.0.0
Compiler returned: 0
@Kmeakin Kmeakin added the C-bug Category: This is a bug. label Sep 10, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 10, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Sep 10, 2023

@rustbot label I-slow

@rustbot rustbot added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Sep 10, 2023
@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Sep 11, 2023
@dianqk
Copy link
Member

dianqk commented Sep 11, 2023

Looking at the IR form https://godbolt.org/z/G79jaedqa, I believe this is a duplicate issue of #113506.

@rustbot claim

@the8472
Copy link
Member

the8472 commented Sep 11, 2023

The ByteSpan is placed at the same offset for every variant of the enum, so the final assembly for span should be

The ordering of the variant payloads is not guaranteed to be in source order for repr(Rust) enums

@Kmeakin
Copy link
Contributor Author

Kmeakin commented Sep 12, 2023

The ByteSpan is placed at the same offset for every variant of the enum, so the final assembly for span should be

The ordering of the variant payloads is not guaranteed to be in source order for repr(Rust) enums

But in this case rustc did decide to put it at the same offset for each variant, so no jump table should be needed

@Noratrieb Noratrieb added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes label Sep 20, 2023
@nikic
Copy link
Contributor

nikic commented Feb 14, 2024

Not fixed by the LLVM 18 upgrade. Keeping the label though, as LLVM 19 should fix this thanks to getelementptr i8 canonicalization.

@erikdesjardins
Copy link
Contributor

Fixed in nightly by #121665: https://godbolt.org/z/rPhrxTq61

I don't think this needs a test, since tests/codegen/issues/issue-121719-common-field-offset.rs added in that PR is basically the same.

@dianqk
Copy link
Member

dianqk commented Mar 6, 2024

Fixed in nightly by #121665: https://godbolt.org/z/rPhrxTq61

I don't think this needs a test, since tests/codegen/issues/issue-121719-common-field-offset.rs added in that PR is basically the same.

Yes. InstCombine now do this transformation. It's different from #113506.

@dianqk dianqk removed their assignment Mar 6, 2024
@WaffleLapkin
Copy link
Member

Closing as per #115742 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes
Projects
None yet
Development

No branches or pull requests

9 participants