[RISCV][AArch64] llvm has 9-12% performance degradation compared to gcc for spec2017/500.perlbench_r #121813

Open
@michaelmaitland

Description

I've been looking into this for some time now and wanted to file an issue in the hope that (a) others are seeing the same problem, (b) we can discuss how to close this gap, and (c) other targets can share insights from prior work that may help here.

| Comparison | Regression (%) |
| --- | --- |
| LLVM No Vec vs GCC No Vec | 12.57 |
| LLVM No Vec vs GCC Vec | 12.19 |
| LLVM Vec vs GCC No Vec | 9.72 |
| LLVM Vec vs GCC Vec | 9.32 |

It looks like there is a common scalar-related regression. These numbers are at -O3 with LTO enabled. The regression is visible both in the qemu dynamic instruction count and on hardware, and it impacts both in-order and out-of-order RISC-V cores. According to a talk at the 2021 LLVM Dev Meeting (see slide 3), this issue also appears to exist on AArch64. I'm not sure whether the regression is present on other targets.
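The issue does not state how the percentages in the table were derived. As a minimal sketch, assuming the common convention of measuring the LLVM result relative to the GCC baseline on a lower-is-better metric (run time or dynamic instruction count), the calculation could look like this. The function name and the input values are hypothetical placeholders, not measurements from this issue.

```python
def regression_percent(llvm_metric: float, gcc_metric: float) -> float:
    """Percent by which the LLVM metric exceeds the GCC baseline.

    Assumes a lower-is-better metric such as run time or dynamic
    instruction count. This convention is an assumption; the issue
    does not say which formula was used.
    """
    if gcc_metric <= 0:
        raise ValueError("baseline metric must be positive")
    return (llvm_metric - gcc_metric) * 100.0 / gcc_metric


# Hypothetical placeholder values, not data from the issue.
print(regression_percent(112.0, 100.0))
```

A positive result means LLVM is slower (or executes more instructions) than GCC under this convention.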

The cycle count of the S_regmatch function is far higher on LLVM than on GCC. In this function, the number of dynamic stack spills and reloads is over 50% higher on LLVM than on GCC, while the static number of spills and reloads is relatively similar. The number of dynamic branches is also relatively similar, but there are 34% more dynamic jumps.
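For anyone who wants to reproduce counts like these, here is a rough sketch of classifying a dynamic instruction trace. It is not the tooling used for the numbers above. It assumes a hypothetical text trace with one disassembled RISC-V instruction per line (for example `ld a0, 16(sp)`), with compressed instructions printed under their base mnemonics. It also treats every sp-relative load or store as a reload or spill, which overcounts because such accesses include callee-saved register saves and stack-passed arguments.

```python
import re
from collections import Counter

# Assumed trace format: "<mnemonic> <operands>", one executed instruction
# per line. Real qemu or simulator output will need its own parsing.
SP_RELATIVE = re.compile(r"\(sp\)\s*$")

LOADS = {"lb", "lbu", "lh", "lhu", "lw", "lwu", "ld", "flw", "fld"}
STORES = {"sb", "sh", "sw", "sd", "fsw", "fsd"}
BRANCHES = {"beq", "bne", "blt", "bge", "bltu", "bgeu",
            "beqz", "bnez", "blez", "bgez", "bltz", "bgtz"}
JUMPS = {"j", "jal", "jr", "jalr", "ret", "call", "tail"}


def classify(trace_lines):
    """Count sp-relative stores/loads, conditional branches, and jumps."""
    counts = Counter()
    for line in trace_lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0].lower()
        operands = parts[1] if len(parts) > 1 else ""
        if mnemonic in STORES and SP_RELATIVE.search(operands):
            counts["stack_store"] += 1   # approximates spills
        elif mnemonic in LOADS and SP_RELATIVE.search(operands):
            counts["stack_load"] += 1    # approximates reloads
        elif mnemonic in BRANCHES:
            counts["branch"] += 1
        elif mnemonic in JUMPS:
            counts["jump"] += 1
    return counts


# Tiny made-up trace, for illustration only.
sample = [
    "sd s0, 8(sp)",
    "ld a0, 0(a1)",
    "beqz a0, 16",
    "ld s0, 8(sp)",
    "j -32",
    "ret",
]
print(dict(classify(sample)))
```

Running the same classifier over traces from the LLVM and GCC builds, restricted to the address range of one function, would give per-category dynamic counts that can be compared directly.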

The issue solved by #90819 helps close the performance gap by a few percent, but there is still a significant way to go. I have run many other experiments that ruled out other possible causes, and I could discuss them on a call or add follow-up comments.
