[AVR] Optimize 'call' to 'rcall' for short programs #54508

Closed
ghost opened this issue Mar 23, 2022 · 9 comments
Labels
backend:AVR clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl'

Comments

@ghost

ghost commented Mar 23, 2022

As a possible optimization, we could look into using rcall instead of call instructions when the target is close enough. For example, here: https://godbolt.org/z/rEz9j71dq (apparently avr-gcc doesn't do this optimization).

int foo(int a, int b) {
    return a + b;
}

int bar(int a, int b) {
    return foo(a, b) + 3;
}

If -ffunction-sections is not used, rcall is both shorter in code size and faster in execution speed.
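
To make "close enough" concrete: in the AVR instruction set, rcall encodes a signed 12-bit word offset k and transfers control to PC + k + 1, so it reaches about ±2K words from the call site, and it occupies one 16-bit word where call occupies two. Below is a minimal sketch of that range check. The helper name `rcall_reaches` and the constant names are hypothetical, not taken from LLVM or binutils, and the sketch ignores the address wrap-around that small-flash devices allow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encoded sizes of the two call instructions, in bytes. */
enum { AVR_CALL_BYTES = 4, AVR_RCALL_BYTES = 2 };

/* rcall carries a signed 12-bit word offset k; the target is PC + k + 1. */
enum { RCALL_K_MIN = -2048, RCALL_K_MAX = 2047 };

/*
 * Hypothetical helper: returns true if an rcall located at byte address
 * `site` can reach byte address `target`. Both addresses are assumed to be
 * even (instruction aligned). Wrap-around at the end of flash is ignored.
 */
bool rcall_reaches(uint32_t site, uint32_t target)
{
    int32_t site_word = (int32_t)(site / 2);
    int32_t target_word = (int32_t)(target / 2);
    int32_t k = target_word - (site_word + 1);

    return k >= RCALL_K_MIN && k <= RCALL_K_MAX;
}
```

With this helper, `rcall_reaches(0, 4096)` holds (k = 2047), while `rcall_reaches(0, 4098)` does not (k = 2048). Each call that can be shortened saves `AVR_CALL_BYTES - AVR_RCALL_BYTES` = 2 bytes.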

@benshi001
Member

benshi001 commented Jun 16, 2022

We need to investigate whether this can be done by linker relaxation.

@benshi001 benshi001 assigned benshi001 and unassigned benshi001 Aug 15, 2022
@benshi001
Member

What's more, we can check whether other relaxation optimizations can be done in lld for AVR.

@sprintersb

sprintersb commented Nov 7, 2022

> apparently avr-gcc doesn't do this optimization

Link with -mrelax, which performs other optimizations, too. Note that some sections, such as .vectors or .jumptables, must not be optimized.

Also, if you are relaxing, the assembler must not relax by itself; all relaxations must be postponed until link time.

@benshi001
Member

> > apparently avr-gcc doesn't do this optimization
>
> Link with -mrelax, which performs other optimizations, too. Note that some sections, such as .vectors or .jumptables, must not be optimized.
>
> Also, if you are relaxing, the assembler must not relax by itself; all relaxations must be postponed until link time.

Sure. Thanks.

@benshi001
Member

It seems impossible to do this call -> rcall transform with clang + gnu-avr-ld.

@sprintersb

What's the problem? All that has to be done is to pass -mrelax to ld.

@benshi001
Member

> What's the problem? All that has to be done is to pass -mrelax to ld.

Sure. It is my mistake. clang's AVR driver does not handle -mrelax/-mno-relax properly.

@EugeneZelenko EugeneZelenko added the clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' label Feb 23, 2023
@llvmbot
Member

llvmbot commented Feb 23, 2023

@llvm/issue-subscribers-clang-driver

benshi001 added a commit that referenced this issue Feb 24, 2023
This is in accordance with avr-gcc: even if '-mno-relax' is specified
to avr-gcc, this flag is still added to the output relocatables.

With this flag set, the GNU ld will perform long call -> short call
optimization for AVR, otherwise not.

Fixes #54508

Reviewed By: MaskRay, jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D144617
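
One way to check whether an object file carries the flag described in the commit message is to inspect `e_flags` in its ELF header. The sketch below assumes the flag in question is the one binutils calls `EF_AVR_LINKRELAX_PREPARED` (value 0x80); the commit message does not name it, so treat that as an assumption. The function and constant names are hypothetical, and only 32-bit little-endian ELF headers are handled.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    ELF32_HEADER_SIZE = 52,
    ELF_MACHINE_AVR = 83,               /* EM_AVR */
    AVR_FLAG_LINKRELAX_PREPARED = 0x80  /* assumed: EF_AVR_LINKRELAX_PREPARED */
};

/* Reads an n-byte little-endian unsigned value (n <= 4). */
static uint32_t read_le(const unsigned char *p, size_t n)
{
    uint32_t v = 0;
    while (n > 0) {
        v = (v << 8) | p[n - 1];
        n--;
    }
    return v;
}

/*
 * Hypothetical helper: returns 1 if the header marks the file as prepared
 * for linker relaxation, 0 if the flag is clear, and -1 if the buffer is
 * not a 32-bit little-endian AVR ELF header.
 */
int avr_linkrelax_prepared(const unsigned char *hdr, size_t len)
{
    if (len < ELF32_HEADER_SIZE)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[4] != 1 || hdr[5] != 1)     /* ELFCLASS32, ELFDATA2LSB */
        return -1;
    if (read_le(hdr + 18, 2) != ELF_MACHINE_AVR)   /* e_machine */
        return -1;

    /* e_flags sits at byte offset 36 in an ELF32 header. */
    return (read_le(hdr + 36, 4) & AVR_FLAG_LINKRELAX_PREPARED) != 0;
}
```

A caller would read the first 52 bytes of a relocatable object into a buffer and pass it to `avr_linkrelax_prepared`; a result of 1 would suggest GNU ld is allowed to perform the long call -> short call optimization on that file.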
CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Feb 24, 2023