Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function gets optimized away unless marked #[inline(never)] pub fn #122516

Closed
spcan opened this issue Mar 14, 2024 · 9 comments
Closed

Function gets optimized away unless marked #[inline(never)] pub fn #122516

spcan opened this issue Mar 14, 2024 · 9 comments
Labels
C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@spcan
Copy link

spcan commented Mar 14, 2024

I am using this function to decode gray coded u64 into decimal values (also u64).

fn decimal(mut gray: u64) -> u64 {
    // Copy the gray value.
    let mut temp = gray;

    while temp != 0 {
        // Shift right temp.
        temp >>= 1;

        // XOR the gray value.
        gray ^= temp;
    }

    gray
}

Target architecture is thumbv8.main-none-eabihf with RUSTFLAGS:

rustflags = [
    "-C", "link-arg=--nmagic",
    "-C", "link-arg=-Tlink.x",
    #"-C", "link-arg=-Tmemory.x",
    "-C", "link-arg=-Tdefmt.x",

    # Code optimizations
    #"-Z", "trap-unreachable=no",
    #"-C", "inline-threshold=5",
    #"-C", "no-vectorize-loops",

    # Fixes LLVM bug on function outlining
    "-C", "llvm-args=--enable-machine-outliner=never",
]

The resulting code should be a function that transforms the input gray code into a decimal. Instead, the function gets optimized to directly return the input u64. The final assembly of this function is:

00001190 <_ZN7lpc55006timers7ostimer7decimal17h543acbca5175c913E>:
    1190:	4770      	bx	lr

To make the code correctly appear, the function must be marked with #[inline(never)], #[no_mangle] and extern "Rust". Additionally, if the function is a member of a struct impl not matter what it always get optimized away.

Correct function signature:

#[inline(never)]
#[no_mangle]
extern "Rust" fn decimal(mut gray: u64) -> u64 {}

Correct assembly:

00001190 <decimal>:
    1190:	ea50 0201 	orrs.w	r2, r0, r1
    1194:	bf02      	ittt	eq
    1196:	2000      	moveq	r0, #0
    1198:	2100      	moveq	r1, #0
    119a:	4770      	bxeq	lr
    119c:	b5d0      	push	{r4, r6, r7, lr}
    119e:	af02      	add	r7, sp, #8
    11a0:	4604      	mov	r4, r0
    11a2:	460b      	mov	r3, r1
    11a4:	ea5f 0c53 	movs.w	ip, r3, lsr #1
    11a8:	ea81 010c 	eor.w	r1, r1, ip
    11ac:	ea4f 0e34 	mov.w	lr, r4, rrx
    11b0:	ea80 000e 	eor.w	r0, r0, lr
    11b4:	ea5e 020c 	orrs.w	r2, lr, ip
    11b8:	d01d      	beq.n	11f6 <decimal+0x66>
    11ba:	08a2      	lsrs	r2, r4, #2
    11bc:	ea81 0193 	eor.w	r1, r1, r3, lsr #2
    11c0:	ea42 7283 	orr.w	r2, r2, r3, lsl #30
    11c4:	4050      	eors	r0, r2
    11c6:	ea52 0293 	orrs.w	r2, r2, r3, lsr #2
    11ca:	bf1f      	itttt	ne
    11cc:	ea81 01d3 	eorne.w	r1, r1, r3, lsr #3
    11d0:	08e2      	lsrne	r2, r4, #3
    11d2:	ea42 7243 	orrne.w	r2, r2, r3, lsl #29
    11d6:	4050      	eorne	r0, r2
    11d8:	bf18      	it	ne
    11da:	ea52 02d3 	orrsne.w	r2, r2, r3, lsr #3
    11de:	d00a      	beq.n	11f6 <decimal+0x66>
    11e0:	0922      	lsrs	r2, r4, #4
    11e2:	ea42 7403 	orr.w	r4, r2, r3, lsl #28
    11e6:	ea81 1113 	eor.w	r1, r1, r3, lsr #4
    11ea:	4060      	eors	r0, r4
    11ec:	ea54 1213 	orrs.w	r2, r4, r3, lsr #4
    11f0:	ea4f 1313 	mov.w	r3, r3, lsr #4
    11f4:	d1d6      	bne.n	11a4 <decimal+0x14>
    11f6:	bdd0      	pop	{r4, r6, r7, pc}

Meta

rustc --version --verbose:

rustc 1.78.0-nightly (3cbb93223 2024-03-13)
binary: rustc
commit-hash: 3cbb93223f33024db464a4df27a13c7cce870173
commit-date: 2024-03-13
host: x86_64-unknown-linux-gnu
release: 1.78.0-nightly
LLVM version: 18.1.0
@spcan spcan added the C-bug Category: This is a bug. label Mar 14, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 14, 2024
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue labels Mar 14, 2024
@workingjubilee
Copy link
Member

@spcan Can you explain the specific problem? Is this causing trouble for you when actually producing a binary based on this library code? What cargo commands were you using, and what is the rest of your Cargo.toml like? Rust code adhering to the Rust ABI is not necessarily promising to reveal a legible public interface to arbitrary external code unless given the kind of coaxing you have, as the language does rely heavily on inlining for optimizations, and so generally when building your code, we kind of assume you want to not have public interfaces unless you... declare them public.

And all that's needed, really, is declaring the relevant types and functions pub fn and then #[inline(never)], as I have done in this Godbolt. As you can see, the identifiers are mangled, so there's that problem, and #[no_mangle] is recommended if you want other languages to be able to jump to these easily.

@workingjubilee workingjubilee changed the title Function gets incorrectly optimized away unless marked #[no_mangle], #[inline(never)] and extern "Rust" Function gets optimized away unless marked #[inline(never)] pub fn Mar 14, 2024
@spcan
Copy link
Author

spcan commented Mar 14, 2024

The main problem is that the code should never be optimized away because it's operating on the input. The full gray code decoding algorithm gets removed and substituted by a single bx lr instruction (which is equivalent in Rust to a function that only contains a return statement). Even when the compiler inlines the decimal function, it was inlined as a return statement, not as the instructions it should have.

The assembly of the function that calls decimal is:

00001200 <_ZN7lpc55006timers7ostimer6Reader4read17h00ffcc431ae36bd1E>:
    1200:	b5d0      	push	{r4, r6, r7, lr}
    1202:	af02      	add	r7, sp, #8
    1204:	f24d 0000 	movw	r0, #53248	; 0xd000
    1208:	b672      	cpsid	i
    120a:	f2c4 0002 	movt	r0, #16386	; 0x4002
    120e:	6804      	ldr	r4, [r0, #0]
    1210:	6843      	ldr	r3, [r0, #4]
    1212:	b662      	cpsie	i
    1214:	f36f 235f 	bfc	r3, #9, #23
    1216:	f000 b800 	b.w	1218 <decimal>

Then the decimal function only contained the return statement equivalent.

The #[inline(never)], #[no_mangle] and extern "Rust" were shots in the dark to try and force the compiler to not optimize the code away.

@pacak
Copy link
Contributor

pacak commented Mar 14, 2024

How are you using that function when it gets optimized away?

@spcan
Copy link
Author

spcan commented Mar 14, 2024

Removing any of #[no_mangle], #[inline(never)] or the extern it gets optimized away.

@pacak
Copy link
Contributor

pacak commented Mar 14, 2024

  1. With this Automatically enable cross-crate inlining for small functions #116505 if it's a small function it will be optimized away in a crate when compiled as a library. It should still be compiled correctly if you start using it in a binary against non constant input.
  2. It might be optimized into something trivial (bx lr) if you are using it against a constant input.

Do you have any examples where this optimizing away means your rust code starts producing invalid values?

@jieyouxu jieyouxu removed the S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue label Mar 14, 2024
@spcan
Copy link
Author

spcan commented Mar 14, 2024

The input to that decimal function is a gray coded monotonic timer so it will always take a different value. The u64 is read via two ldr (code lines 120e and 1210) instructions generated with a core::ptr::read_volatile call.

@spcan
Copy link
Author

spcan commented Mar 14, 2024

Doing a cargo clean on the repository seems to have removed this error?????

@spcan
Copy link
Author

spcan commented Mar 14, 2024

I'm using nightly, so maybe it was doing something weird with a previously compiled dependency? IDK, seems to be fixed for now

@spcan
Copy link
Author

spcan commented Mar 14, 2024

Closing this as the error does not appear to be occurring anymore. Will reopen if it reappears

@spcan spcan closed this as completed Mar 14, 2024
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants