opt-level=0 does not do constant folding #136366
Additional info: A Google search for "constant folding" turns up several articles describing the process of evaluating a literal expression and replacing it with a single value, which is the behaviour I am advocating here. Some of these articles even state that this is a "compiler optimization". However, gcc applies this process by default even when the optimization level is set to none, so it seems the gcc developers considered it a desirable feature that doesn't need an option to invoke. gcc does provide an option to inhibit constant folding for floating-point numbers in particular.
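For comparison, a small C program along these lines (file and variable names are placeholders) compiles to a single store of 128 under gcc even at -O0:

```c
/* build with: gcc -O0 -S fold.c */
int main(void) {
    /* gcc's front end folds this to 128 before code generation,
       even with optimization disabled */
    int x = 1 << (5 + 2);
    return x;
}
```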
The semantics of these values is that they are indeed potentially evaluated at runtime. Consider instead https://rust.godbolt.org/z/637djG7cd, which uses the following code:

```rust
#[inline(never)]
pub fn whatever() -> i32 {
    let _blah = const { 1 << (5 + 2) };
    _blah
}
```

The use of the `const` block guarantees that the expression is evaluated at compile time. I am not entirely sure why `1 << 7` is folded without it, however.
The expression is correctly folded with overflow checks disabled. Looks like overflow checks require a higher MIR opt level to be folded away. Seems like a regression from stable to stable.

@rustbot label +A-codegen +C-optimization +I-heavy +T-compiler
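For anyone reproducing this: overflow checks can be disabled in a debug build through the standard Cargo profile key, for example:

```toml
[profile.dev]
overflow-checks = false
```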
Oh, thank you for identifying this as a regression! I was wondering.
This was probably due to the following PR, which changed const prop to happen at higher MIR opt levels: #109900
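For experimentation, the MIR opt level can be raised explicitly on a nightly toolchain via the unstable `-Z` flag (a sketch; whether this restores the folding in any given build is not verified here):

```
rustc +nightly -Zmir-opt-level=2 main.rs
```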
If this sort of regression is a problem for you, you should definitely be using `opt-level=1`.
Closing as not-a-bug. At `opt-level=0`, the compiler doesn't care about binary size (or runtime performance).
This is an unfortunate decision in my opinion. I admit I know little about actual compiler design, but I don't understand how interpreting a constant expression and producing a single value (an int, in the example) can increase compile times compared with generating the object code that computes the result at run time on the target. This is especially important in the embedded field, where debugging on a target with limited flash memory may become more difficult if code size suddenly blooms when `opt-level=0` is set to ensure breakpoints land on the correct lines of source code.
The intuition for why such passes can hurt compile times is that they do a lot of extra checking in all the cases where there is nothing to fold. As for debugging: does `opt-level=1` give an insufficient debugging experience for you?
@Noratrieb: To be honest, I don't know, considering the very limited introduction I've had to this language and compiler. Nevertheless, in my over 25 years of experience with a certain commercial embedded compiler used by my employer, I have encountered many instances where optimization has interfered with useful debugging, and where disabling optimization for even a single compilation unit was enough to make the link fail due to insufficient memory.
Simply put: merely generating code is very fast. It is optimization, and reasoning about code in more complicated ways, that takes more time. Disabling optimization causing the link to fail sounds very weird to me, though the Rust compiler does have a somewhat legalistic approach to code generation at times.
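For the single-compilation-unit case mentioned above, Cargo can scope optimization per package, which may help keep most of a build optimized while only the crate being debugged stays unoptimized (the package name here is a placeholder):

```toml
[profile.dev]
opt-level = 1

# Only the crate under debug is left unoptimized.
[profile.dev.package.my-crate]
opt-level = 0
```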
Consider this code (a minimal example; the variable name is illustrative):
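```rust
fn main() {
    // A literal expression with no variables on the right-hand side.
    let _x = 1 << (5 + 2);
}
```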
When inspecting the assembly (from gdb), I expected to see something like this (x86-64 shown for illustration; exact registers and stack offsets will differ):
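```asm
; the whole expression folded to a single immediate store
mov dword ptr [rsp - 4], 128    ; _x = 1 << (5 + 2) = 128
```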
Instead, I saw something of this shape (a hand-written sketch of the kind of code generated, not the exact listing):
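```asm
; the shift amount is computed and range-checked at run time
mov  ecx, 7                    ; 5 + 2
mov  eax, 1
cmp  ecx, 32                   ; shift-overflow check
jae  .overflow_panic           ; branch to the panic path
shl  eax, cl                   ; compute 1 << 7 at run time
mov  dword ptr [rsp - 4], eax  ; store the result
```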
This is the compiler actually generating code that computes the value of `1 << (5 + 2)` at run time, along with a whole bunch of machinery that appears to relate to the safety of performing such an operation on variable data at run time.
This is unexpected. The entire expression on the right side of the assignment, `1 << (5 + 2)`, involves no variables, so it can be safely resolved at compile time.
Strangely, the following code (the same value written as a single literal shift; the variable name is again illustrative),
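```rust
fn main() {
    let _x = 1 << 7;
}
```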
actually produces the expected result. This isn't consistent behaviour. Why is `1 << 7` resolved at compile time while `1 << (5 + 2)` is not?
If I set the optimization level to 1 in the Cargo.toml as follows:
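```toml
[profile.dev]
opt-level = 1
```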
The expected code is produced for either expression (as long as the variable is used later, of course). However, I do not see how this should be thought of as an optimization; optimization is supposed to apply to executed code that involves actual data in variables or references. Since `1 << (5 + 2)` and `1 << 7` are both literal expressions that resolve to the same value and involve no data, they do not need to be "optimized". Each ought to be treated as a single entity and treated the same.
Rationale:
I suppose this will be viewed as trivial in the context of a PC or server environment. However, in an embedded environment, the additional unnecessary code may consume valuable resources if optimization must be disabled in order to perform debugging activities on the target. It might even cause the unoptimized code to not fit at all on a system with tight memory constraints. Therefore it is desirable, and in my opinion a show stopper, that literal expressions be resolved at compile time regardless of optimization level.
Meta
rustc --version --verbose
Backtrace