Could the compiler automatically re-use Box allocations as an optimization? #93707

Open
scottmcm opened this issue Feb 6, 2022 · 4 comments
Labels
A-box Area: Our favorite opsem complication
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-mir-opt Area: MIR optimizations
C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such
I-slow Issue: Problems and improvements with respect to performance of generated code.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

Member

scottmcm commented Feb 6, 2022

I tried this code: https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=f7612dfa44bc15ef44bd7d759dce1b87

pub fn demo(x: Box<i32>) -> Box<f32> {
    let i = *x;
    drop(x);
    let f = i as f32;
    Box::new(f)
}

Given that i32 and f32 have the same layout (at least on my target) I was hoping that the compiler would be able to -- as an optimization -- avoid the dealloc+alloc dance here.
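The "same layout" condition can be checked directly with `std::alloc::Layout`. A minimal sketch, where `same_layout` is a hypothetical helper name (not a standard library function):

```rust
use std::alloc::Layout;

/// Hypothetical helper: returns true when `A` and `B` have the same
/// size and alignment, which is the precondition for reusing an allocation.
pub fn same_layout<A, B>() -> bool {
    Layout::new::<A>() == Layout::new::<B>()
}
```

On common targets `same_layout::<i32, f32>()` holds, while `same_layout::<i32, u64>()` does not.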

But it looks like, right now at least, it doesn't re-use it even when the optimizer puts the __rust_dealloc+__rust_alloc right next to each other with matching layout arguments. Obviously this wouldn't be guaranteed, but hopefully in a bunch of simple cases it could just work.

; playground::demo
; Function Attrs: nounwind nonlazybind uwtable
define noalias nonnull align 4 float* @_ZN10playground4demo17h6a3fa9b52448831aE(i32* noalias nonnull align 4 %x) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %i = load i32, i32* %x, align 4
  %_2.i.i.i.i = bitcast i32* %x to i8*
  tail call void @__rust_dealloc(i8* nonnull %_2.i.i.i.i, i64 4, i64 4) #4      ; <-- HERE
  %0 = tail call dereferenceable_or_null(4) i8* @__rust_alloc(i64 4, i64 4) #4  ; <-- HERE
  %1 = icmp eq i8* %0, null
  br i1 %1, label %bb3.i.i, label %"_ZN5alloc5boxed12Box$LT$T$GT$3new17hdd831c84a3ba3a35E.exit"

bb3.i.i:                                          ; preds = %start
; call alloc::alloc::handle_alloc_error
  tail call void @_ZN5alloc5alloc18handle_alloc_error17he10e441498789810E(i64 4, i64 4) #5
  unreachable

"_ZN5alloc5boxed12Box$LT$T$GT$3new17hdd831c84a3ba3a35E.exit": ; preds = %start
  %f = sitofp i32 %i to float
  %2 = bitcast i8* %0 to float*
  store float %f, float* %2, align 4
  ret float* %2
}

cc #93653, about APIs to do this manually
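For comparison, here is roughly what doing the reuse by hand could look like today. This is only a sketch: `demo_reuse` is a hypothetical name, it is not the API discussed in #93653, and it relies on `i32` and `f32` having identical size and alignment.

```rust
use std::alloc::Layout;

/// Hypothetical manual version of `demo`: converts the value in place and
/// keeps the original heap allocation instead of freeing and reallocating.
pub fn demo_reuse(x: Box<i32>) -> Box<f32> {
    // Reuse is only sound when both types share size and alignment, because
    // the allocation will eventually be freed with the layout of `f32`.
    assert_eq!(Layout::new::<i32>(), Layout::new::<f32>());

    let i = *x; // `i32` is `Copy`, so `x` is still owned here.
    let raw = Box::into_raw(x) as *mut f32;
    unsafe {
        // SAFETY: `raw` came from a live `Box<i32>` allocated by the global
        // allocator, the layouts match (checked above), and `i32` has no
        // destructor, so overwriting the old value leaks nothing.
        raw.write(i as f32);
        Box::from_raw(raw)
    }
}
```

The returned box points at the same address as the input box, so no call to the allocator is needed in between.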

@scottmcm scottmcm added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Feb 6, 2022
Contributor

Kixunil commented Feb 6, 2022

I repeatedly hear people saying that they need some optimizations to be guaranteed, so a guaranteed solution would be nicer. I had some random thoughts about defining some basic container operations as traits in core::ops, which the compiler would then use in some scenarios. This may be related to a hypothetical DerefMove.

Member Author

scottmcm commented Feb 6, 2022

I don't see this as XOR, @Kixunil. The conversation about APIs that guarantee it can continue in other items, while still having optimizations to do it opportunistically.

We already do some optimizations around allocations, like

pub fn alloc_test(data: u32) {

@nikic nikic added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Feb 18, 2022
Contributor

nikic commented Feb 18, 2022

LLVM doesn't currently have a notion of sized deallocations, which would be necessary for this transform.
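For context on what "sized" means on the Rust side: `std::alloc::dealloc` receives the `Layout` (size and alignment) of the block being freed, which is where the `4, 4` arguments in the IR above come from. A minimal sketch that spells out the dealloc+alloc pair by hand, with `dealloc_then_alloc` as a hypothetical name:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Hypothetical hand-written version of what `demo` does: free the old box
/// with its layout, then allocate a new block with an identical layout.
pub fn dealloc_then_alloc(x: Box<i32>) -> Box<f32> {
    let old_layout = Layout::new::<i32>();
    let new_layout = Layout::new::<f32>();
    let i = *x;
    unsafe {
        // SAFETY: the pointer comes from a `Box<i32>` allocated by the global
        // allocator, and `old_layout` is the layout it was allocated with.
        dealloc(Box::into_raw(x) as *mut u8, old_layout);

        // SAFETY: `new_layout` has non-zero size.
        let p = alloc(new_layout) as *mut f32;
        if p.is_null() {
            handle_alloc_error(new_layout);
        }
        p.write(i as f32);
        // SAFETY: `p` was just allocated by the global allocator with the
        // layout of `f32` and holds an initialized value.
        Box::from_raw(p)
    }
}
```

Because both calls carry the same size and alignment, an optimizer that understood them as a matched pair could in principle keep the original block instead.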


Lancern commented Feb 18, 2022

> LLVM doesn't currently have a notion of sized deallocations, which would be necessary for this transform.

Maybe this optimization can be done on the MIR.

@Noratrieb Noratrieb added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2023
@workingjubilee workingjubilee added the A-box Area: Our favorite opsem complication label Oct 1, 2024
@Enselic Enselic added A-mir-opt Area: MIR optimizations C-discussion Category: Discussion or questions that doesn't represent real issues. labels Dec 27, 2024
@Noratrieb Noratrieb added C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed C-discussion Category: Discussion or questions that doesn't represent real issues. labels Mar 27, 2025
7 participants