-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inline most of the code paths for conversions with boxed slices #49555
Conversation
This helps with the specific problem described in rust-lang#49541, obviously without making any large change to how inlining works in the general case. Everything involved in the conversions is made `#[inline]`, except for the `<Vec<T>>::into_boxed_slice` entry point which is made `#[inline(always)]` after checking that duplicating the function mentioned in the issue prevented its inlining if I only annotate it with `#[inline]`. For the record, that function was: ```rust pub fn foo() -> Box<[u8]> { vec![0].into_boxed_slice() } ``` To help the inliner's job, we also hoist a `self.capacity() != self.len` check in `<Vec<T>>::shrink_to_fit` and mark it as `#[inline]` too.
self.buf.shrink_to_fit(self.len); | ||
if self.capacity() != self.len { | ||
self.buf.shrink_to_fit(self.len); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that these don't need the #[inline]
attribute because they're already generic, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check and report back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this wasn't necessary. I removed it and compiled a module with foo
duplicated, here is the result of compiling that to assembly:
.section __TEXT,__text,regular,pure_instructions
.p2align 4, 0x90
__ZN55_$LT$alloc..heap..Heap$u20$as$u20$core..heap..Alloc$GT$3oom17h2c0e1b8273a8e186E:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
callq ___rust_oom
ud2
.cfi_endproc
.p2align 4, 0x90
__ZN5alloc4heap15exchange_malloc28_$u7b$$u7b$closure$u7d$$u7d$17h1d897acb095d410dE:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $32, %rsp
movq 16(%rdi), %rax
movq %rax, -8(%rbp)
movq (%rdi), %rax
movq 8(%rdi), %rcx
movq %rcx, -16(%rbp)
movq %rax, -24(%rbp)
leaq -24(%rbp), %rdi
callq __ZN55_$LT$alloc..heap..Heap$u20$as$u20$core..heap..Alloc$GT$3oom17h2c0e1b8273a8e186E
ud2
.cfi_endproc
.globl _foo
.p2align 4, 0x90
_foo:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $48, %rsp
leaq -40(%rbp), %rdx
movl $1, %edi
movl $1, %esi
callq ___rust_alloc
testq %rax, %rax
je LBB2_1
movb $0, (%rax)
movl $1, %edx
addq $48, %rsp
popq %rbp
retq
LBB2_1:
movq -32(%rbp), %rax
movq -24(%rbp), %rcx
movq %rcx, -8(%rbp)
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movq -8(%rbp), %rcx
movq %rcx, -24(%rbp)
movq %rax, -32(%rbp)
leaq -40(%rbp), %rdi
callq __ZN5alloc4heap15exchange_malloc28_$u7b$$u7b$closure$u7d$$u7d$17h1d897acb095d410dE
ud2
.cfi_endproc
.globl _bar
.p2align 4, 0x90
_bar:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $48, %rsp
leaq -40(%rbp), %rdx
movl $1, %edi
movl $1, %esi
callq ___rust_alloc
testq %rax, %rax
je LBB3_1
movb $0, (%rax)
movl $1, %edx
addq $48, %rsp
popq %rbp
retq
LBB3_1:
movq -32(%rbp), %rax
movq -24(%rbp), %rcx
movq %rcx, -8(%rbp)
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movq -8(%rbp), %rcx
movq %rcx, -24(%rbp)
movq %rax, -32(%rbp)
leaq -40(%rbp), %rdi
callq __ZN5alloc4heap15exchange_malloc28_$u7b$$u7b$closure$u7d$$u7d$17h1d897acb095d410dE
ud2
.cfi_endproc
.subsections_via_symbols
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the code change here also still necessary to get the same size reductions?
I'm personally ok on adding |
src/liballoc/vec.rs
Outdated
@@ -636,6 +638,7 @@ impl<T> Vec<T> { | |||
/// assert_eq!(slice.into_vec().capacity(), 3); | |||
/// ``` | |||
#[stable(feature = "rust1", since = "1.0.0")] | |||
#[inline(always)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, removing this one makes the into_boxed_slice
calls not be optimised anymore when the function is duplicated. Even just #[inline]
makes rustc not inline the calls anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm so in general #[inline(always)]
is an antipattern as it can cause compile time to baloon very quickly. Mind if I poke around a bit at the examples you were looking at to see what's going on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
#![crate_type = "lib"]
#[no_mangle]
pub fn foo() -> Box<[u8]> {
vec![0].into_boxed_slice()
}
#[no_mangle]
pub fn bar() -> Box<[u8]> {
vec![0].into_boxed_slice()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ok I'm tempted though to leave this as-is and let LLVM decide these sorts of things, we've had bad experiences in the past optimizing too much for microbenchmarks unfortunately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean this specific #[inline(always)]
attribute, or the whole PR? Either is fine with me, to be heard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh just this one particular instance of the attribute. The function is generic so it's already candidate to be inlined everywhere, and it sounds like #[inline]
doesn't otherwise work here so #[inline(always)]
is going directly against LLVM's heuristics, which has caused a lot of problems for us historically
@bors: r+ rollup Thanks! |
📌 Commit b59fa0d has been approved by |
Inline most of the code paths for conversions with boxed slices This helps with the specific problem described in rust-lang#49541, obviously without making any large change to how inlining works in the general case. Everything involved in the conversions is made `#[inline]`, except for the `<Vec<T>>::into_boxed_slice` entry point which is made `#[inline(always)]` after checking that duplicating the function mentioned in the issue prevented its inlining if I only annotate it with `#[inline]`. For the record, that function was: ```rust pub fn foo() -> Box<[u8]> { vec![0].into_boxed_slice() } ``` To help the inliner's job, we also hoist a `self.capacity() != self.len` check in `<Vec<T>>::shrink_to_fit` and mark it as `#[inline]` too.
Rollup of 8 pull requests Successful merges: - #49555 (Inline most of the code paths for conversions with boxed slices) - #49606 (Prevent broken pipes causing ICEs) - #49646 (Use box syntax instead of Box::new in Mutex::remutex on Windows) - #49647 (Remove `underscore_lifetimes` and `match_default_bindings` from active feature list) - #49931 (Fix incorrect span in `&mut` suggestion) - #49959 (rustbuild: allow building tools with debuginfo) - #49965 (Remove warning about f64->f32 cast being potential UB) - #49994 (Remove unnecessary indentation in rustdoc book codeblock.) Failed merges:
This helps with the specific problem described in #49541, obviously without making any large change to how inlining works in the general case.
Everything involved in the conversions is made
#[inline]
, except for the<Vec<T>>::into_boxed_slice
entry point which is made#[inline(always)]
after checking that duplicating the function mentioned in the issue prevented its inlining if I only annotate it with#[inline]
.For the record, that function was:
To help the inliner's job, we also hoist a
self.capacity() != self.len
check in<Vec<T>>::shrink_to_fit
and mark it as#[inline]
too.