-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Take advantage of known-valid-align in layout.rs #99136
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
(rust-highfive has picked a reviewer for you, use r? to override) |
Whoops, forgot an |
library/core/src/alloc/layout.rs
Outdated
if size > isize::MAX as usize - (align.as_nonzero().get() - 1) { | ||
return Err(LayoutError); | ||
} | ||
// SAFTEY: as above, this check is sufficient. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pondering: does this need to be duplicated?
Could from_size_align
become something like
pub const fn from_size_align(size: usize, align: usize) -> Result<Self, LayoutError> {
let align = ValidAlign::try_from(align).map_err(|_| LayoutError)?;
from_size_valid_align(size, align)
}
to avoid the repetition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be correct, but for the time being I'm erring on the side of making small changes so there's a chance of saying whether a change is positive or not.
Let's ask perf, as the comment says! @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 079d3eb with merge bb4f8945315c007e9dddba9b61c64f37eeb201c1... |
☀️ Try build successful - checks-actions |
Queued bb4f8945315c007e9dddba9b61c64f37eeb201c1 with parent e1b348f, future comparison URL. |
Finished benchmarking commit (bb4f8945315c007e9dddba9b61c64f37eeb201c1): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
Not surprised, but this shows as an improvement on ctfe stress tests and neutral everywhere else. That makes it potentially worth it to merge. |
@CAD97 If this had recovered a bunch of the primary regressions from 95295 I'd have just merged it immediately, but since it's a more nuanced win, would you mind trying the deduplicated version? I'd rather not have to repeat the code if we didn't have to, and the deduplicated version should still save CTFE work by not having to check the align. |
The CTFE perf win is probably spurious, it's been flaky lately. For example, the cachegrind diff of ctfe-stress-5 check full from this PR looks like:
And the cachegrind diff of ctfe-stress-5 check full from #97841 (comment), a PR that shouldn't have impacted CTFE at all (since it just inlines a non-const Windows-only function), is pretty close to the opposite of that:
|
This comment has been minimized.
This comment has been minimized.
1c37e0e
to
2d84703
Compare
(sigill was me messing up and inserting a wrong |
☀️ Try build successful - checks-actions |
Queued 95ff54ef0b0a45df48b6f4150236ce2faa6c4ae4 with parent 38b7215, future comparison URL. |
Finished benchmarking commit (95ff54ef0b0a45df48b6f4150236ce2faa6c4ae4): comparison url. Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
So this is pretty clearly perf-nuetral. To that end, it seems better to just use the safe constructor everywhere, but I'm going to hold back from making that call. At a minimum, having the smaller codepath might make it easier to get other perf gains, but honestly who knows. Intuition doesn't work in this file. |
I think it absolutely makes sense to do this -- I might not have wanted to do it for a new @bors r+ rollup=iffy (perf came back neutral, but it's still a file that can be highly impactful) |
☀️ Test successful - checks-actions |
Finished benchmarking commit (87588a2): comparison url. Instruction count
Max RSS (memory usage)This benchmark run did not return any relevant results for this metric. CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression Footnotes |
…oshtriplett Reoptimize layout array This way it's one check instead of two, so hopefully (cc rust-lang#99117) it'll be simpler for rustc perf too 🤞 Quick demonstration: ```rust pub fn demo(n: usize) -> Option<Layout> { Layout::array::<i32>(n).ok() } ``` Nightly: <https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=e97bf33508aa03f38968101cdeb5322d> ```nasm mov rax, rdi mov ecx, 4 mul rcx seto cl movabs rdx, 9223372036854775805 xor esi, esi cmp rax, rdx setb sil shl rsi, 2 xor edx, edx test cl, cl cmove rdx, rsi ret ``` This PR (note no `mul`, in addition to being much shorter): ```nasm xor edx, edx lea rax, [4*rcx] shr rcx, 61 sete dl shl rdx, 2 ret ``` This is built atop `@CAD97` 's rust-lang#99136; the new changes are cb8aba66ef6a0e17f08a0574e4820653e31b45a0. I added a bunch more tests for `Layout::from_size_align` and `Layout::array` too.
[stable] Prepare 1.64.0 release This PR prepares the 1.64.0 stable release builds. In addition to bumping the channel and including the latest release notes changes, this PR also backports the following PRs: * rust-lang#100852 * rust-lang#101366 * rust-lang#101468 * rust-lang#101922 This PR also reverts the following PRs, as decided in rust-lang#101899 (comment): * rust-lang#95295 * rust-lang#99136 (followup to the previous PR) r? `@ghost` cc `@rust-lang/release`
An attempt to improve perf by @nnethercote's approach suggested in #99117