-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse the mem::swap optimizations to speed up slice::rotate #42819
Conversation
Exposes the swapping logic from PR 40454 as `pub unsafe fn ptr::swap_nonoverlapping` under feature swap_nonoverlapping This is most helpful for compound types where LLVM didn't vectorize the loop. Highlight: bench slice::rotate_medium_by727_strings gets 38% faster.
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
friendly ping @sfackler, pinging you on IRC too! |
@bors r+ |
📌 Commit 47fa016 has been approved by |
⌛ Testing commit 47fa016 with merge 240dcc4d98ed5e20f3d3177ac77debc65aa131e9... |
💔 Test failed - status-travis |
@bors: retry
|
Reuse the mem::swap optimizations to speed up slice::rotate This is most helpful for compound types where LLVM didn't vectorize the loop. Highlight: bench slice::rotate_medium_by727_strings gets 38% faster. Exposes the swapping logic from PR #40454 as `pub unsafe fn ptr::swap_nonoverlapping` under library feature `swap_nonoverlapping` #42818. (The new method seemed plausible, and was the simplest way to share the logic. I'm not attached to it, though, so let me know if a different way would be better.)
☀️ Test successful - status-appveyor, status-travis |
pub unsafe fn swap_nonoverlapping<T>(x: *mut T, y: *mut T, count: usize) { | ||
let x = x as *mut u8; | ||
let y = y as *mut u8; | ||
let len = mem::size_of::<T>() * count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use .checked_mul(count).unwrap()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's necessary, because of the safety requirement that the arguments refer to valid ranges of memory -- if it's valid, its length must fit in a usize
.
This is a workaround for rust-lang#42778, which was git-bisected to rust-lang#40454's optimizations to `mem::swap`, later moved to `ptr` in rust-lang#42819. Natively compiled rustc couldn't even compile stage1 libcore on powerpc64 and s390x, but they work fine without this `repr(simd)`. Since powerpc64le works OK, it seems probably related to being big-endian. The underlying problem is not yet known, but this at least makes those architectures functional again in the meantime. cc @arielb1
Disable big-endian simd in swap_nonoverlapping_bytes This is a workaround for rust-lang#42778, which was git-bisected to rust-lang#40454's optimizations to `mem::swap`, later moved to `ptr` in rust-lang#42819. Natively compiled rustc couldn't even compile stage1 libcore on powerpc64 and s390x, but they work fine without this `repr(simd)`. Since powerpc64le works OK, it seems probably related to being big-endian. The underlying problem is not yet known, but this at least makes those architectures functional again in the meantime. cc @arielb1
This is most helpful for compound types where LLVM didn't vectorize the loop. Highlight: bench slice::rotate_medium_by727_strings gets 38% faster.
Exposes the swapping logic from PR #40454 as
pub unsafe fn ptr::swap_nonoverlapping
under library featureswap_nonoverlapping
#42818.(The new method seemed plausible, and was the simplest way to share the logic. I'm not attached to it, though, so let me know if a different way would be better.)