Suboptimal generated code for testing that a &[u8; 4] contains all zeros #71602
Comments
As far as I can tell, this comes from an “optimization” in the source code of comparing slices:

    impl<A> SlicePartialEq<A> for [A]
    where
        A: PartialEq<A> + BytewiseEquality,
    {
        fn equal(&self, other: &[A]) -> bool {
            if self.len() != other.len() {
                return false;
            }
            // **** THIS PART vv
            if self.as_ptr() == other.as_ptr() {
                return true;
            }
            // **** THIS PART ^^
            unsafe {
                let size = mem::size_of_val(self);
                memcmp(self.as_ptr() as *const u8, other.as_ptr() as *const u8, size) == 0
            }
        }
    }
If you call the function with |
FWIW, you do not need a transmute: https://doc.rust-lang.org/nightly/std/primitive.u32.html#method.from_ne_bytes will do the same (and is safe).

I'm uncertain how much the ptr-equality check there buys us. Obviously, in some cases, it can be a big win, but I suspect the common case in most code isn't comparing literally the same slice. This was added in #61665 with libs team approval; unfortunately the benchmark results are gone now. I would expect it to have little impact on the vast majority of cases though, since it's a really cheap comparison in general. In this case though it's obviously suboptimal.

Personally I'd probably be in favor of dropping that case from std; if there are cases where it's important for people (e.g. interning slices or so), they can likely wrap things in a custom struct.
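A minimal sketch of the safe alternative mentioned above, assuming the goal is only to test whether all four bytes are zero. The function name `is_zero_safe` is hypothetical and does not come from the issue; `u32::from_ne_bytes` is the standard-library method linked in the comment.

```rust
/// Returns true when all four bytes are zero.
/// `is_zero_safe` is a hypothetical name used only for illustration.
pub fn is_zero_safe(bytes: &[u8; 4]) -> bool {
    // from_ne_bytes reinterprets the array as a native-endian u32.
    // Byte order does not matter here: zero is zero in any endianness.
    u32::from_ne_bytes(*bytes) == 0
}
```

Calling `is_zero_safe(&[0; 4])` returns `true`, and any array with a non-zero byte returns `false`. No `unsafe` block is needed because `[u8; 4]` is `Copy` and the conversion is a safe method.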
Adding |
@rustbot claim
Does Rust have const prop for slices?
Remove pointer comparison from slice equality This resurrects rust-lang#71735. Fixes rust-lang#71602, helps with rust-lang#80140. r? `@Mark-Simulacrum`
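To illustrate what removing the pointer comparison means, here is a rough sketch of bytewise slice equality with and without the shortcut. It is not the standard-library implementation: it assumes `u8` elements and uses a safe element-wise loop in place of `memcmp`, and both function names are hypothetical.

```rust
/// Equality with the pointer shortcut discussed in the thread.
/// Hypothetical helper, specialised to u8 for simplicity.
pub fn equal_with_ptr_check(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Same start address and same length means the same bytes.
    if a.as_ptr() == b.as_ptr() {
        return true;
    }
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}

/// The same comparison with the shortcut removed.
/// The result is identical; only the extra branch is gone.
pub fn equal_without_ptr_check(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}
```

Both functions return the same answer for every input, so dropping the shortcut changes only the generated code, not the behaviour. The shortcut helps only when both arguments really are the same slice.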
I expected comparing a [u8; 4] to [0; 4] to generate code identical to testing that a u32 is 0, and indeed that is the case when comparing values. However, when dereferencing is involved the code is longer:
gives:
https://godbolt.org/z/3GaUfy
Transmuting to u32 gives the optimized code:
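For reference, a minimal sketch of the three variants being compared in this report. These are assumptions about the general shape of the code, not the snippets from the report itself, and all three function names are hypothetical.

```rust
/// Compare by value (assumed shape of the case that optimizes well).
pub fn is_zero_value(x: [u8; 4]) -> bool {
    x == [0; 4]
}

/// Compare through a reference (assumed shape of the case with longer code).
pub fn is_zero_ref(x: &[u8; 4]) -> bool {
    *x == [0; 4]
}

/// Reinterpret the bytes as a u32 and compare against zero.
pub fn is_zero_transmute(x: &[u8; 4]) -> bool {
    // SAFETY: [u8; 4] and u32 have the same size, and every bit
    // pattern is a valid u32.
    let word = unsafe { std::mem::transmute::<[u8; 4], u32>(*x) };
    word == 0
}
```

All three functions agree on every input; the report concerns only the machine code each one compiles to, which depends on the compiler version and optimization level.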
This issue has been assigned to @samrat via this comment.