Vec::dedup_by optimization #82191
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
Would be a
Dropping IMO it would be better to remove (but not drop!)
@sfackler this is ready for review
I think this changeset needs significant test coverage (which I believe this function lacks right now). In particular, we have no tests for the case where the comparator panics.
Benchmarks or machine code comparisons would be helpful as well.
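A rough sketch of the kind of panic test being asked for here, assuming a hypothetical drop-counting element type `Counted` and a hypothetical helper `drops_after_panicking_dedup` (neither is from the PR). It only uses the public `Vec::dedup_by` API and checks that every element is dropped exactly once even though the comparator panics part-way through:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical element type that counts how many times it is dropped.
struct Counted {
    key: u32,
    drops: Arc<AtomicUsize>,
}

impl Drop for Counted {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Runs `dedup_by` with a comparator that panics on its `panic_on_call`-th
/// invocation, then drops the vector. Returns (drops observed, elements created).
/// `panic_on_call` must be between 1 and `keys.len() - 1` for the panic to fire.
pub fn drops_after_panicking_dedup(keys: &[u32], panic_on_call: usize) -> (usize, usize) {
    let drops = Arc::new(AtomicUsize::new(0));
    let mut vec: Vec<Counted> = keys
        .iter()
        .map(|&key| Counted { key, drops: Arc::clone(&drops) })
        .collect();

    let mut calls = 0usize;
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        vec.dedup_by(|a, b| {
            calls += 1;
            if calls == panic_on_call {
                panic!("comparator panicked on purpose");
            }
            a.key == b.key
        });
    }));
    assert!(result.is_err(), "the comparator was expected to panic");

    // Whatever survived the unwinding is dropped here.
    drop(vec);
    (drops.load(Ordering::SeqCst), keys.len())
}
```

If the two returned numbers differ, an element was either leaked or dropped twice during unwinding. A real test in `library/alloc/tests/vec.rs` would probably also vary the call on which the comparator panics.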
I have measured the performance to be around 10-20% better on average.
Yeah, I mean demonstrating the improvement with some output from benchmark runs (together with links to, or commits containing, said benchmark code).
library/alloc/tests/vec.rs
Outdated
@@ -2141,4 +2267,4 @@ fn test_extend_from_within_panicing_clone() {
    std::panic::catch_unwind(move || vec.extend_from_within(..)).unwrap_err();

    assert_eq!(count.load(Ordering::SeqCst), 4);
}
}
Should have a trailing newline.
I think `x.py fmt` removes it.
Huh, that's surprising. Would be the first time I see it doing so.
Should I include the benchmark inside
Yeah, that sounds like a good place for it.
@bors r+ Thank you!
📌 Commit b0092bc has been approved by
btw, the failing check is not my fault 😅
Try adding a comment to trigger the check again?
No need to bother, that's just the pre-land check, which is not super relevant at this point. bors will get to it eventually.
Vec::dedup_by optimization

Now `Vec::dedup_by` drops items in place as it goes through them. In my benchmarks it is around 10% faster when `T` is small, with no major regression otherwise.

I used `ptr::copy` instead of a conditional `ptr::copy_nonoverlapping`, because the latter had some odd performance issues on my Ryzen laptop (it was 50% slower there than on an Intel Sandy Bridge laptop). It would be good if someone could reproduce these results.
Rollup of 11 pull requests

Successful merges:

- rust-lang#82191 (Vec::dedup_by optimization)
- rust-lang#82270 (Emit error when trying to use assembler syntax directives in `asm!`)
- rust-lang#82434 (Add more links between hash and btree collections)
- rust-lang#83080 (Make source-based code coverage compatible with MIR inlining)
- rust-lang#83168 (Extend `proc_macro_back_compat` lint to `procedural-masquerade`)
- rust-lang#83192 (ci/docker: Add SDK/NDK level 21 to android docker for 32bit platforms)
- rust-lang#83204 (Simplify C compilation for Fortanix-SGX target)
- rust-lang#83216 (Allow registering tool lints with `register_tool`)
- rust-lang#83223 (Display error details when a `mmap` call fails)
- rust-lang#83228 (Don't show HTML diff if tidy isn't installed for rustdoc tests)
- rust-lang#83231 (Switch riscvgc-unknown-none-elf use lp64d ABI)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
fn random_sorted_fill(mut seed: u32, buf: &mut [u32]) {
    let mask = if buf.len() < 8192 {
        0xFF
    } else if buf.len() < 200_000 {
Why 200000?
It's mostly there to avoid having too many duplicates, idk.
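To make the "not too many duplicates" reasoning concrete, here is a hedged sketch of how a length-dependent mask controls the duplicate ratio in a sorted benchmark input. Only the signature shape, the `0xFF` mask and the `8192` / `200_000` thresholds come from the diff fragment above. The other mask values, the xorshift generator, and the names `sorted_fill_sketch` and `unique_ratio` are assumptions made for illustration, not the PR's code:

```rust
/// Fills `buf` with masked pseudo-random values and sorts it, so that equal
/// values end up adjacent (the case `dedup` cares about).
/// `seed` must be non-zero, otherwise the xorshift generator stays at zero.
pub fn sorted_fill_sketch(mut seed: u32, buf: &mut [u32]) {
    // A narrower mask means fewer distinct values, hence more duplicates.
    let mask: u32 = if buf.len() < 8192 {
        0xFF
    } else if buf.len() < 200_000 {
        0xFFFF // assumed value for the middle tier
    } else {
        0xFFFF_FFFF // assumed value for the largest tier
    };

    for item in buf.iter_mut() {
        // One xorshift32 step (assumed generator, chosen for brevity).
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 5;
        *item = seed & mask;
    }

    buf.sort_unstable();
}

/// Fraction of elements that survive `dedup` for a buffer of `len` elements.
pub fn unique_ratio(len: usize) -> f64 {
    let mut buf = vec![0u32; len];
    sorted_fill_sketch(0x9E37_79B9, &mut buf);
    buf.dedup();
    buf.len() as f64 / len as f64
}
```

With a fixed 8-bit mask, a large buffer would collapse to at most 256 unique values, so the benchmark would mostly measure the "drop the duplicate" path. Widening the mask as the buffer grows keeps a mix of kept and removed elements.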
Now `Vec::dedup_by` drops items in place as it goes through them. In my benchmarks it is around 10% faster when `T` is small, with no major regression otherwise.

I used `ptr::copy` instead of a conditional `ptr::copy_nonoverlapping`, because the latter had some odd performance issues on my Ryzen laptop (it was 50% slower there than on an Intel Sandy Bridge laptop). It would be good if someone could reproduce these results.
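For readers who want to see the shape of the technique, here is a minimal sketch of an in-place dedup with a read index and a write index, written as a free function over `Vec<T>`. It follows the approach described above (drop duplicates as they are found, move kept items down with `ptr::copy`), but it is an illustration and not the code merged in this PR. The names `dedup_by_sketch` and `FillGapOnDrop` are chosen here for the example:

```rust
use std::{mem, ptr};

/// Guard that closes the gap between `write` and `read` if the comparator
/// or a destructor panics, so the vector never exposes already-dropped slots.
struct FillGapOnDrop<'a, T> {
    read: usize,  // index of the next element to inspect
    write: usize, // index where the next kept element goes
    vec: &'a mut Vec<T>,
}

impl<'a, T> Drop for FillGapOnDrop<'a, T> {
    fn drop(&mut self) {
        unsafe {
            let ptr = self.vec.as_mut_ptr();
            let len = self.vec.len();
            // Slide the not-yet-inspected tail down over the gap.
            // Source and destination may overlap, so this must be `ptr::copy`.
            ptr::copy(ptr.add(self.read), ptr.add(self.write), len - self.read);
            self.vec.set_len(len - (self.read - self.write));
        }
    }
}

/// Removes consecutive elements for which `same_bucket(current, previous_kept)`
/// returns true, dropping each removed element as soon as it is found.
pub fn dedup_by_sketch<T, F>(vec: &mut Vec<T>, mut same_bucket: F)
where
    F: FnMut(&mut T, &mut T) -> bool,
{
    let len = vec.len();
    if len <= 1 {
        return;
    }

    let mut gap = FillGapOnDrop { read: 1, write: 1, vec };
    let ptr = gap.vec.as_mut_ptr();

    unsafe {
        while gap.read < len {
            let read_ptr = ptr.add(gap.read);
            let prev_ptr = ptr.add(gap.write - 1);

            if same_bucket(&mut *read_ptr, &mut *prev_ptr) {
                // Advance first so a panicking destructor cannot lead to a double drop.
                gap.read += 1;
                ptr::drop_in_place(read_ptr);
            } else {
                // Unconditional move; `ptr::copy` is valid even when read == write.
                ptr::copy(read_ptr, ptr.add(gap.write), 1);
                gap.write += 1;
                gap.read += 1;
            }
        }

        // Everything in `write..len` has been dropped or moved out.
        gap.vec.set_len(gap.write);
        mem::forget(gap);
    }
}
```

For example, calling `dedup_by_sketch(&mut v, |a, b| a == b)` on `vec![1, 1, 2, 3, 3, 3, 4, 1, 1]` leaves `[1, 2, 3, 4, 1]`. On the choice of copy primitive: until the first duplicate is found, `read == write`, so source and destination are the same slot. `ptr::copy_nonoverlapping` requires the two regions not to overlap, which is why using it would need a `read != write` check, whereas `ptr::copy` can be called unconditionally.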