Iterating with step_by(1) is much slower than without #57517
Comments
Godbolt link: https://rust.godbolt.org/z/scU_bY I've slightly changed the code to actually modify the array (wrapping_add returns the value -- it does not operate in-place). One bit that stands out especially to me is this:
I would have expected something (possibly jump threading?) to optimize away the
Yeah, I forgot to use the returned addition value. I've already changed that in my code too.
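To make the wrapping_add point concrete, here is a minimal sketch. The function name wrapping_add_demo is hypothetical and exists only for illustration. It shows that calling wrapping_add without using the result leaves the original value untouched, so a benchmark body written that way may be optimized differently from one that actually stores the result.

```rust
/// Hypothetical helper for illustration only.
/// Returns (value after a discarded call, value after assigning the result).
fn wrapping_add_demo() -> (u8, u8) {
    let mut x: u8 = 250;

    // wrapping_add takes `self` by value and returns the sum;
    // discarding the result leaves `x` unchanged.
    let _ = x.wrapping_add(10);
    let unchanged = x;

    // Assigning the result back is what actually updates `x`.
    // 250 + 10 = 260, which wraps to 4 for a u8.
    x = x.wrapping_add(10);

    (unchanged, x)
}
```
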
I wonder if this would be fixed with the change discussed in #52065. Also, try using
I don't think #64121 fixes this issue, as it affects internal iteration only, while this one is about external iteration.
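For readers unfamiliar with the distinction, the following is a rough sketch of the two iteration styles. The function names sum_external and sum_internal are made up for this example and do not come from the issue or the linked repository. A for loop drives the iterator from the outside by calling next() repeatedly, while for_each hands the loop body to the iterator, which lets an adapter such as StepBy supply a specialized implementation.

```rust
/// External iteration: the `for` loop calls `next()` on the StepBy adapter
/// once per element. (Hypothetical example function.)
fn sum_external(values: &[u32], step: usize) -> u32 {
    let mut total = 0u32;
    for x in values.iter().step_by(step) {
        total = total.wrapping_add(*x);
    }
    total
}

/// Internal iteration: `for_each` passes the closure to the iterator,
/// which runs the loop itself. (Hypothetical example function.)
fn sum_internal(values: &[u32], step: usize) -> u32 {
    let mut total = 0u32;
    values
        .iter()
        .step_by(step)
        .for_each(|x| total = total.wrapping_add(*x));
    total
}
```

Both functions compute the same result; whether they compile to the same machine code depends on the compiler version, which is the question raised in this thread.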
@nikic You're right, but now it's just a matter of "
There's another example of this coming up in this thread: https://www.reddit.com/r/learnrust/comments/glxa5r/rust_vs_cpp_speed_in_implementation/ It is a Rust vs C++ example where C++ takes 600-700ms and Rust takes 4300ms.
Looks like it's fixed on beta and nightly. It hasn't reached stable yet.
I believe step_by was one of the iterators with improved optimization in LLVM 16. I thought I added a codegen test for that, but apparently not. We should add a test for this, though probably not the one from the OP; something with simpler IR would be better.
Greetings!
I'd like to report a significant negative performance impact when calling step_by(1) on an iterator. I created a repository with a detailed description of the issue and benchmarks to reproduce it: https://github.com/mvlabat/step_by_one
These are the functions I tested:
Calling iter_step(vec![1; LARGE_ENOUGH], 1) runs 1.75x slower than iter_default_step(vec![1; LARGE_ENOUGH]) with const LARGE_ENOUGH: usize = 10_000_000;.
I'm running a MacBook Pro 2015 with an Intel i5-5257U CPU (2.70GHz).
My rustc version: 1.33.0-nightly (c2d381d39 2019-01-10).
These are the exact benchmark results I got:
In the repository README there are also links to the generated asm code of these two functions.
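The tested functions are not reproduced in this copy of the report. As a rough sketch of the kind of comparison being described, and not the code from the linked repository, the two functions might look something like the following. The names iter_step and iter_default_step come from the report; the element type u32, the loop bodies, and the use of wrapping_add(1) are assumptions made for illustration.

```rust
/// Assumed shape of the baseline: a plain range loop over the indices.
/// The body is an illustrative guess, not the original benchmark code.
pub fn iter_default_step(mut arr: Vec<u32>) -> Vec<u32> {
    for i in 0..arr.len() {
        // Store the result; wrapping_add does not modify in place.
        arr[i] = arr[i].wrapping_add(1);
    }
    arr
}

/// Assumed shape of the step_by variant: the same loop, but the range
/// is wrapped in `step_by(step)`. Note that `step_by(0)` panics.
pub fn iter_step(mut arr: Vec<u32>, step: usize) -> Vec<u32> {
    for i in (0..arr.len()).step_by(step) {
        arr[i] = arr[i].wrapping_add(1);
    }
    arr
}
```

With step set to 1 both functions visit every index and return identical vectors, so any timing difference between them comes from how the compiler handles the StepBy adapter rather than from the work performed.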