optimize str::from_utf8() validation when slice contains multibyte chars and str.chars().count() in all cases #88834
r? @yaahc (rust-highfive has picked a reviewer for you, use r? to override) |
perf run since since @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit ce90315d5ffbf771cea57bf5060eec2e8a5455bb with merge fa3e0445c3bd996c5883d0948e712f6eb91e4b38... |
it shows consistent improvements across several x86_64 feature levels

```
old, -O2, x86-64
test str::str_char_count_emoji       ... bench:       1,924 ns/iter (+/- 26)
test str::str_char_count_lorem       ... bench:         879 ns/iter (+/- 12)
test str::str_char_count_lorem_short ... bench:           5 ns/iter (+/- 0)

new, -O2, x86-64
test str::str_char_count_emoji       ... bench:       1,878 ns/iter (+/- 21)
test str::str_char_count_lorem       ... bench:         851 ns/iter (+/- 11)
test str::str_char_count_lorem_short ... bench:           4 ns/iter (+/- 0)

old, -O2, x86-64-v2
test str::str_char_count_emoji       ... bench:       1,477 ns/iter (+/- 46)
test str::str_char_count_lorem       ... bench:         675 ns/iter (+/- 15)
test str::str_char_count_lorem_short ... bench:           5 ns/iter (+/- 0)

new, -O2, x86-64-v2
test str::str_char_count_emoji       ... bench:       1,323 ns/iter (+/- 39)
test str::str_char_count_lorem       ... bench:         593 ns/iter (+/- 18)
test str::str_char_count_lorem_short ... bench:           4 ns/iter (+/- 0)

old, -O2, x86-64-v3
test str::str_char_count_emoji       ... bench:         748 ns/iter (+/- 7)
test str::str_char_count_lorem       ... bench:         348 ns/iter (+/- 2)
test str::str_char_count_lorem_short ... bench:           5 ns/iter (+/- 0)

new, -O2, x86-64-v3
test str::str_char_count_emoji       ... bench:         650 ns/iter (+/- 4)
test str::str_char_count_lorem       ... bench:         301 ns/iter (+/- 1)
test str::str_char_count_lorem_short ... bench:           5 ns/iter (+/- 0)
```
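For readers unfamiliar with what `str.chars().count()` has to compute: in valid UTF-8, the number of chars equals the number of bytes that are not continuation bytes (bit pattern `10xxxxxx`). The sketch below illustrates that idea only; it is not the code from this PR. The function names and the chunk size of 32 are assumptions chosen for illustration.

```rust
/// Returns true if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
fn is_continuation(b: u8) -> bool {
    (b & 0xC0) == 0x80
}

/// Counts chars by counting every byte that starts a code point.
fn count_chars_bytewise(s: &str) -> usize {
    s.as_bytes().iter().filter(|&&b| !is_continuation(b)).count()
}

/// Same result, but walks the input in fixed-size chunks with a small
/// per-chunk counter, a shape that compilers can often vectorize.
/// CHUNK = 32 is an illustrative assumption, not a value from the PR.
fn count_chars_chunked(s: &str) -> usize {
    const CHUNK: usize = 32;
    let bytes = s.as_bytes();
    let mut chunks = bytes.chunks_exact(CHUNK);
    let mut total = 0usize;
    for chunk in &mut chunks {
        // At most 32 leading bytes per chunk, so a u8 cannot overflow.
        let mut n = 0u8;
        for &b in chunk {
            n += !is_continuation(b) as u8;
        }
        total += n as usize;
    }
    // Handle the tail that did not fill a whole chunk.
    let tail = chunks.remainder();
    total + tail.iter().filter(|&&b| !is_continuation(b)).count()
}
```

Both functions should agree with `s.chars().count()` for any `&str`; whether the chunked form is actually faster depends on the target features and optimization level, as the numbers above suggest.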
…byte chars

```
old, -O2, x86-64
test str::str_validate_emoji ... bench:       4,606 ns/iter (+/- 64)

new, -O2, x86-64
test str::str_validate_emoji ... bench:       3,837 ns/iter (+/- 60)
```
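As a reminder of the API whose validation path is being measured, here is a minimal usage sketch of `str::from_utf8` on multibyte input. It only shows the observable behavior, not the optimized validation loop; the helper name `describe` is hypothetical.

```rust
use std::str;

/// Validates `bytes` as UTF-8 and reports either the char count or
/// how many leading bytes were valid before the first error.
/// `describe` is a hypothetical helper for illustration only.
fn describe(bytes: &[u8]) -> String {
    match str::from_utf8(bytes) {
        Ok(s) => format!("valid, {} chars", s.chars().count()),
        Err(e) => format!("invalid after {} bytes", e.valid_up_to()),
    }
}
```

For example, `describe("😀é".as_bytes())` yields `"valid, 2 chars"`, while `describe(b"ab\xFF")` yields `"invalid after 2 bytes"`.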
⌛ Trying commit 66195d8 with merge a80e5872cb4aecf1c759ad6e8ae0b9a3297fdb2f... |
☀️ Try build successful - checks-actions |
Queued a80e5872cb4aecf1c759ad6e8ae0b9a3297fdb2f with parent b69fe57, future comparison URL. |
Finished benchmarking commit (a80e5872cb4aecf1c759ad6e8ae0b9a3297fdb2f): comparison url. Summary: This change led to small relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never |
let's see if it was due to the extra function calls even though they should be inlined. @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 5e1428e with merge ddb82bd66bb17f60beb471f9c8b345a5e1130e56... |
☀️ Try build successful - checks-actions |
Queued ddb82bd66bb17f60beb471f9c8b345a5e1130e56 with parent 4e880f8, future comparison URL. |
Finished benchmarking commit (ddb82bd66bb17f60beb471f9c8b345a5e1130e56): comparison url. Summary: This change led to small relevant mixed results 🤷 in compiler performance.
Several of the improved and regressed benchmarks spend less/more time in |
I think the microbenchmark results seem clear here, and the rest seems likely to be noise. On balance, this seems likely to be a win. @bors r+ |
📌 Commit 5e1428e has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (175b8db): comparison url. Summary: This benchmark run did not return any relevant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression |
The change shows small but consistent improvements across several x86 target feature levels. I also tried to optimize counting with `slice.as_chunks`, but that yielded less consistent results: bigger improvements at some optimization levels, smaller ones at others.

And for the multibyte-char string validation: