Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorize basic_string::find_first_of #4744

Merged
merged 29 commits into from
Sep 4, 2024

Conversation

AlexGuteniev
Copy link
Contributor

@AlexGuteniev AlexGuteniev commented Jun 23, 2024

Resolves #4497

Benchmark main this
bm<AlgType::str_member, char>/2/3 5.52 ns 5.84 ns
bm<AlgType::str_member, char>/7/4 12.7 ns 12.7 ns
bm<AlgType::str_member, char>/9/3 13.1 ns 11.4 ns
bm<AlgType::str_member, char>/22/5 16.9 ns 11.9 ns
bm<AlgType::str_member, char>/58/2 30.9 ns 13.1 ns
bm<AlgType::str_member, char>/102/4 44.8 ns 15.6 ns
bm<AlgType::str_member, char>/325/1 101 ns 36.0 ns
bm<AlgType::str_member, char>/1011/11 269 ns 100 ns
bm<AlgType::str_member, char>/1502/23 400 ns 316 ns
bm<AlgType::str_member, char>/3056/7 763 ns 286 ns
bm<AlgType::str_member, wchar_t>/2/3 6.11 ns 14.2 ns
bm<AlgType::str_member, wchar_t>/7/4 12.2 ns 16.2 ns
bm<AlgType::str_member, wchar_t>/9/3 12.6 ns 14.6 ns
bm<AlgType::str_member, wchar_t>/22/5 17.3 ns 15.7 ns
bm<AlgType::str_member, wchar_t>/58/2 36.1 ns 20.0 ns
bm<AlgType::str_member, wchar_t>/102/4 55.7 ns 27.2 ns
bm<AlgType::str_member, wchar_t>/325/1 137 ns 66.9 ns
bm<AlgType::str_member, wchar_t>/1011/11 392 ns 400 ns
bm<AlgType::str_member, wchar_t>/1502/23 576 ns 587 ns
bm<AlgType::str_member, wchar_t>/3056/7 1142 ns 574 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/2/3 7.12 ns 15.9 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/7/4 24.3 ns 27.2 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/9/3 26.3 ns 14.6 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/22/5 63.3 ns 15.7 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/58/2 61.3 ns 19.5 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/102/4 134 ns 26.9 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/325/1 141 ns 66.2 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/1011/11 2953 ns 429 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/1502/23 9272 ns 908 ns
bm<AlgType::str_member, wchar_t, L'\x03B1'>/3056/7 5572 ns 560 ns

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner June 23, 2024 08:25
@StephanTLavavej StephanTLavavej self-assigned this Jun 24, 2024
@StephanTLavavej StephanTLavavej added the performance Must go faster label Jun 24, 2024
@StephanTLavavej StephanTLavavej changed the title Vectrorize basic_string::find_first_of Vectorize basic_string::find_first_of Jun 25, 2024
# Conflicts:
#	tests/std/tests/VSO_0000000_vector_algorithms/test.cpp
We were falling through to this.
Now that the "use the bitmap" and "couldn't use the bitmap" codepaths
always return, we can positively test for "use the bitmap".
This is a clarity improvement all by itself - it's easier
to understand "if we can do cool thing, do it".
The assignment `_Use_bitmap = false;` is guarded by `_Try_vectorize && (STUFF)`.

So when we skip using the bitmap, the later condition `if (_Try_vectorize)` is guaranteed to be active.
And silence warning C4127: conditional expression is constant.
I'm giving up the "bitmap" optimization for constexpr evaluation.

Now we can mark `const bool _Try_vectorize` without harming perf.
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
stl/inc/__msvc_string_view.hpp Outdated Show resolved Hide resolved
benchmarks/src/find_first_of.cpp Outdated Show resolved Hide resolved
tests/std/tests/VSO_0000000_vector_algorithms/test.cpp Outdated Show resolved Hide resolved
tests/std/tests/VSO_0000000_vector_algorithms/test.cpp Outdated Show resolved Hide resolved
@StephanTLavavej StephanTLavavej removed their assignment Aug 30, 2024
@AlexGuteniev
Copy link
Contributor Author

My experiment on having _Use_bitmap to always true or always false

Benchmark never bitmap always bitmap
bm<AlgType::str_member, char>/22/5 11.3 ns 17.5 ns
bm<AlgType::str_member, char>/58/2 12.6 ns 41.2 ns
bm<AlgType::str_member, char>/102/4 15.4 ns 54.8 ns
bm<AlgType::str_member, char>/325/1 35.7 ns 132 ns
bm<AlgType::str_member, char>/1011/11 96.8 ns 355 ns
bm<AlgType::str_member, char>/1502/23 320 ns 607 ns
bm<AlgType::str_member, char>/3056/7 282 ns 1026 ns
bm<AlgType::str_member, wchar_t>/22/5 12.5 ns 17.8 ns
bm<AlgType::str_member, wchar_t>/58/2 16.9 ns 31.5 ns
bm<AlgType::str_member, wchar_t>/102/4 23.4 ns 54.6 ns
bm<AlgType::str_member, wchar_t>/325/1 63.0 ns 134 ns
bm<AlgType::str_member, wchar_t>/1011/11 406 ns 376 ns
bm<AlgType::str_member, wchar_t>/1502/23 870 ns 553 ns
bm<AlgType::str_member, wchar_t>/3056/7 514 ns 1100 ns
bm<AlgType::str_member, char32_t>/22/5 10.7 ns 21.3 ns
bm<AlgType::str_member, char32_t>/58/2 9.36 ns 43.0 ns
bm<AlgType::str_member, char32_t>/102/4 16.7 ns 58.9 ns
bm<AlgType::str_member, char32_t>/325/1 18.8 ns 134 ns
bm<AlgType::str_member, char32_t>/1011/11 338 ns 385 ns
bm<AlgType::str_member, char32_t>/1502/23 1015 ns 639 ns
bm<AlgType::str_member, char32_t>/3056/7 541 ns 1109 ns

@AlexGuteniev

This comment was marked as resolved.

@AlexGuteniev AlexGuteniev marked this pull request as draft September 3, 2024 04:29
@frederick-vs-ja
Copy link
Contributor

frederick-vs-ja commented Sep 3, 2024

I had the assumption that _Special is the case where we definitely have one of the standard character types. But apparently char_traits can be specialized for custom character types, so _Special may be true for something not behaving as a character.

It seems that we should fix _Special for all _Traits_meow algorithms. Perhaps in another PR?
(Also, it's possibly better to make _Special not a template parameter, which would shorten mangled names.)

The strategy may be like this:

template <class>
constexpr bool _Is_std_char_traits_specialization = false;
template <class _CharT>
constexpr bool _Is_std_char_traits_specialization<char_traits<_CharT>> = _Is_EcharT<_CharT>;

@AlexGuteniev
Copy link
Contributor Author

AlexGuteniev commented Sep 3, 2024

It seems that we should fix _Special for all _Traits_meow algorithms. Perhaps in another PR?

Yes, I think I can do this some time later (If you or someone else won't do it before me).
I'd prefer keeping the current PR draft, until it can be merged with main containing the changes of that PR.
This way basic_string won't become more broken for odd character types.

(Also, it's possibly better to make _Special not a template parameter, which would shorten mangled names.)

Good idea.

The strategy may be like this:

Also agree.

@AlexGuteniev AlexGuteniev marked this pull request as ready for review September 3, 2024 17:43
@AlexGuteniev
Copy link
Contributor Author

I'm marking this PR as ready again.

Turns out that basic_string::find_meow_of rejects unicorns here:

static_assert(is_unsigned_v<_Elem>,
"Standard char_traits is only provided for char, wchar_t, char16_t, and char32_t. See N4950 [char.traits]. "
"Visual C++ accepts other unsigned integral types as an extension.");

Since it is compile-time error already, vectorization doesn't make this case any more broken.

@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit 77b31f7 into microsoft:main Sep 4, 2024
39 checks passed
@StephanTLavavej
Copy link
Member

🧶 🐈‍⬛ 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Further optimize find_first_of
4 participants