-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267
Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267
Conversation
@@ -361,7 +362,7 @@ protected unsafe int UnsafeGetUTF8Chars(char* chars, int charCount, byte[] buffe | |||
fixed (byte* _bytes = &buffer[offset]) | |||
{ | |||
// Fast path for small strings, use Encoding.GetBytes for larger strings since it is faster when vectorization is possible | |||
if ((uint)charCount < 32) | |||
if (!Vector128.IsHardwareAccelerated || (uint)charCount < 32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did the "32" come from? Vector128<ushort>.Count
is 8 and Vector256<ushort>.Count
is 16, so this is either 2x a 256-bit vector or 4x a 128-bit vector. Should it be if (!Vector128.IsHardwareAccelerated || charCount < Vector128<ushort>.Count
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number is based on local benchmarks, it was originally 25 but was rounded up 32 after one of the datacontract microbenchmarks had a slight regression for me.
I hope to decrease the limit further after some more performance improvements to the ascii encoding.
It currently happens to be 2× sizeof(Vector128) (same as current limit of for hardware instrincts)
The change looks fine to me. Generally speaking, as expressed in the conversation about #73336, these nitty-gritty improvements really should go to the Encoding libraries. Ideally, the serializer wouldn't even have it's own optimized implementations. However, since this is restoring an old path (in a less-common case) to match the behavior seen prior to #73336 in 7.0 and earlier, I feel fine taking the fine-tuning here. |
/azp run runtime |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
Azure Pipelines successfully started running 1 pipeline(s). |
Per this documentation I'm closing and re-opening this PR to get the license/cla requirement re-evaluated. :/ |
A performance regression (dotnet/perf-autofiling-issues#15719 ) was detected when running with mono interpreter which might be related to #73336
This is a simple follow up which disables the fast path for encoding introduced in #73336 if SIMD hardware acceleration is not availible.
Some measurements on Zen 3 hardware (with SSE disabled) shows that
This version just disables the new path when hardware acceleration is missing, but I have prepared an alternative version which is smarter in 64bit mode when simd is missing in case you prefer that version instead.
I assumed the simples possible code was preferred since I believe hardware acceleration is available on the majority of hardware.