Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267

Daniel-Svensson · 2023-04-24T18:22:50Z

A performance regression (dotnet/perf-autofiling-issues#15719 ) was detected when running with mono interpreter which might be related to #73336

This is a simple follow up which disables the fast path for encoding introduced in #73336 if SIMD hardware acceleration is not availible.

Some measurements on Zen 3 hardware (with SSE disabled) shows that

for 32bit mode the simple loop is faster even above 100 characters
for 64bit mode the simple loop is faster up to 50 characters so it should have a cutoff at ~50

This version just disables the new path when hardware acceleration is missing, but I have prepared an alternative version which is smarter in 64bit mode when simd is missing in case you prefer that version instead.

I assumed the simples possible code was preferred since I believe hardware acceleration is available on the majority of hardware.

stephentoub · 2023-05-15T17:05:19Z

src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlStreamNodeWriter.cs

@@ -361,7 +362,7 @@ protected unsafe int UnsafeGetUTF8Chars(char* chars, int charCount, byte[] buffe
                fixed (byte* _bytes = &buffer[offset])
                {
                    // Fast path for small strings, use Encoding.GetBytes for larger strings since it is faster when vectorization is possible
-                    if ((uint)charCount < 32)
+                    if (!Vector128.IsHardwareAccelerated || (uint)charCount < 32)


Where did the "32" come from? Vector128<ushort>.Count is 8 and Vector256<ushort>.Count is 16, so this is either 2x a 256-bit vector or 4x a 128-bit vector. Should it be if (!Vector128.IsHardwareAccelerated || charCount < Vector128<ushort>.Count?

The number is based on local benchmarks, it was originally 25 but was rounded up 32 after one of the datacontract microbenchmarks had a slight regression for me.

I hope to decrease the limit further after some more performance improvements to the ascii encoding.

It currently happens to be 2× sizeof(Vector128) (same as current limit of for hardware instrincts)

StephenMolloy · 2023-05-22T23:21:48Z

The change looks fine to me. Generally speaking, as expressed in the conversation about #73336, these nitty-gritty improvements really should go to the Encoding libraries. Ideally, the serializer wouldn't even have it's own optimized implementations. However, since this is restoring an old path (in a less-common case) to match the behavior seen prior to #73336 in 7.0 and earlier, I feel fine taking the fine-tuning here.

StephenMolloy · 2023-05-23T02:56:34Z

/azp run runtime

azure-pipelines · 2023-05-23T02:56:39Z

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

azure-pipelines · 2023-05-23T03:00:58Z

Azure Pipelines successfully started running 1 pipeline(s).

StephenMolloy · 2023-05-23T18:51:22Z

Per this documentation I'm closing and re-opening this PR to get the license/cla requirement re-evaluated. :/

Always use simple while loop when simd is not supported

b74aa60

dotnet-issue-labeler bot added the area-Serialization label Apr 24, 2023

ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 24, 2023

Daniel-Svensson changed the title ~~Avoid calling Encoding.GetBytes when Vector128 is not hardware accellerated~~ Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated Apr 24, 2023

This was referenced Apr 24, 2023

[wasm] interpreter timeouts when WebSocket closes unexpectedly #84101

Closed

Various WASM timeouts on CI #85304

Closed

stephentoub reviewed May 15, 2023

View reviewed changes

StephenMolloy approved these changes May 23, 2023

View reviewed changes

StephenMolloy closed this May 23, 2023

StephenMolloy reopened this May 23, 2023

StephenMolloy merged commit 71ce8a9 into dotnet:main May 23, 2023

build-analysis bot mentioned this pull request May 23, 2023

AppHost.Bundle.Tests.BundleLocalizedApp.Bundled_Localized_App_Run_Succeeds failing #85993

Closed

ghost locked as resolved and limited conversation to collaborators Jun 22, 2023

Daniel-Svensson deleted the datacontract_cutoff branch September 30, 2023 11:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267

Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267

Daniel-Svensson commented Apr 24, 2023 •

edited

Loading

stephentoub May 15, 2023

Daniel-Svensson May 16, 2023

StephenMolloy commented May 22, 2023

StephenMolloy commented May 23, 2023 •

edited

Loading

azure-pipelines bot commented May 23, 2023

azure-pipelines bot commented May 23, 2023

StephenMolloy commented May 23, 2023

Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267

Avoid calling Encoding.GetBytes when Vector128 is not hardware accelerated #85267

Conversation

Daniel-Svensson commented Apr 24, 2023 • edited Loading

stephentoub May 15, 2023

Choose a reason for hiding this comment

Daniel-Svensson May 16, 2023

Choose a reason for hiding this comment

StephenMolloy commented May 22, 2023

StephenMolloy commented May 23, 2023 • edited Loading

azure-pipelines bot commented May 23, 2023

azure-pipelines bot commented May 23, 2023

StephenMolloy commented May 23, 2023

Daniel-Svensson commented Apr 24, 2023 •

edited

Loading

StephenMolloy commented May 23, 2023 •

edited

Loading