
[wasm] Major performance regression serializing and deserializing json #50260

Closed · lewing opened this issue Mar 25, 2021 · 45 comments

Labels: arch-wasm (WebAssembly architecture), area-System.Text.Encoding, tenet-performance (Performance related issue)
Milestone: 6.0.0

Comments
@lewing (Member) commented Mar 25, 2021

From 0f64b26...5b77db9 there was a large regression in the aspnetcore blazor json tests. That is a large span of commits because aspnetcore wasn't able to ingest the runtime for an extended period.

[benchmark chart]

cc @marek-safar @danmoseley

@lewing added the tenet-performance (Performance related issue) label Mar 25, 2021
@dotnet-issue-labeler bot added the area-System.Text.Json and untriaged (New issue has not been triaged by the area owner) labels Mar 25, 2021
@ghost commented Mar 25, 2021

Tagging subscribers to this area: @eiriktsarpalis, @layomia
See info in area-owners.md if you want to be subscribed.


@lewing added the arch-wasm (WebAssembly architecture) label Mar 25, 2021
@ghost commented Mar 25, 2021

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.


@lewing (Member, Author) commented Mar 25, 2021

There seem to be several changes around text processing in that set. Any thoughts, @GrabYourPitchforks?

@lewing changed the title from "[wasm] major performance regression serializing and deserializing json" to "[wasm] Major performance regression serializing and deserializing json" Mar 25, 2021
@lewing added this to the 6.0.0 milestone Mar 25, 2021
@GrabYourPitchforks (Member)

@lewing I'm not quite sure what tests you're referring to, but check out the benchmark file below.

https://github.com/GrabYourPitchforks/ConsoleApplicationBenchmark/blob/db476e4c64449ade3b9077c755d5aa513d8b7bb5/ConsoleAppBenchmark/TextEncoderRunner.cs

This benchmark specifically exercises the System.Text.Encodings.Web package, separate from System.Text.Json. If you see regressions in this benchmark file, that would be strong evidence that PR #49373 really is responsible for this.

Is it possible to run these benchmarks against the wasm runtime to validate this theory? (You can make the benchmarks run a bit faster by removing the "HTML" and "URL" benchmarks on line 22 of that file.)

The logic inside the S.T.E.W project - even for the non-intrinsicified paths - is tuned so that it emits optimized codegen. It's possible that these optimizations benefit the typical codegen scenario but harm wasm. If necessary, we could investigate creating special -wasm.cs versions of these files which contain code patterns better suited for wasm scenarios. Let me know what you think of that idea.
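For context, a minimal sketch of what an encoder-only micro-benchmark along these lines could look like; this is not the actual TextEncoderRunner.cs, and the payload and iteration count are invented:

```csharp
using System;
using System.Diagnostics;
using System.Text.Encodings.Web;

// Minimal sketch: time the System.Text.Encodings.Web escaping path on its own,
// independent of System.Text.Json. Payload contents and loop count are illustrative.
class EncoderBenchSketch
{
    static void Main()
    {
        string payload = new string('a', 16 * 1024); // ASCII-only input, nothing to escape
        JavaScriptEncoder encoder = JavaScriptEncoder.Default;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 1_000; i++)
        {
            encoder.Encode(payload);
        }
        sw.Stop();
        Console.WriteLine($"JavaScriptEncoder.Encode x1000: {sw.ElapsedMilliseconds} ms");
    }
}
```

If a loop like this regresses between runtime builds on wasm while JSON-free code stays flat, that points at the encoder package rather than the serializer itself.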

@GrabYourPitchforks (Member)

I should also mention: System.Text.Encodings.Web is not involved with deserialization, only with serialization. If you're seeing regressions in deserialization-only scenarios, that would require pulling in the JSON crew.

@lewing (Member, Author) commented Mar 26, 2021

The regression appears to be symmetric, so that is a good indication that it is not your change. Someone from @dotnet/aspnet-blazor-eng can probably point to the specific code under test better than I can.

@steveharter (Member) commented Mar 29, 2021

@lewing do you know what caused the smaller regression around March 21st that went away? If it was a fluke, I wonder if the current regression could be one as well.

@pranavkm (Contributor)

@steveharter we ran the benchmarks a few times and the regression was consistent, so it seems unlikely to be transient.

@lewing (Member, Author) commented Mar 29, 2021

Yeah, that was one of my first questions too, so we verified. On top of that, several more normal benchmark runs have completed and the regression has persisted.

@steveharter (Member) commented Mar 29, 2021

> Yeah, that was one of my first questions too, so we verified. On top of that, several more normal benchmark runs have completed and the regression has persisted.

OK.

@GrabYourPitchforks:

> I should also mention: System.Text.Encodings.Web is not involved with deserialization, only with serialization. If you're seeing regressions in deserialization-only scenarios, that would require pulling in the JSON crew.

The property caching mechanism in the serializer attempts to pre-generate some JSON, which does call JavaScriptEncoder.FindFirstCharacterToEncodeUtf8(); that method was previously heavily optimized for SSE2 and had short-circuits for smaller lengths. This is done even when deserializing (one can argue it should be deferred, but that's a separate topic).

So it is possible that #49373 is responsible, although since the results are cached by the serializer, that would seem unlikely unless (see the sketch after this list):

  • the benchmarks measure start-up time (first deserialize);
  • the new SSSE3 startup cost is very high; or
  • the Blazor interpreter intrinsics were changed (perhaps they supported SSE2 but not SSSE3?).
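As a rough illustration of the call in question: the serializer caches each property's pre-escaped name, so FindFirstCharacterToEncodeUtf8 runs over a handful of short, usually ASCII strings. The property names below are invented, and this is not the serializer's actual code.

```csharp
using System;
using System.Text;
using System.Text.Encodings.Web;

// Sketch: the kind of per-property-name scan the serializer's metadata cache performs once.
// A return value of -1 means nothing needs escaping, so the pre-encoded name can be cached and reused.
class PropertyNameEscapeSketch
{
    static void Main()
    {
        string[] propertyNames = { "id", "firstName", "lastName", "emailAddress" }; // hypothetical POCO properties

        foreach (string name in propertyNames)
        {
            byte[] utf8Name = Encoding.UTF8.GetBytes(name);
            int firstToEscape = JavaScriptEncoder.Default.FindFirstCharacterToEncodeUtf8(utf8Name);
            Console.WriteLine($"{name}: first index to escape = {firstToEscape}");
        }
    }
}
```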

@GrabYourPitchforks (Member)

@lewing can we get a pointer to the benchmark code? That would clear up any confusion and also allow us to determine whether it's following System.Text.Json documented best practice.

@GrabYourPitchforks (Member)

Also re-upping the ping to @dotnet/aspnet-blazor-eng so that we can get a pointer to the code.

@captainsafia (Member)

Some quick benchmarking on this reveals that the biggest regressions are in deserialization, with essentially no change in serialization.

Deserialization

| iteration | 6.0.100-preview.2.21158.2 (ms) | 6.0.100-preview.3.21202.5 (ms) |
| --- | --- | --- |
| Deserialize 0 | 35.755 | 43.25 |
| Deserialize 1 | 1 | 3.25 |
| Deserialize 2 | 0.8749 | 3.375 |
| Deserialize 3 | 0.9599 | 3.04 |
| Deserialize 4 | 0.83 | 3 |
| Deserialize 5 | 0.92 | 3 |
| Deserialize 6 | 0.88 | 3.165 |
| Deserialize 7 | 0.79 | 3.04 |
| Deserialize 8 | 0.835 | 3 |
| Deserialize 9 | 0.915 | 3 |
| Deserialize 10 | 0.79 | 3.085 |

Serialization

| iteration | 6.0.100-preview.2.21158.2 (ms) | 6.0.100-preview.3.21202.5 (ms) |
| --- | --- | --- |
| Serialize 0 | 11.455 | 11.8299 |
| Serialize 1 | 1.71 | 2.085 |
| Serialize 2 | 1.545 | 1.875 |
| Serialize 3 | 1.545 | 1.875 |
| Serialize 4 | 1.7499 | 2.085 |
| Serialize 5 | 1.795 | 1.96 |
| Serialize 6 | 1.875 | 2.04 |
| Serialize 7 | 1.755 | 1.9199 |
| Serialize 8 | 1.585 | 2.125 |
| Serialize 9 | 1.625 | 1.8749 |
| Serialize 10 | 1.545 | 1.8299 |

Test repo is https://github.com/captainsafia/TestJsonRegression. The benchmarking code was implemented in Program.Main of a WASM app to reduce the impact of any external factors in Blazor.
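The linked repo contains the real harness; as a rough sketch of the shape of such a Program.Main measurement loop (the Person shape, payload size, and iteration count here are placeholders, not the actual test code):

```csharp
using System;
using System.Diagnostics;
using System.Text.Json;

// Rough sketch of a Program.Main-style measurement loop for a wasm console/Blazor app.
// The Person type, payload size, and iteration count are invented for illustration.
public class Person
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var people = new Person[1000];
        for (int i = 0; i < people.Length; i++)
            people[i] = new Person { Name = "Name" + i, Age = i % 100 };

        for (int iteration = 0; iteration <= 10; iteration++)
        {
            var sw = Stopwatch.StartNew();
            string json = JsonSerializer.Serialize(people);
            sw.Stop();
            Console.WriteLine($"Serialize {iteration}: {sw.Elapsed.TotalMilliseconds} ms");

            sw.Restart();
            JsonSerializer.Deserialize<Person[]>(json);
            sw.Stop();
            Console.WriteLine($"Deserialize {iteration}: {sw.Elapsed.TotalMilliseconds} ms");
        }
    }
}
```

Iteration 0 includes first-use costs such as building the serializer's metadata cache, which is why the "Serialize 0"/"Deserialize 0" rows above are an order of magnitude slower than the rest.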

@GrabYourPitchforks (Member)

Thanks @captainsafia for these numbers. I'll leave this pathed in System.Text.Json for now.

@vargaz (Contributor) commented Apr 8, 2021

This is caused by 197a28d.

@danmoseley (Member)

#49372

@GrabYourPitchforks (Member)

@pranavkm @captainsafia @lewing If I provided a patch file for dotnet/runtime with a proposed workaround, would you be able to apply the patch, rebuild the wasm-specific System.Private.CoreLib.dll, drop it into your benchmark app, and see if it resolves your perf issue? Keep in mind you'll also want to compare the before/after of the size on disk of a trimmed Blazor app.

@vargaz (Contributor) commented Apr 9, 2021

I tested with a console app and reverting that commit does fix the issue, at least for deserialization.

@GrabYourPitchforks (Member)

Try cherry-picking commit GrabYourPitchforks@d7eae3b (patch file here) and see if it resolves the issue. This re-introduces the fast path in the SizeOpt files before entering the main workhorse loop. I'm hoping it won't increase size on disk significantly, but I don't have a good way of testing that right now.
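For readers following along, the general shape of such a change (a cheap scalar check that handles the common all-ASCII case before falling into the general transcoding loop) is sketched below; this is a simplification, not the actual CoreLib patch.

```csharp
using System;

// Simplified illustration of an "ASCII fast path before the workhorse loop".
// The real Utf8Utility code in dotnet/runtime is considerably more involved.
static class Utf8FastPathSketch
{
    public static int CountAsciiPrefix(ReadOnlySpan<byte> utf8)
    {
        int i = 0;

        // Fast path: walk bytes while they are plain ASCII (< 0x80). On the wasm
        // interpreter this simple loop is far cheaper than the general decoder,
        // and most JSON payloads (property names, numbers, plain strings) are ASCII.
        while (i < utf8.Length && utf8[i] < 0x80)
            i++;

        // Whatever remains (i < utf8.Length) would be handed off to the slower,
        // fully general multi-byte decoding loop.
        return i;
    }
}
```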

@lewing (Member, Author) commented Apr 14, 2021

@radekdoulik can you follow up on #50260 (comment) and get us some numbers?

@radekdoulik (Member) commented Apr 15, 2021

I measured the current HEAD and the patched version of a modified browser sample in Chrome. Later I added Preview 2 (no AOT, as it was crashing) and measurements with 197a28d reverted.

|                   | HEAD    | HEAD/AOT | patched | patched/AOT | Preview 2 | Reverted | Reverted/AOT |
| ----------------- | ------- | -------- | ------- | ----------- | --------- | -------- | ------------ |
| serialize small   | 0.82ms  | 0.537ms  | 0.257ms | 0.198ms     | 0.276ms   | 0.266ms  | 0.192ms      |
| serialize large   | 282.4ms | 170.5ms  | 74.3ms  | 51ms        | 70ms      | 76.8ms   | 49.7ms       |
| deserialize small | 1.09ms  | 0.568ms  | 0.428ms | 0.265ms     | 0.454ms   | 0.424ms  | 0.245ms      |
| deserialize large | 337.2ms | 199.1ms  | 115.7ms | 67.6ms      | 121.9ms   | 112.5ms  | 61.3ms       |
| dotnet.wasm size  | 8831403 |          | 8833528 |             |           | 8839630  |              |

It uses the Blazor benchmark code from Json.razor and Person.cs. The small (de)serialization is measured over a batch of 1000 calls and the large over 10 calls. I ran it multiple times and used the fastest measurement.

P2 was built on Mac and then run on the same machine/Chrome as the other sample builds.

Observation: the patched version is close to the reverted version and still reduces the size of the dotnet.wasm file. Both are close to the Preview 2 times.

@radekdoulik (Member) commented Apr 15, 2021

I have created draft PR #51310 to check the GrabYourPitchforks@d7eae3b change on CI.

@radekdoulik (Member)

@pavelsavara how do the GrabYourPitchforks@d7eae3b change and the times look to you?

@jkotas (Member) commented Apr 15, 2021

@pavelsavara Is the performance for payloads that contain non-ASCII characters (e.g. Czech strings with diacritics) important for you? (#51310 (comment))

@pavelsavara (Member) commented Apr 15, 2021

@jkotas I'm a newbie on the Mono team, so I don't really know. I did the original PR as part of my learning, based on GrabYourPitchforks' guidance and motivated by the size reduction. I don't have a specific performance goal in mind.
My opinion is that a 6Kb smaller uncompressed download is worth the slower non-ASCII perf on wasm.

Also, I think @marek-safar commented that we could make it a (link-time?) feature flag that could be exposed to developers, so they can make the choice:
https://github.com/dotnet/runtime/blob/main/docs/workflow/trimming/feature-switches.md
Is that worth the effort?

@radekdoulik LGTM, unless we'd rather do the all-or-nothing feature flag.
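For readers unfamiliar with the mechanism, a feature switch of the kind referenced above is usually a static bool property that library code branches on and that the trimmer can hard-code via an ILLink substitution. A rough, hypothetical sketch follows; the switch name is invented, not an actual runtime switch:

```csharp
using System;

// Hypothetical feature switch following the pattern described in feature-switches.md.
// The switch name "System.Text.Unicode.UseSizeOptimizedUtf8" is invented for illustration.
internal static class Utf8TranscodingFeature
{
    internal static bool UseSizeOptimizedUtf8 { get; } =
        AppContext.TryGetSwitch("System.Text.Unicode.UseSizeOptimizedUtf8", out bool enabled) && enabled;
}

internal static class Utf8TranscoderSketch
{
    internal static int Transcode(ReadOnlySpan<byte> utf8)
    {
        // At publish time the trimmer can substitute the property with a constant,
        // so the untaken branch (and the code it references) is removed entirely.
        return Utf8TranscodingFeature.UseSizeOptimizedUtf8
            ? TranscodeCompact(utf8)      // smaller, slower loop
            : TranscodeVectorized(utf8);  // larger, faster loop
    }

    private static int TranscodeCompact(ReadOnlySpan<byte> utf8) => utf8.Length;    // placeholder body
    private static int TranscodeVectorized(ReadOnlySpan<byte> utf8) => utf8.Length; // placeholder body
}
```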

@jkotas (Member) commented Apr 15, 2021

> My opinion is that a 6Kb smaller uncompressed download is worth the slower non-ASCII perf on wasm.

We should get a better idea about the performance impact for non-ASCII payloads.

The majority of the world uses non-ASCII characters. Native strings that use the user's local alphabet are likely to appear in JSON payloads.

> we could make it a (link-time?) feature flag that could be exposed to developers, so they can make the choice

I do not think the linker switch is worth it for this case. It may be worth it for other cases where the difference is more significant.

@radekdoulik (Member)

I think the non-ASCII will remain slow. I will try to confirm it with measurements.

> I do not think the linker switch is worth it for this case. It may be worth it for other cases where the difference is more significant.

Maybe we can make this size optimization part of the InvariantCulture feature switch?

@pavelsavara (Member) commented Apr 15, 2021

There is a growing list of *.SizeOpt.cs files. Perhaps the linker switch could be common to all of them?

src\libraries\Common\src\System\Collections\Generic\LargeArrayBuilder.SizeOpt.cs
src\libraries\System.Linq\src\System\Linq\Enumerable.SizeOpt.cs
src\libraries\System.Linq\src\System\Linq\Skip.SizeOpt.cs
src\libraries\System.Linq\src\System\Linq\Take.SizeOpt.cs
src\libraries\System.Private.CoreLib\src\System\Text\Unicode\Utf8Utility.SizeOpt.cs

@radekdoulik (Member)

Non-ASCII measurements: I added some non-ASCII characters to the strings in the Person class.

|                   | patched | Preview 2 | HEAD     |
| ----------------- | ------- | --------- | -------- |
| serialize small   | 0.629ms | 0.554ms   | 2.054ms  |
| serialize large   | 272.2ms | 222.8ms   | 1107.2ms |
| deserialize small | 1.241ms | 1.022ms   | 2.145ms  |
| deserialize large | 613.3ms | 457ms     | 1145.2ms |

I also wanted to measure just text, so I added a measurement of [de]serialization of a string containing 56777 chars (repeating pangrams in the Czech language).

|                  | patched | Preview 2 | HEAD    |
| ---------------- | ------- | --------- | ------- |
| serialize text   | 9.17ms  | 7.03ms    | 45.83ms |
| deserialize text | 27.23ms | 12.58ms   | 51.34ms |
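As a rough idea of how such a text-only measurement can be constructed (the pangram, target length, and timing loop below are illustrative, not the exact test code):

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.Json;

// Sketch: build a large non-ASCII string from a repeated Czech pangram,
// then time serializing and deserializing just that string.
class NonAsciiTextSketch
{
    static void Main()
    {
        const string pangram = "Příliš žluťoučký kůň úpěl ďábelské ódy. ";
        var sb = new StringBuilder();
        while (sb.Length < 56_000)            // target length is illustrative
            sb.Append(pangram);
        string text = sb.ToString();

        var sw = Stopwatch.StartNew();
        string json = JsonSerializer.Serialize(text);
        sw.Stop();
        Console.WriteLine($"serialize text: {sw.Elapsed.TotalMilliseconds} ms");

        sw.Restart();
        JsonSerializer.Deserialize<string>(json);
        sw.Stop();
        Console.WriteLine($"deserialize text: {sw.Elapsed.TotalMilliseconds} ms");
    }
}
```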

@jkotas (Member) commented Apr 16, 2021

This data tells me that we should revert the size optimization change for the UTF8 encoding and take the size hit.

radekdoulik added a commit to radekdoulik/runtime that referenced this issue Apr 16, 2021
…nterpreted and low-footprint scenarios (dotnet#49372)"

This reverts commit 197a28d.

Context: dotnet#50260
@radekdoulik (Member)

> This data tells me that we should revert the size optimization change for the UTF8 encoding and take the size hit.

Yes, the perf hit is quite high. I have opened a PR to revert it.

@GrabYourPitchforks (Member) commented Apr 16, 2021

Before we revert the change, should we try creating a patch which inlines the decode methods, as mentioned in another comment? Still trying to figure out if we can get a good balance of size savings & acceptable perf. But if this is bordering on contorting the code, so be it.

Context: #51310 (comment)

jkotas pushed a commit that referenced this issue Apr 16, 2021
…nterpreted and low-footprint scenarios (#49372)" (#51379)

This reverts commit 197a28d.

Context: #50260
@lewing (Member, Author) commented Apr 16, 2021

> Before we revert the change, should we try creating a patch which inlines the decode methods, as mentioned in another comment? Still trying to figure out if we can get a good balance of size savings & acceptable perf. But if this is bordering on contorting the code, so be it.
>
> Context: #51310 (comment)

It's reverted for now so we don't get a ton of reports about a known regression. We can still explore the options.

@GrabYourPitchforks (Member)

> It's reverted for now so we don't get a ton of reports about a known regression. We can still explore the options.

Is anybody on point for that exploration? I assume Radek, since the issue is assigned to him?

@jkotas (Member) commented Apr 17, 2021

I think investigating this encoding size optimization is in the area of diminishing returns. It would be more useful to investigate whether it is possible to make the Linq size vs. speed optimization configurable via trimming feature flag.

@lewing (Member, Author) commented Apr 17, 2021

> I think investigating this encoding size optimization is in the area of diminishing returns. It would be more useful to investigate whether it is possible to make the Linq size vs. speed optimization configurable via trimming feature flag.

Agreed; we've had customer reports of the Linq performance regression too, and the size difference is larger.

@lewing (Member, Author) commented Apr 17, 2021

> > It's reverted for now so we don't get a ton of reports about a known regression. We can still explore the options.
>
> Is anybody on point for that exploration? I assume Radek, since the issue is assigned to him?

On point in what sense?

@GrabYourPitchforks (Member)

> On point in what sense?

"We should explore..."

Since this issue wasn't closed, I interpreted this as a request for somebody to perform this work. But it wasn't apparent to me who this work was being requested of.

@pranavkm (Contributor)

Has a fix for this landed in main? Our most recent benchmark runs indicate that the regression has been mitigated.

[benchmark chart]

@benaadams (Member)

It was reverted in #51379.

@steveharter (Member)

Can this be closed?

@pranavkm (Contributor)

Yup. It's been behaving itself:

[benchmark chart]

@ghost locked as resolved and limited conversation to collaborators May 28, 2021