Edit: @rynowak hijacking top post for great justice
Summary
We need to develop infrastructure and a plan for testing Server-Side Blazor's performance and capacity (scalability) to:
* Find and fix performance problems
* Find and fix reliability problems
* Provide guidance to users for capacity planning
Workload
We should use a port of the Blazing Pizza app to drive our testing. This is a realistic sample of a small app, and it is a full end-to-end app (it includes data access and some background work on the server). There's a good selection of features represented here, and we have to crawl before we can run.
Scenarios
Run these in our perf lab.
Performance
Run Blazing Pizza on the VM and spin up enough clients to saturate the CPU or thrash the memory. Figure out which resource is scarcest, then optimize and repeat.
I propose we automate a canonical scenario, like placing a pizza order, and then measure render operations per second for that fixed test script (see the measurement sketch below).
Create a checked-in automated performance test in our perf lab that reports render operations per second.
Analyze results and log issues for improvement within a time-boxed period. As with all perf work this is open-ended, so the size of this item is flexible.
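To make that concrete, here's a minimal sketch of what the measurement loop could look like. `RunPizzaOrderScenarioAsync` is a placeholder for whatever scripted scenario the real headless client ends up exposing, and the client/iteration counts are arbitrary:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class PerfHarness
{
    // Placeholder: in the real harness this would drive one scripted
    // "place a pizza order" scenario and return the number of render
    // operations the client observed for it.
    static Task<int> RunPizzaOrderScenarioAsync(int clientId) => Task.FromResult(12);

    static async Task Main()
    {
        const int clients = 100;             // arbitrary: enough load to saturate the server
        const int iterationsPerClient = 50;  // arbitrary

        var stopwatch = Stopwatch.StartNew();
        var totals = await Task.WhenAll(Enumerable.Range(0, clients).Select(async id =>
        {
            var renders = 0;
            for (var i = 0; i < iterationsPerClient; i++)
            {
                renders += await RunPizzaOrderScenarioAsync(id);
            }
            return renders;
        }));
        stopwatch.Stop();

        var totalRenders = totals.Sum();
        Console.WriteLine($"{totalRenders / stopwatch.Elapsed.TotalSeconds:F1} render ops/sec " +
                          $"({clients} clients, {totalRenders} renders total)");
    }
}
```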
Capacity and Reliability
Run Blazing Pizza on the VM and spin up a bunch of clients, but have them run the test script slowly (with pauses), say one operation per second. The goal is to simulate a population of users actually using the web site, to get a realistic estimate of user count and of which resources you run out of first (see the paced-client sketch below).
Determine a baseline number for capacity in this scenario. How many clients using our standard script will exhaust a resource on the server?
Analyze results and log issues for improvement within a time-boxed period. Depending on the results of investigating the capacity, the priority will vary.
Create a checked-in automated reliability test in our perf lab that can run for a long period of time. The number of clients used for this should put the server at 75% capacity.
Analyze results for memory growth and reliability issues. The bar for this category is stricter: we need to address all issues that cause memory growth or reliability problems.
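A minimal sketch of the paced client loop, assuming one operation per second per simulated user. `PerformNextOperationAsync`, the client count, and the soak duration are all placeholders to tune against the real server:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CapacityClient
{
    // Placeholder for one scripted user action (click, form post, etc.).
    static Task PerformNextOperationAsync(int clientId) => Task.CompletedTask;

    // Each simulated user performs roughly one operation per second, so
    // server load scales with client count rather than with how fast a
    // tight loop can spin.
    static async Task RunPacedClientAsync(int clientId, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            await PerformNextOperationAsync(clientId);
            await Task.Delay(TimeSpan.FromSeconds(1), token);
        }
    }

    static async Task Main()
    {
        var cts = new CancellationTokenSource(TimeSpan.FromHours(8)); // long soak window
        var clients = new Task[500]; // tune until the server sits at ~75% capacity
        for (var i = 0; i < clients.Length; i++)
        {
            clients[i] = RunPacedClientAsync(i, cts.Token);
        }
        try { await Task.WhenAll(clients); }
        catch (OperationCanceledException) { } // soak window elapsed
    }
}
```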
Security
In support of capacity and reliability, we have to understand how Blazor behaves when a client performs malicious actions against the server. To do this, here are some baselines we need to understand:
* [ ] Decide if server-side Blazor needs limits on how many events can be delivered and the max size for events #12003 - What happens when client events are raised faster than the server can process them? What is a reasonable limit to queued client events?
* [ ] [Blazor server-side] Limit the amount of queued pending renders #11964 - The server will queue up renders until the client acks. What is a reasonable limit here? What does it look like once we exhaust the queue? Similarly, what if the client disconnects but we keep accumulating renders? (See the bounded-queue sketch below.)
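To make the render-queue question concrete, here's a sketch (not Blazor's actual implementation) of what a bounded pending-render queue could look like, using `System.Threading.Channels`. The capacity and the full-queue policy are exactly the knobs these issues need to decide:

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;

class PendingRenderQueue
{
    private readonly Channel<byte[]> _pendingBatches;

    public PendingRenderQueue(int maxQueuedRenders)
    {
        // Bounded channel: once maxQueuedRenders batches are waiting for a
        // client ack, writers must decide what to do rather than letting a
        // slow or disconnected client grow server memory without limit.
        _pendingBatches = Channel.CreateBounded<byte[]>(new BoundedChannelOptions(maxQueuedRenders)
        {
            SingleReader = true,
            SingleWriter = true,
        });
    }

    public bool TryQueueRenderBatch(byte[] batch)
    {
        // TryWrite fails when the queue is full; a real policy might instead
        // block the renderer, tear down the circuit, or coalesce batches.
        return _pendingBatches.Writer.TryWrite(batch);
    }

    // Called when the client acks, freeing room for the next batch.
    public ValueTask<byte[]> DequeueOnAckAsync() => _pendingBatches.Reader.ReadAsync();
}
```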
Techniques
TL;DR: we're writing a headless Blazor client in .NET.
There's an appealing low-investment strategy here where we use Selenium to automate headless browsers. This would be easy to accomplish because we're already using Selenium for E2E tests. However, it doesn't meet the goals: we'd cap out at 20-30 browsers per client machine and then face the difficult challenge of coordinating multiple client machines. Additionally, each operation in Selenium involves polling the DOM to see if it has changed. This makes any test executed in Selenium slow - which means that we probably cannot do meaningful performance testing with a small number of clients.
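For reference, that approach would look roughly like this; the URL and the `.pizza-card` selector are made up, and the `WebDriverWait` is the DOM polling mentioned above:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class SeleniumClient
{
    static void Main()
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless"); // no visible browser window

        using (var driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("https://localhost:5001"); // assumed local Blazing Pizza instance

            // WebDriverWait polls the DOM (every 500ms by default) until the
            // condition holds - this polling is what makes each scripted
            // operation slow relative to a protocol-level client.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.CssSelector(".pizza-card")).Count > 0);

            driver.FindElement(By.CssSelector(".pizza-card")).Click();
        }
    }
}
```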
Another approach, with slightly higher investment, would be to write a test client using Node.js and a DOM library in ts/js. This would be more involved than Selenium because we'd need to mock all of Blazor's interactions with the browser and DOM, but it would be faster and scale better than Selenium, and it is an appealing option.
There's a higher-investment strategy where we develop a custom test client for writing DOM-driven tests in .NET. This would require writing a SignalR client that's capable of doing a Blazor handshake and then simulating the interface between the server and the client over the SignalR connection. The test client would still need some representation of the DOM (for verification and synchronization) and of the set of current event handlers (for interaction). The biggest advantage here is that we're not using any of the Blazor client-side code - we can simulate a hostile client in many ways (see the sketch below).
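A rough sketch of the connection layer for such a client, using the standard `Microsoft.AspNetCore.SignalR.Client` package. Note that `_blazor` and `JS.RenderBatch` refer to Blazor's internal, version-specific protocol, so treat those names as illustrative rather than a stable contract:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

class HeadlessBlazorClient
{
    static async Task Main()
    {
        // "/_blazor" is the server-side Blazor hub endpoint; everything past
        // this point is Blazor's internal protocol and may change by version.
        var connection = new HubConnectionBuilder()
            .WithUrl("https://localhost:5001/_blazor") // assumed local Blazing Pizza instance
            .Build();

        var renderBatches = 0;
        connection.On<int, byte[]>("JS.RenderBatch", (batchId, batchData) =>
        {
            // Instead of applying the batch to a real DOM, a headless client
            // updates its in-memory DOM representation - or, for a hostile
            // client test, deliberately never acks.
            Interlocked.Increment(ref renderBatches);
        });

        await connection.StartAsync();

        // In a real test script we'd now do the Blazor handshake and dispatch
        // events; here we just wait briefly and report what arrived.
        await Task.Delay(TimeSpan.FromSeconds(5));
        Console.WriteLine($"Connection state: {connection.State}, render batches received: {renderBatches}");
    }
}
```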
Of the choices, the last two (Node and .NET clients) both meet the requirements, but the .NET client approach will allow us to write more kinds of tests, so it seems more valuable.