Bulk: Add TimerWheel to Bulk to improve latency #1640
Conversation
private readonly int congestionDecreaseFactor = 5;
private readonly int maxDegreeOfConcurrency;
private readonly TimerWheel timerWheel;
Can E2E performance tests be added to make it easy to detect regressions in this?
The issue here is where to put them. The Tools/Benchmark project is not the right place, and we don't have any E2E test projects.
We don't need to block on this. This can be done when the rest of the perf gates are done.
I think we need a new perf project and test project that are explicit contract validation tests.
* using timerwheel
* tests

Closing due to inactivity, please feel free to re-open.
Description
The current Bulk implementation is optimized for high throughput utilization, which is the ideal scenario for Bulk.
There are, however, scenarios where the volume of data is dynamic in nature and not always high compared with the provisioned RU/s. In these scenarios, where the volume of data is low relative to the provisioned throughput, Bulk leverages an internal timer to dispatch the batches that don't get filled.
This internal timer leverages a construct called TimerPool, which is an implementation of an ordered timer list. The TimerPool has a minimum dispatch time of 1 second, so the internal timers used for Bulk could dispatch half-filled batches only at 1-second intervals. The effect was that low data volume usage had 1-2 seconds of latency (see the performance baselines below).
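The dispatch behavior described above can be sketched as "flush when the batch fills, or when a timer fires, whichever comes first". The following is a minimal conceptual illustration in Python, not the SDK's C# code; `Batcher`, `BATCH_SIZE`, and `DISPATCH_TIMEOUT` are hypothetical names chosen for the sketch (the PR's batch size of 100 and the 1-second TimerPool minimum are from the description):

```python
import threading

BATCH_SIZE = 100        # internal batch size mentioned in the PR description
DISPATCH_TIMEOUT = 1.0  # seconds; the TimerPool minimum (the PR lowers this to 0.1)

class Batcher:
    """Conceptual sketch: accumulate items, dispatch a batch when it fills
    or when the dispatch timer elapses, whichever happens first."""

    def __init__(self, dispatch):
        self.dispatch = dispatch  # callback that receives a list of items
        self.items = []
        self.lock = threading.Lock()
        self.timer = None

    def add(self, item):
        with self.lock:
            if not self.items:
                # first item in a fresh batch starts the dispatch timer
                self.timer = threading.Timer(DISPATCH_TIMEOUT, self._on_timeout)
                self.timer.start()
            self.items.append(item)
            if len(self.items) >= BATCH_SIZE:
                self._flush_locked()

    def _on_timeout(self):
        with self.lock:
            if self.items:
                self._flush_locked()  # dispatches a half-filled batch

    def _flush_locked(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        batch, self.items = self.items, []
        self.dispatch(batch)
```

Under low volume the timeout path dominates, which is why the timer's resolution (1s for TimerPool, 100ms for TimerWheel) directly bounds the observed latency.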
This PR replaces the TimerPool with a TimerWheel (introduced in #1612) with a 100ms dispatch time. The effect is that in low volume scenarios the overall latency improves by up to 90%, while normal high volume scenarios are unaffected.

Why not use a TimerWheel with a resolution below 100ms? During performance benchmarking, a lower value (for example 50ms) only helped when the number of documents was less than the internal batch size (100); in all other scenarios performance ended up worse than with 100ms, because the batches were dispatched too fast and were not given time to fill.
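A hashed timer wheel along these lines can be sketched as follows. This is a conceptual illustration, not the SDK's C# implementation from #1612; the 64-bucket `WHEEL_SIZE` is an assumed value for the sketch, while the 100ms resolution comes from this PR. Timers land in a bucket indexed by their expiry tick, and a background tick every 100ms fires one bucket, so both scheduling and firing are O(1) per timer:

```python
import math

RESOLUTION_MS = 100  # dispatch resolution chosen in this PR
WHEEL_SIZE = 64      # number of buckets; illustrative value

class TimerWheel:
    """Conceptual hashed timer wheel: O(1) schedule, O(1) fire per timer."""

    def __init__(self):
        self.buckets = [[] for _ in range(WHEEL_SIZE)]
        self.current_tick = 0

    def schedule(self, timeout_ms, callback):
        # round the timeout up to the wheel resolution
        ticks = max(1, math.ceil(timeout_ms / RESOLUTION_MS))
        if ticks >= WHEEL_SIZE:
            raise ValueError("timeout exceeds the wheel's span")
        slot = (self.current_tick + ticks) % WHEEL_SIZE
        self.buckets[slot].append(callback)

    def tick(self):
        # driven every RESOLUTION_MS by a single background timer
        self.current_tick = (self.current_tick + 1) % WHEEL_SIZE
        expired, self.buckets[self.current_tick] = self.buckets[self.current_tick], []
        for cb in expired:
            cb()
```

Note that all timers share one background tick, instead of each batch owning its own timer, which is what keeps per-timer cost flat as the number of pending batches grows.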
Performance comparison
All comparisons were made using an Indexing Policy that does not index any field, to increase throughput.
All time measurements were calculated by running the scenario 10 times and obtaining the average.
Container with 100K RU/s provisioned, Item size 1KB
Baseline (with PooledTimer 1s)
Change (with TimerWheel 100ms)
Container with 10K RU/s provisioned, Item size 1KB
Baseline (with PooledTimer 1s)
Change (with TimerWheel 100ms)
Versus TimerPool
To verify that the TimerWheel does not introduce more allocations, this is a comparison between a 1-second TimerPool and a 1-second TimerWheel, in two scenarios: creating 10000 timers and creating 1 timer:
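The structural reason the wheel can match or beat an ordered timer list on allocations and CPU is insertion cost: an ordered list (which the description says TimerPool implements) pays O(log n) per insert to keep order, while a wheel bucket append is O(1). A rough, hedged sketch of the two insertion patterns in Python (not the actual benchmark behind the numbers above; `ordered_list_insert` and `wheel_insert` are hypothetical names):

```python
import heapq

def ordered_list_insert(n):
    # ordered timer list: each insert maintains order, O(log n) per timer
    heap = []
    for i in range(n):
        heapq.heappush(heap, (i % 100, i))  # (due time, timer id)
    return heap

def wheel_insert(n):
    # timer wheel: each insert is an O(1) append into a bucket
    buckets = [[] for _ in range(64)]
    for i in range(n):
        buckets[i % 64].append(i)
    return buckets
```

With 10000 timers the asymptotic gap matters; with 1 timer both are effectively free, which is consistent with comparing both scenarios separately.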
Type of change