
Bulk: Add TimerWheel to Bulk to improve latency #1640

Merged (3 commits) Jun 23, 2020
Conversation

@ealsur (Member) commented Jun 18, 2020

Description

Current Bulk implementation is optimized for high throughput utilization, which is the ideal scenario for Bulk.

There are, however, scenarios where the volume of data is dynamic and not always high relative to the provisioned RU/s. When the data volume is low compared with the provisioned throughput, Bulk relies on an internal timer to dispatch the batches that don't get filled.

This internal timer used a construct called TimerPool, an implementation of an ordered timer list. TimerPool has a minimum dispatch time of 1 second, so the internal timers used by Bulk could dispatch half-filled batches only at 1-second intervals.

The effect was that low-data-volume workloads saw 1-2 seconds of latency (see the performance baselines below).
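The fill-or-flush behavior described above can be sketched as follows. This is a minimal, language-agnostic illustration with hypothetical names, not the SDK's actual API: a batch is dispatched as soon as it fills up, and the periodic background timer flushes whatever has accumulated, so the timer's resolution bounds the worst-case latency of a partially filled batch.

```python
class BulkBatcher:
    """Illustrative sketch of the fill-or-flush pattern (not the SDK's API)."""

    def __init__(self, max_batch_size, flush_interval_ms, dispatch):
        self.max_batch_size = max_batch_size
        # The timer resolution bounds how long a half-filled batch can wait:
        # 1000ms with the old TimerPool, 100ms with the TimerWheel.
        self.flush_interval_ms = flush_interval_ms
        self.dispatch = dispatch
        self.items = []

    def add(self, item):
        # A batch is dispatched immediately once it fills up...
        self.items.append(item)
        if len(self.items) >= self.max_batch_size:
            self.flush()

    def on_timer_fired(self):
        # ...otherwise the periodic timer flushes whatever accumulated,
        # so pending items wait at most ~flush_interval_ms.
        if self.items:
            self.flush()

    def flush(self):
        batch, self.items = self.items, []
        self.dispatch(batch)
```

With a 1-second timer floor, any load that doesn't fill a 100-item batch pays roughly a second before dispatch; lowering the floor to 100ms is what produces the latency improvement shown in the tables below.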

This PR replaces the TimerPool with a TimerWheel (introduced in #1612) using a 100ms dispatch time. The effect is that in low-volume scenarios the overall latency improves by up to 90%, while normal high-volume scenarios are unaffected.

Why not use a TimerWheel resolution below 100ms? During performance benchmarking, a lower value (for example, 50ms) helped only when the number of documents was less than the internal batch size (100). In all other scenarios, performance ended up worse than with 100ms, because batches were dispatched too quickly and never had a chance to fill.
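For context, a hashed timer wheel can be sketched as below. This is a conceptual sketch, not the SDK's TimerWheel from #1612: timers are hashed into buckets by expiry, a single background tick advances the wheel once per resolution interval (100ms here, driven manually via tick() in this sketch), and every timer in the reached bucket fires. Scheduling and expiry are O(1), versus maintaining an ordered timer list.

```python
class TimerWheel:
    """Conceptual hashed timer wheel sketch (not the SDK implementation).

    A real wheel advances via one background timer that calls tick()
    every resolution_ms; here tick() is invoked manually for clarity.
    """

    def __init__(self, resolution_ms=100, buckets=16):
        self.resolution_ms = resolution_ms
        self.buckets = [[] for _ in range(buckets)]
        self.current = 0

    def schedule(self, timeout_ms, callback):
        # Round the timeout up to the wheel resolution; a 1ms timeout
        # still waits one full tick, which is why the resolution is
        # the minimum dispatch time.
        ticks = max(1, -(-timeout_ms // self.resolution_ms))
        slot = (self.current + ticks) % len(self.buckets)
        self.buckets[slot].append(callback)

    def tick(self):
        # Advance one bucket and fire everything that landed in it.
        self.current = (self.current + 1) % len(self.buckets)
        due, self.buckets[self.current] = self.buckets[self.current], []
        for callback in due:
            callback()
```

The single shared tick is also why the wheel adds no per-timer allocation pressure compared with an ordered timer list, which the allocation comparison below measures.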

Performance comparison

All comparisons were made using an Indexing Policy that does not index any field, in order to maximize throughput:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [],
    "excludedPaths": [
        {
            "path": "/*"
        },
        {
            "path": "/\"_etag\"/?"
        }
    ]
}

All time measurements were calculated by running the scenario 10 times and obtaining the average.

Container with 100K RU/s provisioned, Item size 1KB

Baseline (with PooledTimer 1s)

Number of documents Time to process (AVG)
10 1268ms
100 1265ms
500 1269ms
1000 1334ms
2000 1244ms
3000 1480ms
5000 1249ms
6000 1248ms
10000 1334ms
50000 4466ms
100000 8355ms
200000 15028ms

Change (with TimerWheel 100ms)

Number of documents Time to process (AVG)
10 137ms
100 148ms
500 149ms
1000 180ms
2000 245ms
3000 325ms
5000 453ms
6000 523ms
10000 834ms
50000 3837ms
100000 7367ms
200000 15102ms

Container with 10K RU/s provisioned, Item size 1KB

Baseline (with PooledTimer 1s)

Number of documents Time to process (AVG)
10 1266ms
100 1260ms
500 1234ms
1000 1236ms
2000 1231ms
3000 1230ms
5000 2515ms
6000 2803ms
10000 4472ms

Change (with TimerWheel 100ms)

Number of documents Time to process (AVG)
10 131ms
100 131ms
500 150ms
1000 506ms
2000 979ms
3000 1229ms
5000 2184ms
6000 2659ms
10000 4469ms

Versus TimerPool

To verify that the TimerWheel does not bring more allocations into the picture, here is a comparison between a 1-second TimerPool and a 1-second TimerWheel, both when creating 10000 timers and when creating a single timer:

(image: allocation comparison between TimerPool and TimerWheel)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

@ealsur ealsur self-assigned this Jun 18, 2020
@ealsur ealsur changed the title Performance: Add TimerWheel to Bulk to improve latency Bulk: Add TimerWheel to Bulk to improve latency Jun 18, 2020
private readonly int congestionDecreaseFactor = 5;
private readonly int maxDegreeOfConcurrency;
private readonly TimerWheel timerWheel;
Contributor commented:

Can E2E performance tests be added to make it easy to detect regressions in this?

Member Author (@ealsur) replied:

The issue here is where to put them. The Tools/Benchmark project is not the right place, and we don't have any E2E test projects.

Contributor replied:

We don't need to block on this. This can be done when the rest of the perf gates are done.

Contributor replied:

I think we need a new perf project and test project that are explicit contract validation tests.

@ealsur ealsur added the Bulk label Jun 22, 2020
@ealsur ealsur merged commit 2a26cfe into master Jun 23, 2020
@ealsur ealsur deleted the users/ealsur/bulkwheel branch June 23, 2020 19:56
kirankumarkolli pushed a commit that referenced this pull request Jul 11, 2020
@ghost commented Dec 15, 2021

Closing due to inactivity, please feel free to re-open.
